Deploying a highly available K8S cluster with kubeadm

Table of Contents

CNI network components

1. Function of flannel

2. Three modes of flannel

3. Working principle of flannel’s UDP mode

4. Working principle of flannel’s VXLAN mode

5. Main components of Calico

6. Working principle of calico’s IPIP mode

7. Working principle of calico’s BGP mode

8. The difference between flannel and calico

Deploying a highly available K8S cluster with kubeadm

1. Environment preparation

2. Install docker on all nodes

3. Install kubeadm, kubelet and kubectl on all nodes

4. All master hosts deploy haproxy and keepalived to achieve high availability.

5. Deploy K8S cluster

6. Deploy network plug-in flannel


Continued from the previous article

CNI Network Component

Network deployment plug-ins include flannel, calico, cilium, etc.

This article mainly introduces flannel and calico.

flannel solution

On each node, packets sent to a container must be encapsulated and then sent through a tunnel to the node running the target Pod. The target node removes the encapsulation and delivers the de-encapsulated packet to the target Pod. This encapsulation noticeably affects data communication performance.

calico solution

Calico does not use tunnels or NAT to implement forwarding. Instead, it treats the host as a router on the Internet, uses BGP to synchronize routes, and uses iptables to implement security access policies, thereby completing cross-host forwarding.

1. Function of flannel

It gives the containers created on different node hosts in the cluster virtual IP addresses that are unique across the entire cluster.
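As a quick illustration, the per-node Pod subnets that make these addresses cluster-unique can be listed with kubectl once the cluster is running (a minimal check, assuming kubectl is already configured):

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
#Each node should show its own subnet carved out of the cluster Pod network, e.g. 10.244.0.0/24, 10.244.1.0/24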

2. Three modes of flannel

  • UDP is the earliest mode, but its performance is poor; packet encapsulation/decapsulation is done in user space by the flanneld process.
  • VXLAN is the default and recommended mode. Its performance is better than UDP mode because data-frame encapsulation/decapsulation is done in the kernel; it is simple to configure and easy to use. (The backend type is selected in flannel's ConfigMap, shown after this list.)
  • HOST-GW has the best performance, but the configuration is more complex and it cannot span network segments.
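The backend mode is selected in the net-conf.json section of flannel's ConfigMap inside kube-flannel.yml. A typical excerpt looks roughly like the following (values depend on the manifest version used); changing "Type" to "udp" or "host-gw" switches flannel to the corresponding mode:

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }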

3. Working principle of flannel’s UDP mode

1) The original packet is forwarded from the Pod container on the source host through the cni0 bridge to the flannel0 virtual interface (a TUN device).

2) The flanneld process listens on the flannel0 interface and encapsulates the original packet in a UDP message.

3) flanneld then looks up the node IP of the node hosting the target Pod in the routing information maintained in etcd, wraps the UDP message in an outer IP header addressed to that node IP, and sends it out through the physical network card.

4) The packet reaches the flanneld process on the target node via UDP port 8285, is de-encapsulated, and is then forwarded from the flannel0 virtual interface through the cni0 bridge to the target Pod container.
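To confirm that a node really runs in UDP mode, the user-space TUN device and the flanneld listener can be checked as follows (a verification sketch; device names follow the defaults described above):

ip -d link show flannel0    #flannel0 is a TUN device handled by the flanneld process in user space
ss -lunp | grep 8285        #in UDP mode, flanneld receives encapsulated packets on UDP port 8285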

4. Working principle of flannel’s VXLAN mode

1) The original data frame is forwarded from the Pod container on the source host through the cni0 bridge to the flannel.1 virtual interface (a VTEP device).

2) After receiving the data frame, the flannel.1 interface adds a VXLAN header, and the kernel encapsulates the frame into a UDP message.

3) Using the routing and forwarding information that flanneld maintains from etcd, the kernel sends the encapsulated packet through the physical network card to the node where the target Pod is located.

4) The packet reaches the target node on UDP port 8472, is decapsulated by the flannel.1 virtual interface, and is then forwarded through the cni0 bridge to the target Pod container.
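On a node running in VXLAN mode, the kernel VTEP and its forwarding entries can be inspected like this (a verification sketch; the addresses shown will differ per cluster):

ip -d link show flannel.1        #shows the vxlan attributes, including the VNI and dstport 8472
bridge fdb show dev flannel.1    #MAC-to-node mappings used when encapsulating data frames
ip route | grep flannel.1        #other nodes' Pod subnets are routed through the flannel.1 interface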

5. Main components of Calico

  • Felix: runs on each host and is responsible for maintaining routing rules, the FIB forwarding information base, network interfaces, and so on.
  • BIRD: a BGP client responsible for distributing routing information to the other nodes.
  • confd: the configuration management component, which generates the configuration for BIRD.

Calico CNI plug-in: mainly responsible for interfacing with Kubernetes; it is invoked by the kubelet.
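If calico were deployed instead of flannel, these components could be checked roughly as follows (hypothetical commands for a standard calico installation; calicoctl has to be installed separately):

kubectl get pods -n kube-system -l k8s-app=calico-node    #one calico-node Pod (Felix, BIRD and confd) runs per node
calicoctl node status                                      #lists the BGP peerings that BIRD has established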

6. Working principle of calico’s IPIP mode

1) The original IP packet from the Pod container on the source host is sent to the tunl0 interface, where the kernel's IPIP driver encapsulates it in an IP packet of the node network.

2) Following the route that points to the tunl0 interface, the encapsulated packet is then sent to the target node through the physical network card.

3) After the IP packet reaches the target node, the kernel's IPIP driver unpacks it to recover the original IP packet.

4) Finally, according to the local routing rules, the packet is delivered to the target Pod container through the veth pair device.
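On a node using IPIP mode, the tunnel device and the routes that point at it can be checked as a sanity test (a sketch; the subnets are examples):

ip -d link show tunl0    #tunl0 is an ipip tunnel device handled by the kernel IPIP driver
ip route | grep tunl0    #routes to other nodes' Pod subnets go via the peer node IP over dev tunl0 (proto bird)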

7. Working principle of calico’s BGP mode

In essence, communication between Pods is maintained purely through routing rules: Felix maintains the routing rules and manages the network interfaces, and BIRD distributes the routing information to the other nodes.

1) The original IP packet sent by the Pod container on the source host is delivered to the node's network namespace through the veth pair device.

2) Based on the destination IP of the original packet and the node's routing rules, the IP of the target node is determined, and the packet is sent to the target node through the physical network card.

3) After the IP packet reaches the target node, it is delivered to the target Pod container through the veth pair device according to the local routing rules.
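Whether calico uses IPIP or pure BGP is controlled on the IPPool resource. Below is a minimal sketch of an IPPool for BGP mode (no encapsulation), assuming calico's default 192.168.0.0/16 pool; it would be applied with calicoctl apply -f:

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: Never        #Never = pure BGP, Always = IPIP, CrossSubnet = mixed mode
  vxlanMode: Never
  natOutgoing: true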

8. The difference between flannel and calico

1) flannel's network modes include UDP, VXLAN, and HOST-GW.

2) calico's network modes include IPIP, BGP, and a mixed (CrossSubnet) mode.

3) The default Pod network segment of flannel is 10.244.0.0/16, while calico's default is 192.168.0.0/16.

4) flannel is suitable for small clusters with simple architectures, while calico is suitable for large clusters with more complex architectures.

Deploying a highly available K8S cluster with kubeadm

Deployment environment
master01   192.168.3.100   docker, kubeadm, kubelet, kubectl, haproxy, keepalived
master02   192.168.3.101   docker, kubeadm, kubelet, kubectl, haproxy, keepalived
master03   192.168.3.102   docker, kubeadm, kubelet, kubectl, haproxy, keepalived
node01     192.168.3.103   docker, kubeadm, kubelet, kubectl
node02     192.168.3.104   docker, kubeadm, kubelet, kubectl

1. Environment preparation

//On all nodes: turn off the firewall, SELinux, and swap
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/enforcing/disabled/' /etc/selinux/config
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab

//Modify the host names separately
hostnamectl set-hostname master01
hostnamectl set-hostname master02
hostnamectl set-hostname master03
hostnamectl set-hostname node01
hostnamectl set-hostname node02

//Modify hosts files on all nodes
vim /etc/hosts
192.168.3.100 master01
192.168.3.101 master02
192.168.3.102 master03
192.168.3.103 node01
192.168.3.104 node02

//All node time synchronization
yum -y install ntpdate
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' >/etc/timezone
ntpdate time2.aliyun.com
//Set up periodic tasks for time synchronization
systemctl enable --now crond
crontab -e
*/30 * * * * /usr/sbin/ntpdate time2.aliyun.com

//All nodes implement Linux resource limits
vim /etc/security/limits.conf
* soft nofile 65536
* hard nofile 131072
* soft nproc 65535
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited

//Adjust kernel parameters
cat > /etc/sysctl.d/k8s.conf <<EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720

net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF

#Apply the kernel parameters
sysctl --system 

//Load ip_vs module
for i in $(ls /usr/lib/modules/$(uname -r)/kernel/net/netfilter/ipvs | grep -o "^[^.]*"); do echo $i; /sbin/modinfo -F filename $i >/dev/null 2>&1 && /sbin/modprobe $i; done
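//Optional check (not in the original steps): confirm the ip_vs modules were actually loaded
lsmod | grep -e ip_vs -e nf_conntrack    #ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh and nf_conntrack should be listed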

2. Install docker on all nodes

//Install dependencies
yum install -y yum-utils device-mapper-persistent-data lvm2
//Add docker's yum repository
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
//Install docker
yum install -y docker-ce docker-ce-cli containerd.io

//Configure a registry mirror to accelerate image pulling, the systemd cgroup driver, and log rotation for docker
cat > /etc/docker/daemon.json <<EOF
{
  "registry-mirrors": ["https://6na95ym4.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "500m", "max-file": "3"
  }
}
EOF
//Reload systemd management files
systemctl daemon-reload
//Set up the self-starting docker service and start docker immediately
systemctl enable --now docker.service 
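//Optional check: verify that docker picked up the systemd cgroup driver, which needs to match the kubelet setting configured below
docker info | grep -i 'cgroup driver'    #expected output: Cgroup Driver: systemd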

3. Install kubeadm, kubelet and kubectl on all nodes

//Define kubernetes source
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
//Install the specified versions of kubelet, kubeadm and kubectl
yum install -y kubelet-1.20.15 kubeadm-1.20.15 kubectl-1.20.15

//Configure kubelet to use the systemd cgroup driver and Alibaba Cloud's pause image
cat > /etc/sysconfig/kubelet <<EOF
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.2"
EOF

//Start kubelet automatically after booting
systemctl enable --now kubelet
//At this point kubelet will fail to start repeatedly; this is expected until the cluster is initialized
systemctl status kubelet

4. All master hosts deploy haproxy and keepalived to achieve high availability

//Install haproxy and keepalived on all master nodes, then configure haproxy
yum -y install haproxy keepalived

cat > /etc/haproxy/haproxy.cfg << EOF
global
    log 127.0.0.1 local0 info
    log 127.0.0.1 local1 warning
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode tcp
    log global
    option tcplog
    option dontlognull
    option redispatch
    retries 3
    timeout queue 1m
    timeout connect 10s
    timeout client 1m
    timeout server 1m
    timeout check 10s
    maxconn 3000

frontend monitor-in
    bind *:33305
    mode http
    option httplog
    monitor-uri /monitor

frontend k8s-master
    bind *:6444
    mode tcp
    option tcplog
    default_backend k8s-master

backend k8s-master
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    server k8s-master1 192.168.3.100:6443 check inter 10000 fall 2 rise 2 weight 1
    server k8s-master2 192.168.3.101:6443 check inter 10000 fall 2 rise 2 weight 1
    server k8s-master3 192.168.3.102:6443 check inter 10000 fall 2 rise 2 weight 1
EOF
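//Optional: validate the configuration with haproxy's built-in check before starting the service
haproxy -c -f /etc/haproxy/haproxy.cfg    #-c only checks the configuration file and reports errors without starting haproxy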

//Configure keepalived on all master nodes (keepalived was already installed above)

cd /etc/keepalived/

vim keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_HA1 #Routing identifier, each node configuration is different
}

vrrp_script chk_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    state MASTER #Set MASTER on master01, set BACKUP on master02 and master03
    interface ens33
    virtual_router_id 51
    priority 100 #The initial weight of the local machine, the backup machine setting is smaller than the value of the main machine
    advert_int 1
    virtual_ipaddress {
        192.168.3.254 #Set VIP address
    }
    track_script {
        chk_haproxy
    }
}

//Write the haproxy health check script: if haproxy is no longer running, stop keepalived so the VIP fails over to another master
vim check_haproxy.sh
#!/bin/bash
if ! killall -0 haproxy; then
    systemctl stop keepalived
fi

//And give the script execution permissions
chmod +x check_haproxy.sh

//Start haproxy and keepalived at boot
systemctl enable --now haproxy
systemctl enable --now keepalived
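//Optional sanity check, assuming the interface and VIP configured above: the VIP should be bound on the current MASTER node (master01 at first), and the haproxy monitor page should respond once the VIP is up
ip addr show ens33 | grep 192.168.3.254
curl -s -o /dev/null -w '%{http_code}\n' http://192.168.3.254:33305/monitor    #expect 200 from the monitor-uri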

5. Deploy K8S cluster

//Set the cluster initialization configuration file on the master01 node
kubeadm config print init-defaults > /opt/kubeadm-config.yaml

cd /opt/
vim kubeadm-config.yaml
...
localAPIEndpoint:
  advertiseAddress: 192.168.3.100    #Specify the IP address of the current master node
  bindPort: 6443
...
apiServer:
  certSANs:    #Add a certSANs list under apiServer containing the IPs of all master nodes and the cluster VIP
  - 192.168.3.254
  - 192.168.3.100
  - 192.168.3.101
  - 192.168.3.102
...
clusterName: kubernetes
controlPlaneEndpoint: "192.168.3.254:6444"    #Specify the cluster VIP address and the haproxy port
controllerManager: {}
...
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers    #Specify the image download address
kind: ClusterConfiguration
kubernetesVersion: v1.20.15    #Specify the kubernetes version number
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"    #Specify the pod network segment; 10.244.0.0/16 matches the flannel default
  serviceSubnet: 10.96.0.0/16    #Specify the service network segment
scheduler: {}
#Add the following content at the end
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs    #Change the default kube-proxy scheduling mode to ipvs

#Update cluster initialization configuration file
kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml

#Copy the yaml configuration file to other hosts and pull the image through the configuration file
for i in master02 master03 node01 node02; do scp /opt/new.yaml $i:/opt/; done

//Pull the required images on all nodes in advance
kubeadm config images pull --config /opt/new.yaml

//Initialize the cluster on the master01 node
kubeadm init --config new.yaml --upload-certs | tee kubeadm-init.log

//Configure the environment on the master01 node
#Configure kubectl
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

//Then restart the kubelet service
systemctl restart kubelet

//All remaining nodes join the cluster
#The other master nodes join the cluster using the control-plane join command printed by kubeadm init
kubeadm join 192.168.3.254:6444 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:e1434974e3b947739e650c13b94f9a2e864f6c444b9a6e891efb4d8e1c4a05b7 \
    --control-plane --certificate-key fff2215a35a1b54f9b39882a36644b19300b7053429c43a1a713e4ed791076c4

//After joining, each master node is also prompted to run the following:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

//The worker nodes join the cluster
kubeadm join 192.168.3.254:6444 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:e1434974e3b947739e650c13b94f9a2e864f6c444b9a6e891efb4d8e1c4a05b7

//View cluster information on master01
kubectl get nodes
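//At this stage the nodes normally report NotReady and the coredns Pods stay Pending, because no CNI network plug-in has been installed yet; this is resolved in the next step
kubectl get pods -n kube-system    #coredns Pods remain Pending until the network plug-in is deployed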

6. Deploy network plug-in flannel

Upload the flannel image flannel.tar and the CNI plug-in package cni-plugins-linux-amd64-v0.8.6.tgz to the /opt directory on all nodes, and upload the kube-flannel.yml file to the master node.
cd /opt
docker load < flannel.tar

mv /opt/cni /opt/cni_bak
mkdir -p /opt/cni/bin
tar zxvf cni-plugins-linux-amd64-v0.8.6.tgz -C /opt/cni/bin

kubectl apply -f kube-flannel.yml 
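//Optional: watch the flannel DaemonSet Pods until they are Running (assuming the kube-flannel.yml used here deploys into kube-system with the app=flannel label, as older manifests do)
kubectl get pods -n kube-system -l app=flannel -o wide    #one kube-flannel-ds Pod per node should reach Running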

View cluster information again

//View cluster information on master01
kubectl get nodes

Final verification

kubectl get pod -n kube-system