In this post we introduce Kube-OVN subnets. If you haven't set up a Kubernetes + Kube-OVN environment yet, see my previous article: Use Kubeadm to build a Kubernetes cluster.
Initial VPC and Subnet
Kube-OVN works out of the box. After downloading the one-click installation script `install.sh` from the official website, no modifications are needed: installing with the default configuration already covers the networking needs of common scenarios.
After the installation completes, we can see that Kube-OVN created the default VPC `ovn-cluster` for us, along with two initial subnets, `join` and `ovn-default`. Here we focus on the `ovn-default` subnet.
```shell
$ kubectl get vpc
NAME          ENABLEEXTERNAL   ENABLEBFD   STANDBY   SUBNETS                  NAMESPACES
ovn-cluster   false            false       true      ["join","ovn-default"]

$ kubectl get subnet
NAME          PROVIDER   VPC           PROTOCOL   CIDR            PRIVATE   NAT     DEFAULT   GATEWAYTYPE   V4USED   V4AVAILABLE   V6USED   V6AVAILABLE   EXCLUDEIPS                 U2OINTERCONNECTIONIP
join          ovn        ovn-cluster   IPv4       100.64.0.0/16   false     false   false     distributed   2        65531         0        0             ["100.64.0.1"]
ovn-default   ovn        ovn-cluster   IPv4       10.16.0.0/16    false     true    true      distributed   6        65527         0        0             ["10.16.0.1"]
```
The default subnet ovn-default
For the `ovn-default` subnet, we verify three scenarios:
- Communication between two pods scheduled on the same node;
- Communication between two pods scheduled on different nodes;
- Pod access to the external network.
Communication between pods on the same node
We first create two pods, `ovn-default-node1-pod1` and `ovn-default-node1-pod2`, whose nodeSelector pins both of them to `node1`.
```shell
$ cat ovn-default-node1-pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ovn-default-node1-pod1
spec:
  nodeSelector:
    kubernetes.io/hostname: node1
  containers:
  - name: nginx
    image: docker.io/library/nginx:alpine

$ cat ovn-default-node1-pod2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ovn-default-node1-pod2
spec:
  nodeSelector:
    kubernetes.io/hostname: node1
  containers:
  - name: nginx
    image: docker.io/library/nginx:alpine
```
After creating the pods, we can see that `ovn-default-node1-pod1` is assigned the IP `10.16.0.8` and `ovn-default-node1-pod2` is assigned `10.16.0.9`. Kube-OVN's IP CRD also shows the corresponding allocations.
```shell
$ kubectl apply -f ovn-default-node1-pod1.yaml -f ovn-default-node1-pod2.yaml
$ kubectl get po -A -o wide
NAMESPACE   NAME                     READY   STATUS    RESTARTS   AGE   IP          NODE    NOMINATED NODE   READINESS GATES
default     ovn-default-node1-pod1   1/1     Running   0          10s   10.16.0.8   node1   <none>           <none>
default     ovn-default-node1-pod2   1/1     Running   0          10s   10.16.0.9   node1   <none>           <none>

$ kubectl get ip | grep ovn-default
...
ovn-default-node1-pod1.default   10.16.0.8   00:00:00:AA:0D:31   node1   ovn-default
ovn-default-node1-pod2.default   10.16.0.9   00:00:00:28:19:46   node1   ovn-default
```
Now we ping `10.16.0.9`, the IP of `ovn-default-node1-pod2`, from `ovn-default-node1-pod1`, and confirm the connection works.
```shell
$ kubectl exec -it ovn-default-node1-pod1 -- ping -c 4 10.16.0.9
PING 10.16.0.9 (10.16.0.9): 56 data bytes
64 bytes from 10.16.0.9: seq=0 ttl=64 time=0.128 ms
64 bytes from 10.16.0.9: seq=1 ttl=64 time=0.106 ms
64 bytes from 10.16.0.9: seq=2 ttl=64 time=0.118 ms
64 bytes from 10.16.0.9: seq=3 ttl=64 time=0.120 ms

--- 10.16.0.9 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.106/0.118/0.128 ms
```
Let's briefly analyze why this works. First, check the network interfaces inside `ovn-default-node1-pod1`:
```shell
$ kubectl exec -it ovn-default-node1-pod1 -- ip a
...
26: eth0@if27: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1400 qdisc noqueue state UP
    link/ether 00:00:00:aa:0d:31 brd ff:ff:ff:ff:ff:ff
    inet 10.16.0.8/16 brd 10.16.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::200:ff:feaa:d31/64 scope link
       valid_lft forever preferred_lft forever
```
The interface name `eth0@if27` tells us this is one end of a veth pair. In the default network namespace of `node1` we can find the other end, `ee7cde2f0a44_h@if26`.
```shell
$ ip link show type veth
...
27: ee7cde2f0a44_h@if26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 7a:a1:3d:fd:d3:24 brd ff:ff:ff:ff:ff:ff link-netns cni-1904a3c4-4777-aeda-95af-2cc9b49d178e
```
Using the same method, we can find the host-side veth `08a37516ea14_h@if29` corresponding to `ovn-default-node1-pod2` in the default namespace.
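The `@ifN` suffix is how the two ends of a veth pair point at each other: each side's name embeds the peer's interface index. A tiny sketch of that convention (the `peer_ifindex` helper is ours for illustration, not a Kube-OVN tool):

```shell
# Illustrative helper: extract the peer ifindex embedded in names
# like "eth0@if27" -- the number after "@if" is the other end's index.
peer_ifindex() {
  printf '%s\n' "$1" | sed -n 's/.*@if\([0-9][0-9]*\)$/\1/p'
}

peer_ifindex "eth0@if27"            # pod side: host veth has index 27
peer_ifindex "ee7cde2f0a44_h@if26"  # host side: pod eth0 has index 26
```

So interface 26 inside the pod pairs with interface 27 on the host, and vice versa.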
Now check the OVS summary in Kube-OVN: both `ee7cde2f0a44_h` and `08a37516ea14_h` are plugged into the bridge `br-int`, which explains why pods on the same node can communicate directly.
```shell
$ kubectl ko vsctl node1 show
fb310a4a-6c6d-4d62-8b08-6826d3b79ca8
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        ...
        Port ee7cde2f0a44_h
            Interface ee7cde2f0a44_h
        Port "08a37516ea14_h"
            Interface "08a37516ea14_h"
        ...
    ovs_version: "3.1.3"
```
Communication between pods on different nodes
Next we create `ovn-default-node2-pod3`, which is scheduled on `node2`:
```shell
$ cat ovn-default-node2-pod3.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ovn-default-node2-pod3
spec:
  nodeSelector:
    kubernetes.io/hostname: node2
  containers:
  - name: nginx
    image: docker.io/library/nginx:alpine

$ kubectl get po -A -o wide
NAMESPACE   NAME                     READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
default     ovn-default-node1-pod1   1/1     Running   0          99m   10.16.0.8    node1   <none>           <none>
default     ovn-default-node2-pod3   1/1     Running   0          5s    10.16.0.12   node2   <none>           <none>
...
```
Verify the connectivity between `ovn-default-node1-pod1` and `ovn-default-node2-pod3`:
```shell
$ kubectl exec -it ovn-default-node1-pod1 -- ping -c 4 10.16.0.12
PING 10.16.0.12 (10.16.0.12): 56 data bytes
64 bytes from 10.16.0.12: seq=0 ttl=64 time=2.875 ms
64 bytes from 10.16.0.12: seq=1 ttl=64 time=0.763 ms
64 bytes from 10.16.0.12: seq=2 ttl=64 time=0.668 ms
64 bytes from 10.16.0.12: seq=3 ttl=64 time=0.618 ms

--- 10.16.0.12 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.618/1.231/2.875 ms
```
Because `ovn-default-node1-pod1` and `ovn-default-node2-pod3` are scheduled on different nodes, they cannot reach each other through the local bridge. The traffic must cross a virtualized tunnel, which in Kube-OVN is GENEVE.
We ping `ovn-default-node2-pod3` from `ovn-default-node1-pod1` again, capture the packets on `node2`, and observe the packet structure.
```shell
# node1
$ kubectl exec -it ovn-default-node1-pod1 -- ping -c 1 10.16.0.12
PING 10.16.0.12 (10.16.0.12): 56 data bytes
64 bytes from 10.16.0.12: seq=0 ttl=64 time=2.875 ms

# node2
$ tcpdump -i enp1s0 geneve
...
10:41:25.322179 IP 192.168.31.29.36694 > node2.6081: Geneve, Flags [C], vni 0x3, options [8 bytes]: IP 10.16.0.8 > 10.16.0.12: ICMP echo request, id 156, seq 0, length 64
10:41:25.323089 IP node2.25197 > 192.168.31.29.6081: Geneve, Flags [C], vni 0x3, options [8 bytes]: IP 10.16.0.12 > 10.16.0.8: ICMP echo reply, id 156, seq 0, length 64
...
```
In the captured packets, the outer source and destination IPs are the host nodes' IPs, while the inner packet encapsulated by GENEVE carries the pod IPs. This shows that cross-node pod communication is implemented via GENEVE tunneling (a more detailed analysis will follow in later articles).
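To make the encapsulation concrete, here is a small awk sketch that pulls the inner (pod) source and destination IPs back out of a tcpdump Geneve line like the ones above. The parsing is ours and tied to this particular output format:

```shell
capture='10:41:25.322179 IP 192.168.31.29.36694 > node2.6081: Geneve, Flags [C], vni 0x3, options [8 bytes]: IP 10.16.0.8 > 10.16.0.12: ICMP echo request, id 156, seq 0, length 64'

# The last "IP a > b" segment on the line is the encapsulated pod traffic;
# everything before it describes the outer node-to-node UDP/Geneve packet.
inner=$(printf '%s\n' "$capture" | awk '{
  n = split($0, seg, "IP ")   # split on every "IP " marker
  split(seg[n], io, " > ")    # last segment: "src > dst: ICMP ..."
  sub(/:.*/, "", io[2])       # strip the trailing ": ICMP ..." part
  print io[1], io[2]
}')
echo "$inner"   # 10.16.0.8 10.16.0.12
```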
Pod external network access
We use `ovn-default-node1-pod1` to verify external connectivity. A newly created pod in the default subnet can reach the external network without any extra configuration.
```shell
$ kubectl exec -it ovn-default-node1-pod1 -- ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=1 ttl=106 time=165.620 ms
64 bytes from 8.8.8.8: seq=4 ttl=106 time=164.629 ms
```
First, we capture packets on the host-side veth interface associated with `ovn-default-node1-pod1`; here the source IP is still `10.16.0.8`:
```shell
$ tcpdump -i ee7cde2f0a44_h icmp
11:07:58.695344 IP 10.16.0.8 > dns.google: ICMP echo request, id 174, seq 0, length 64
11:07:58.861406 IP dns.google > 10.16.0.8: ICMP echo reply, id 174, seq 0, length 64
...
```
Then capture on the host's physical NIC: the source IP has been rewritten from `10.16.0.8` to `node1`'s address, meaning the traffic has undergone NAT by this point.
```shell
$ tcpdump -i enp1s0 icmp
11:08:28.451918 IP node1 > dns.google: ICMP echo request, id 12504, seq 3, length 64
11:08:28.616974 IP dns.google > node1: ICMP echo reply, id 12504, seq 3, length 64
...
```
Checking the iptables rules on the host, we find the corresponding NAT configuration. Subnets under the default VPC `ovn-cluster` all reach the public network through NAT.
```shell
$ iptables -t nat -L
...
OVN-MASQUERADE  all  --  anywhere  anywhere  match-set ovn40subnets-nat src ! match-set ovn40subnets dst

$ ipset list ovn40subnets-nat
Name: ovn40subnets-nat
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0xaa9c4022
Size in memory: 504
References: 2
Number of entries: 1
Members:
10.16.0.0/16

$ ipset list ovn40subnets
Name: ovn40subnets
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0x7d6b4e88
Size in memory: 552
References: 11
Number of entries: 2
Members:
100.64.0.0/16
10.16.0.0/16
```
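The rule reads: masquerade traffic whose source is in a NAT-enabled subnet (`ovn40subnets-nat`) but whose destination is outside all cluster subnets (`ovn40subnets`). A self-contained sketch of that decision in plain shell (the `in_cidr` and `needs_nat` helpers are illustrative, not part of Kube-OVN):

```shell
# Illustrative reimplementation of the masquerade match:
# NAT if src is in a NAT-enabled subnet AND dst is outside all cluster subnets.
in_cidr() {  # usage: in_cidr IP CIDR
  _ip_n=$(echo "$1" | awk -F. '{print $1*16777216 + $2*65536 + $3*256 + $4}')
  _net=${2%/*}; _bits=${2#*/}
  _net_n=$(echo "$_net" | awk -F. '{print $1*16777216 + $2*65536 + $3*256 + $4}')
  _mask=$(( (0xFFFFFFFF << (32 - _bits)) & 0xFFFFFFFF ))
  [ $(( _ip_n & _mask )) -eq $(( _net_n & _mask )) ]
}

needs_nat() {  # usage: needs_nat SRC_IP DST_IP
  in_cidr "$1" 10.16.0.0/16 && \
    ! { in_cidr "$2" 10.16.0.0/16 || in_cidr "$2" 100.64.0.0/16; }
}

needs_nat 10.16.0.8 8.8.8.8   && echo "pod -> internet: masqueraded"
needs_nat 10.16.0.8 10.16.0.9 || echo "pod -> pod: routed directly, no NAT"
```

This is why pod-to-pod traffic keeps its original pod IPs while pod-to-internet traffic leaves with the node's IP.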
Custom subnet
So far we have been using the default subnet `ovn-default`; now let's try creating a new subnet.
```shell
$ kubectl create namespace ns1
$ cat subnet.yaml
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: subnet1
spec:
  protocol: IPv4
  cidrBlock: 10.66.0.0/16
  excludeIps:
  - 10.66.0.1..10.66.0.10
  gateway: 10.66.0.1
  gatewayType: distributed
  natOutgoing: true
  routeTable: ""
  namespaces:
  - ns1  # bind the subnet to this namespace

$ kubectl apply -f subnet.yaml
$ kubectl get subnet
NAME          PROVIDER   VPC           PROTOCOL   CIDR            PRIVATE   NAT     DEFAULT   GATEWAYTYPE   V4USED   V4AVAILABLE   V6USED   V6AVAILABLE   EXCLUDEIPS                  U2OINTERCONNECTIONIP
join          ovn        ovn-cluster   IPv4       100.64.0.0/16   false     false   false     distributed   2        65531         0        0             ["100.64.0.1"]
ovn-default   ovn        ovn-cluster   IPv4       10.16.0.0/16    false     true    true      distributed   7        65526         0        0             ["10.16.0.1"]
subnet1       ovn        ovn-cluster   IPv4       10.66.0.0/16    false     true    false     distributed   0        65524         0        0             ["10.66.0.1..10.66.0.10"]
```
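The V4AVAILABLE numbers above are easy to sanity-check. Assuming Kube-OVN reserves the network and broadcast addresses of the CIDR and then subtracts excluded and used IPs, the arithmetic matches all three subnets:

```shell
# available = 2^(32 - prefix) - 2 (network/broadcast) - excluded - used
avail() { echo $(( (1 << (32 - $1)) - 2 - $2 - $3 )); }

avail 16 10 0   # subnet1:     /16, 10 excluded, 0 used -> 65524
avail 16  1 7   # ovn-default: /16,  1 excluded, 7 used -> 65526
avail 16  1 2   # join:        /16,  1 excluded, 2 used -> 65531
```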
The newly created `subnet1` subnet is associated with the `ns1` namespace, so we create a pod in `ns1`:
```shell
$ cat subnet1-node1-pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: subnet1-node1-pod1
  namespace: ns1
spec:
  nodeSelector:
    kubernetes.io/hostname: node1
  containers:
  - name: nginx
    image: docker.io/library/nginx:alpine

$ kubectl apply -f subnet1-node1-pod1.yaml
```
We can see that the pod created in the `ns1` namespace gets an IP from the `subnet1` network segment.
```shell
$ kubectl get po -A -o wide
NAMESPACE   NAME                 READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
...
ns1         subnet1-node1-pod1   1/1     Running   0          5s    10.66.0.11   node1   <none>           <none>
```
Correspondingly, since `subnet1` has `natOutgoing` set to true, `subnet1-node1-pod1` can also reach the public network.
```shell
$ kubectl exec -it subnet1-node1-pod1 -n ns1 -- ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=106 time=167.119 ms
64 bytes from 8.8.8.8: seq=2 ttl=106 time=164.704 ms
```
The ipsets referenced by the iptables NAT rules change accordingly, as shown below. This is why the new `subnet1` subnet can also reach the public network through NAT.
```shell
$ ipset list ovn40subnets-nat
Name: ovn40subnets-nat
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0x4a8ca9d3
Size in memory: 552
References: 2
Number of entries: 2
Members:
10.66.0.0/16
10.16.0.0/16

$ ipset list ovn40subnets
Name: ovn40subnets
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0x58e8fa3a
Size in memory: 600
References: 11
Number of entries: 3
Members:
10.66.0.0/16
10.16.0.0/16
100.64.0.0/16
```
That covers the common usage of Kube-OVN subnets; the next article will introduce the use of VPCs.
If you find this helpful, you can follow my WeChat official account: Li Ruonian.