Kube-OVN Series – Use of Subnet

In this post we introduce how to use Kube-OVN subnets. If you haven’t set up a Kubernetes + Kube-OVN environment yet, you can refer to my previous article: Use Kubeadm to build a Kubernetes cluster.

Initial VPC and Subnet

Kube-OVN works out of the box. After downloading the one-click installation script install.sh from the official website, no modification is needed: installing with the default configuration already covers the network needs of most common scenarios.
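For reference, a typical installation looks like the following; the release branch in the URL is only an example, so use the version recommended by the official documentation:

$ wget https://raw.githubusercontent.com/kubeovn/kube-ovn/release-1.12/dist/images/install.sh
$ bash install.sh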
After the installation completes, we can see that Kube-OVN has created the default VPC ovn-cluster for us, along with two initial subnets, join and ovn-default. Here we focus on the ovn-default subnet.

$ kubectl get vpc
NAME ENABLEEXTERNAL ENABLEBFD STANDBY SUBNETS NAMESPACES
ovn-cluster false false true ["join","ovn-default"]

$ kubectl get subnet
NAME PROVIDER VPC PROTOCOL CIDR PRIVATE NAT DEFAULT GATEWAYTYPE V4USED V4AVAILABLE V6USED V6AVAILABLE EXCLUDEIPS U2OINTERCONNECTIONIP
join ovn ovn-cluster IPv4 100.64.0.0/16 false false false distributed 2 65531 0 0 ["100.64.0.1"]
ovn-default ovn ovn-cluster IPv4 10.16.0.0/16 false true true distributed 6 65527 0 0 ["10.16.0.1"]

The default subnet ovn-default

For the ovn-default subnet, we mainly verify three scenarios:

  • Check whether the communication between two pods scheduled on the same node is normal;
  • Check whether the communication between two pods scheduled on different nodes is normal;
  • Check whether pods can access the external network normally.

Communication between pods on the same node

We first create two pods, ovn-default-node1-pod1 and ovn-default-node1-pod2, whose nodeSelector pins them to node1.

$ cat ovn-default-node1-pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ovn-default-node1-pod1
spec:
  nodeSelector:
    kubernetes.io/hostname: node1
  containers:
    - name: nginx
      image: docker.io/library/nginx:alpine

$ cat ovn-default-node1-pod2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ovn-default-node1-pod2
spec:
  nodeSelector:
    kubernetes.io/hostname: node1
  containers:
    - name: nginx
      image: docker.io/library/nginx:alpine

After creating the pods, we can see that ovn-default-node1-pod1 is assigned the IP 10.16.0.8 and ovn-default-node1-pod2 is assigned 10.16.0.9. By looking at Kube-OVN’s IP CRD, we can also see the corresponding IP allocations.

$ kubectl apply -f ovn-default-node1-pod1.yaml -f ovn-default-node1-pod2.yaml
$ kubectl get po -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default ovn-default-node1-pod1 1/1 Running 0 10s 10.16.0.8 node1 <none> <none>
default ovn-default-node1-pod2 1/1 Running 0 10s 10.16.0.9 node1 <none> <none>

$ kubectl get ip | grep ovn-default
...
ovn-default-node1-pod1.default 10.16.0.8 00:00:00:AA:0D:31 node1 ovn-default
ovn-default-node1-pod2.default 10.16.0.9 00:00:00:28:19:46 node1 ovn-default
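
As an aside, if a workload needs a specific address from the subnet, Kube-OVN supports requesting one through the ovn.kubernetes.io/ip_address annotation. The manifest below is only a sketch (the pod name and address are made up; the address must be a free IP inside the subnet’s CIDR):

$ cat fixed-ip-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: fixed-ip-pod                          # hypothetical example
  annotations:
    ovn.kubernetes.io/ip_address: 10.16.0.100 # must be unused and inside 10.16.0.0/16
spec:
  containers:
    - name: nginx
      image: docker.io/library/nginx:alpine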

Now we use ovn-default-node1-pod1 to ping 10.16.0.9, the IP of ovn-default-node1-pod2, and confirm that the connectivity is fine.

$ kubectl exec -it ovn-default-node1-pod1 -- ping -c 4 10.16.0.9
PING 10.16.0.9 (10.16.0.9): 56 data bytes
64 bytes from 10.16.0.9: seq=0 ttl=64 time=0.128 ms
64 bytes from 10.16.0.9: seq=1 ttl=64 time=0.106 ms
64 bytes from 10.16.0.9: seq=2 ttl=64 time=0.118 ms
64 bytes from 10.16.0.9: seq=3 ttl=64 time=0.120 ms

--- 10.16.0.9 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.106/0.118/0.128 ms

Let’s briefly analyze why this works. First, check the network interface information inside ovn-default-node1-pod1:

$ kubectl exec -it ovn-default-node1-pod1 -- ip a
...
26: eth0@if27: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1400 qdisc noqueue state UP
    link/ether 00:00:00:aa:0d:31 brd ff:ff:ff:ff:ff:ff
    inet 10.16.0.8/16 brd 10.16.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::200:ff:feaa:d31/64 scope link
       valid_lft forever preferred_lft forever

From the interface name eth0@if27, we can tell that this is actually one end of a veth pair. In the default network namespace of node1 we can find the other end, ee7cde2f0a44_h@if26.

$ ip link show type veth
...
27: ee7cde2f0a44_h@if26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 7a:a1:3d:fd:d3:24 brd ff:ff:ff:ff:ff:ff link-netns cni-1904a3c4-4777-aeda-95af-2cc9b49d178e
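
The @ifNN suffixes make this pairing easy to check mechanically: eth0 inside the pod has ifindex 26 and points at peer ifindex 27, and the host-side veth with ifindex 27 points back at ifindex 26. So, to locate the host-side peer of the pod’s eth0, we can simply grep for its ifindex on node1 (a small convenience, not a Kube-OVN feature):

$ ip -o link | grep '@if26:'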

Using the same method, we can also find 08a37516ea14_h@if29, the host-side veth of ovn-default-node1-pod2.
Next, check the OVS summary through Kube-OVN’s kubectl plugin. We can see that ee7cde2f0a44_h and 08a37516ea14_h are both attached to the bridge br-int, which explains why pods on the same node can communicate directly.

$ kubectl ko vsctl node1 show
fb310a4a-6c6d-4d62-8b08-6826d3b79ca8
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        ...
        Port ee7cde2f0a44_h
            Interface ee7cde2f0a44_h
        Port "08a37516ea14_h"
            Interface "08a37516ea14_h"
        ...
    ovs_version: "3.1.3"

Communication between pods on different nodes

Next we create a pod, ovn-default-node2-pod3, scheduled on node2:

$ cat ovn-default-node2-pod3.yaml
apiVersion: v1
kind: Pod
metadata:
  name: ovn-default-node2-pod3
spec:
  nodeSelector:
    kubernetes.io/hostname: node2
  containers:
    - name: nginx
      image: docker.io/library/nginx:alpine
      
$ kubectl get po -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default ovn-default-node1-pod1 1/1 Running 0 99m 10.16.0.8 node1 <none> <none>
default ovn-default-node2-pod3 1/1 Running 0 5s 10.16.0.12 node2 <none> <none>
...

Verify the connectivity between ovn-default-node1-pod1 and ovn-default-node2-pod3:

$ kubectl exec -it ovn-default-node1-pod1 -- ping -c 4 10.16.0.12
PING 10.16.0.12 (10.16.0.12): 56 data bytes
64 bytes from 10.16.0.12: seq=0 ttl=64 time=2.875 ms
64 bytes from 10.16.0.12: seq=1 ttl=64 time=0.763 ms
64 bytes from 10.16.0.12: seq=2 ttl=64 time=0.668 ms
64 bytes from 10.16.0.12: seq=3 ttl=64 time=0.618 ms

--- 10.16.0.12 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.618/1.231/2.875 ms

Because ovn-default-node1-pod1 and ovn-default-node2-pod3 are scheduled on different nodes, they cannot communicate through the local bridge alone; the traffic has to go through an overlay tunnel, which in Kube-OVN is GENEVE by default.
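
Before capturing, we can sanity-check that the GENEVE tunnel device exists on the node. OVS usually exposes it as genev_sys_6081 (6081 is the GENEVE UDP port); the device name is an assumption based on standard OVS behavior:

$ ip -d link show genev_sys_6081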

We ping ovn-default-node2-pod3 from ovn-default-node1-pod1 again, then capture packets on node2 and observe the packet structure.

# node1
$ kubectl exec -it ovn-default-node1-pod1 -- ping -c 1 10.16.0.12
PING 10.16.0.12 (10.16.0.12): 56 data bytes
64 bytes from 10.16.0.12: seq=0 ttl=64 time=2.875 ms

# node2
$ tcpdump -i enp1s0 geneve
...
10:41:25.322179 IP 192.168.31.29.36694 > node2.6081: Geneve, Flags [C], vni 0x3, options [8 bytes]: IP 10.16.0.8 > 10.16.0.12: ICMP echo request, id 156, seq 0, length 64
10:41:25.323089 IP node2.25197 > 192.168.31.29.6081: Geneve, Flags [C], vni 0x3, options [8 bytes]: IP 10.16.0.12 > 10.16.0.8: ICMP echo reply, id 156, seq 0, length 64
...

In the captured packets, the outer source and destination IPs are both host node addresses, while the inner packet encapsulated by GENEVE carries the pod IPs. This shows that cross-node pod communication is implemented over GENEVE tunnels (a more detailed analysis will follow in a later article).
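
The tunnel is also visible from the OVS side: br-int on each node carries a port of type geneve pointing at the peer node. A quick, hedged way to spot it (port names will differ in your environment):

$ kubectl ko vsctl node1 show | grep -B 2 'type: geneve'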

Pod access to the external network

We use ovn-default-node1-pod1 to verify external network access. We can see that a newly created pod in the default subnet can reach the external network without any extra configuration.

$ kubectl exec -it ovn-default-node1-pod1 -- ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=1 ttl=106 time=165.620 ms
64 bytes from 8.8.8.8: seq=4 ttl=106 time=164.629 ms

First, on the host, capture packets on the veth interface associated with ovn-default-node1-pod1. The source IP is still the pod IP, 10.16.0.8:

$ tcpdump -i ee7cde2f0a44_h icmp
11:07:58.695344 IP 10.16.0.8 > dns.google: ICMP echo request, id 174, seq 0, length 64
11:07:58.861406 IP dns.google > 10.16.0.8: ICMP echo reply, id 174, seq 0, length 64
...

Then capture packets on the host’s physical NIC. The source IP has been rewritten from 10.16.0.8 to node1’s address, which means NAT translation has taken place here.

$ tcpdump -i enp1s0 icmp
11:08:28.451918 IP node1 > dns.google: ICMP echo request, id 12504, seq 3, length 64
11:08:28.616974 IP dns.google > node1: ICMP echo reply, id 12504, seq 3, length 64
...

Checking the iptables rules on the host, we can find the corresponding NAT rule: subnets under the default VPC ovn-cluster all reach the public network through NAT (masquerade).

$ iptables -t nat -L
OVN-MASQUERADE all -- anywhere anywhere match-set ovn40subnets-nat src ! match-set ovn40subnets dst

$ ipset list ovn40subnets-nat
Name: ovn40subnets-nat
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0xaa9c4022
Size in memory: 504
References: 2
Number of entries: 1
Members:
10.16.0.0/16

$ ipset list ovn40subnets
Name: ovn40subnets
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0x7d6b4e88
Size in memory: 552
References: 11
Number of entries: 2
Members:
100.64.0.0/16
10.16.0.0/16
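
This NAT behavior is driven by the subnet’s natOutgoing field, which we can confirm directly on the Subnet resource:

$ kubectl get subnet ovn-default -o jsonpath='{.spec.natOutgoing}'
true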

Custom subnet

So far we have only used the default subnet ovn-default; now let’s try creating a new subnet.

$ kubectl create namespace ns1
$ cat subnet.yaml
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  name: subnet1
spec:
  protocol: IPv4
  cidrBlock: 10.66.0.0/16
  excludeIps:
  - 10.66.0.1..10.66.0.10
  gateway: 10.66.0.1
  gatewayType: distributed
  natOutgoing: true
  routeTable: ""
  namespaces:
  - ns1 # bind this subnet to the ns1 namespace

$ kubectl apply -f subnet.yaml
$ kubectl get subnet
NAME PROVIDER VPC PROTOCOL CIDR PRIVATE NAT DEFAULT GATEWAYTYPE V4USED V4AVAILABLE V6USED V6AVAILABLE EXCLUDEIPS U2OINTERCONNECTIONIP
join ovn ovn-cluster IPv4 100.64.0.0/16 false false false distributed 2 65531 0 0 ["100.64.0.1"]
ovn-default ovn ovn-cluster IPv4 10.16.0.0/16 false true true distributed 7 65526 0 0 ["10.16.0.1"]
subnet1 ovn ovn-cluster IPv4 10.66.0.0/16 false true false distributed 0 65524 0 0 ["10.66.0.1..10.66.0.10"]

The newly created subnet1 subnet is associated with the ns1 namespace, and we create a pod in ns1:

$ cat subnet1-node1-pod1.yaml
apiVersion: v1
kind: Pod
metadata:
  name: subnet1-node1-pod1
  namespace: ns1
spec:
  nodeSelector:
    kubernetes.io/hostname: node1
  containers:
    - name: nginx
      image: docker.io/library/nginx:alpine
$ kubectl apply -f subnet1-node1-pod1.yaml

You can see that the pod created in the ns1 namespace is assigned an IP from the subnet1 CIDR.

$ kubectl get po -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
...
ns1 subnet1-node1-pod1 1/1 Running 0 5s 10.66.0.11 node1 <none> <none>
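
As with the default subnet, the allocation is also recorded in Kube-OVN’s IP CRD and can be inspected the same way (output omitted here):

$ kubectl get ip subnet1-node1-pod1.ns1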

Correspondingly, since subnet1’s natOutgoing is set to true, subnet1-node1-pod1 can also reach the public network.

$ kubectl exec -it subnet1-node1-pod1 -n ns1 -- ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=106 time=167.119 ms
64 bytes from 8.8.8.8: seq=2 ttl=106 time=164.704 ms

In the ipsets referenced by the iptables NAT rules, you can see the following changes, which is why the new subnet1 can also reach the public network through NAT.

$ ipset list ovn40subnets-nat
Name: ovn40subnets-nat
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0x4a8ca9d3
Size in memory: 552
References: 2
Number of entries: 2
Members:
10.66.0.0/16
10.16.0.0/16

$ ipset list ovn40subnets
Name: ovn40subnets
Type: hash:net
Revision: 7
Header: family inet hashsize 1024 maxelem 1048576 bucketsize 12 initval 0x58e8fa3a
Size in memory: 600
References: 11
Number of entries: 3
Members:
10.66.0.0/16
10.16.0.0/16
100.64.0.0/16
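
natOutgoing can also be toggled on an existing subnet at any time. For example, the following patch should disable outbound NAT for subnet1, after which the 10.66.0.0/16 entry would be removed from ovn40subnets-nat (shown here only as a sketch):

$ kubectl patch subnet subnet1 --type=merge -p '{"spec":{"natOutgoing":false}}'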

That covers the common usage of Kube-OVN subnets; the next article will introduce the use of VPCs.

If you found this helpful, feel free to follow my WeChat official account: Li Ruonian