Understanding Docker: Docker Networking

This series of articles will introduce related knowledge of Docker:

(1) Docker installation and basic usage

(2) Docker image

(3) Isolation of Docker containers – Use Linux namespace to isolate the running environment of the container

(4) Isolation of Docker containers – use cgroups to limit the resources used by the container

(5) Docker network

1. Docker network overview

Use a diagram to illustrate the basic overview of the Docker network:

2. Four single-node network modes

2.1 bridge mode

Docker containers use bridge mode networking by default. Its characteristics are as follows:

  • Uses a Linux bridge, by default docker0
  • Uses a veth pair; one end is in the container’s network namespace and the other end is attached to docker0
  • In this mode the Docker container does not have a public IP, because the host’s IP address and the veth pair’s IP address are not in the same network segment.
  • Docker uses NAT to bind the service’s listening port inside the container to a port on the host, so that the world outside the host can actively send network packets into the container.
  • When the outside world accesses a service in the container, it must use the host’s IP and the host’s port.
  • Since NAT operates at layer 3 of the network, it inevitably affects transmission efficiency.
  • The container has an independent, isolated network stack and communicates with the world outside the host through NAT.
  • For the principle of connecting containers to the external network through NAT, please refer to my other article Neutron Understanding (11): Using NAT to connect the Linux network namespace to the external network.

The SNAT rules in iptables cause the source IP address of packets leaving the container for the outside world to be rewritten to the IP address of the Docker host:

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- 172.17.0.0/16 0.0.0.0/0
MASQUERADE all -- 172.18.0.0/16 0.0.0.0/0
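
For the inbound direction, publishing a container port with -p adds a DNAT rule to the DOCKER chain in the nat table. A minimal sketch of what this looks like; the container address 172.17.0.2 below is an illustrative value, not one taken from this environment:

docker run -d -p 5000:5000 training/webapp python app.py
iptables -t nat -nL DOCKER
# expect a rule along these lines:
# DNAT  tcp  --  0.0.0.0/0  0.0.0.0/0  tcp dpt:5000 to:172.17.0.2:5000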

The effect is like this:


Schematic diagram:

2.2 Host mode

Definition:

Host mode does not create an isolated network environment for the container. It is called host mode because a Docker container in this mode shares the same network namespace as the host, so the container can use the host’s eth0 to communicate with the outside world just like the host itself. In other words, the IP address of the container is the IP address of the host’s eth0. Its features include:

    • Containers in this mode do not have an isolated network namespace
    • The IP address of the container is the same as the IP address of the Docker host
    • It should be noted that the port number of the service in the container cannot conflict with the port number already used on the Docker host.
    • host mode can coexist with other modes

Experiment:

(1) Start a container in host network mode

docker run -d --name hostc1 --network host -p 5001:5001 training/webapp python app.py

(2) Check its network namespace, where you can see all network devices on the host


root@docker2:/home/sammy# ln -s /proc/28353/ns/net /var/run/netns/hostc1
root@docker2:/home/sammy# ip netns
hostc1
root@docker2:/home/sammy# ip netns exec hostc1
No command specified
root@docker2:/home/sammy# ip netns exec hostc1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:d4:66:75 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.20/24 brd 192.168.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fed4:6675/64 scope link
       valid_lft forever preferred_lft forever
......

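Another quick way to confirm that hostc1 really shares the host’s network namespace is to compare namespace identifiers (a sketch, reusing the PID 28353 from the example above):

readlink /proc/1/ns/net       # network namespace of the host's init process
readlink /proc/28353/ns/net   # network namespace of the hostc1 process
# both commands should print the same net:[...] inode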

Schematic diagram:

2.3 container mode

Definition:

Container network mode is a special network mode in Docker. A container in this mode shares the network environment of another container, so there is no network isolation between the two of them, while both remain isolated from the host and from all other containers.

Experiment:

(1) Start a container:

docker run -d --name hostcs1 -p 5001:5001 training/webapp python app.py

(2) Start another container and use the network namespace of the first container

docker run -d --name hostcs2 --network container:hostcs1 training/webapp python app.py

Note: Because the two containers now share one network namespace, you need to watch out for port conflicts; otherwise the second container will fail to start.
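
To see the sharing concretely, you can compare the network namespaces of the two containers from the host. This is only a sketch using docker inspect, with the container names from the experiment above:

docker inspect -f '{{.HostConfig.NetworkMode}}' hostcs2    # shows container:<id of hostcs1>
pid1=$(docker inspect -f '{{.State.Pid}}' hostcs1)
pid2=$(docker inspect -f '{{.State.Pid}}' hostcs2)
readlink /proc/$pid1/ns/net /proc/$pid2/ns/net             # both print the same net:[...] inode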

Schematic diagram:

2.4 none mode

Definition:

In none mode, Docker does not construct any network environment for the container. Once a container adopts the none network mode, only the loopback device is available inside it; there are no other network resources, and the container can only use the local 127.0.0.1 network.

Experiment:

(1) Create and start a container: docker run -d --name hostn1 --network none training/webapp python app.py

(2) Check its network devices. There is nothing apart from the loopback device.


root@docker2:/home/sammy# ip netns exec hostn1 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

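Out of the box such a container can only talk to itself over loopback, but nothing prevents wiring it up by hand. Below is a minimal sketch of attaching the hostn1 container to docker0 with a veth pair; the device names veth-h/veth-c and the address 172.17.0.100/16 are made up for illustration and must match your docker0 subnet:

# expose the container's network namespace to `ip netns`
pid=$(docker inspect -f '{{.State.Pid}}' hostn1)
mkdir -p /var/run/netns
ln -s /proc/$pid/ns/net /var/run/netns/hostn1

# create a veth pair, attach one end to docker0, move the other into the container
ip link add veth-h type veth peer name veth-c
ip link set veth-h master docker0
ip link set veth-h up
ip link set veth-c netns hostn1

# rename the container end to eth0, give it an address and a default route
ip netns exec hostn1 ip link set veth-c name eth0
ip netns exec hostn1 ip addr add 172.17.0.100/16 dev eth0
ip netns exec hostn1 ip link set eth0 up
ip netns exec hostn1 ip route add default via 172.17.0.1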

3. Multi-node Docker network

Docker multi-node networking can be divided into two categories. One is the VxLAN-based native support for cross-node (overlay) networks that Docker introduced in version 1.9; the other is third-party solutions provided through plugins, such as Flannel and Calico.

3.1 Docker native overlay network

Docker version 1.9 added native support for overlay networking. Docker supports three distributed key-value stores: Consul, Etcd, and ZooKeeper. Among them, etcd is a highly available distributed k/v storage system; it is mainly intended for control data, and for application data it is recommended only when the data volume is small but updates and reads are frequent.

3.1.1 Installation configuration

Prepare three nodes:

  • devstack 192.168.1.18
  • docker1 192.168.1.21
  • docker2 192.168.1.19

Use Docker to start the etcd container on devstack:


export HostIP="192.168.1.18"
docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
 --name etcd quay.io/coreos/etcd \
 /usr/local/bin/etcd \
 -name etcd0 \
 -advertise-client-urls http://${HostIP}:2379,http://${HostIP}:4001 \
 -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
 -initial-advertise-peer-urls http://${HostIP}:2380 \
 -listen-peer-urls http://0.0.0.0:2380 \
 -initial-cluster-token etcd-cluster-1 \
 -initial-cluster etcd0=http://${HostIP}:2380 \
 -initial-cluster-state new


To start etcd using Docker, please refer to https://coreos.com/etcd/docs/latest/docker_guide.html. However, presumably because of how the image’s Dockerfile is written, the command given on the official website fails to start the container, because it is missing the explicit /usr/local/bin/etcd invocation shown above:

 b847195507addf4fb5a01751eb9c4101416a13db4a8a835e1c2fa1db1e6f364e
docker: Error response from daemon: oci runtime error: exec: "-name": executable file not found in $PATH.

After adding the /usr/local/bin/etcd line, the container is created correctly:

root@devstack:/# docker exec -it 179cd52b494d /usr/local/bin/etcdctl cluster-health
member 5d72823aca0e00be is healthy: got healthy result from http://:2379
cluster is healthy


root@devstack:/home/sammy# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
179cd52b494d quay.io/coreos/etcd "/usr/local/bin/etcd " 8 seconds ago Up 8 seconds 0.0.0.0:2379-2380->2379-2380/tcp, 0.0.0.0:4001->4001/tcp etcd
root@devstack:/home/sammy# netstat -nltp | grep 2380
tcp6 0 0 :::2380 :::* LISTEN 4072/docker-proxy
root@devstack:/home/sammy# netstat -nltp | grep 4001
tcp6 0 0 :::4001 :::* LISTEN 4047/docker-proxy


Modify /etc/default/docker on docker1 and docker2 nodes and add:

DOCKER_OPTS="--cluster-store=etcd://192.168.1.18:2379 --cluster-advertise=192.168.1.20:2379"

Then restart the docker daemon on each node (each node advertising its own IP). Note that an IP address must be used; if a hostname is used, the docker service will fail to start:

root@docker2:/home/sammy# docker ps
An error occurred trying to connect: Get http:///var/run/docker.sock/v1.24/containers/json: read unix @->/var/run/docker.sock: read: connection reset by peer
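
Once the daemon does come up with these options, a quick sanity check (just a sketch of what I would expect on this Docker version) is:

docker info | grep -i cluster
# should show something like:
#  Cluster store: etcd://192.168.1.18:2379
#  Cluster advertise: 192.168.1.20:2379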
  
3.1.2 Using Docker overlay network

(1) Run the following command on docker1 to create an overlay network:
root@docker1:/home/sammy# docker network create -d overlay overlaynet1
1de982804f632169380609b9be7c1466b0064dce661a8f4c9e30d781e79fc45a
root@docker1:/home/sammy# docker network inspect overlaynet1
[
    {
        "Name": "overlaynet1",
        "Id": "1de982804f632169380609b9be7c1466b0064dce661a8f4c9e30d781e79fc45a",
        "Scope": "global",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1/24"
                }
            ]
        },
        "Internal": false,
        "Containers": {},
        "Options": {},
        "Labels": {}
    }
]


You will also see this network on docker2, which shows that, through etcd, the network metadata is distributed across nodes rather than stored only locally.
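
A quick way to confirm this is simply to look at the networks on docker2 (a sketch, assuming the network created above):

docker network ls                     # overlaynet1 appears with the overlay driver
docker network inspect overlaynet1    # same Id and Subnet as shown on docker1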

(2) Create a container in the network

On docker2, run docker run -d --name over2 --network overlaynet1 training/webapp python app.py

On docker1, run docker run -d --name over1 --network overlaynet1 training/webapp python app.py

Enter the container over2 and find that it has two network cards:


root@docker2:/home/sammy# ln -s /proc/23576/ns/net /var/run/netns/over2
root@docker2:/home/sammy# ip netns
over2
root@docker2:/home/sammy# ip netns exec over2 ip a

22: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 02:42:0a:00:00:02 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.2/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:aff:fe00:2/64 scope link
       valid_lft forever preferred_lft forever
24: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:13:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.19.0.2/16 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe13:2/64 scope link
       valid_lft forever preferred_lft forever


The eth1 network is an internal network segment that in fact uses the ordinary NAT mode, while eth0 carries the IP address assigned from the overlay segment, i.e. it uses the overlay network. Note that its MTU is 1450 rather than 1500, which leaves room for the VXLAN encapsulation header.

Looking further at its routing table, you will find that only communication between containers in the same overlay network will pass through eth0, and all other communications will still go through eth1.


root@docker2:/home/sammy# ip netns exec over2 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 172.19.0.1 0.0.0.0 UG 0 0 0 eth1
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
172.19.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth1


First look at the network topology diagram at this time:

From the topology we can see:

  • Docker creates two linux bridges on each node, one for the overlay network (ov-000100-1de98) and one for the non-overlay NAT network (docker_gwbridge)
  • Network traffic within the container to other containers in the overlay network goes through the overlay network card (eth0), and other network traffic goes through the NAT network card (eth1)
  • Currently, the ID range for Docker to create vxlan tunnels is 256~1000, so up to 745 networks can be created. Therefore, the ID used for the vxlan tunnel in this example is 256
  • The Docker vxlan driver uses UDP port 4789 (see the capture sketch after this list)
  • The bottom layer of the overlay network model requires a KV storage system like consul or etcd for message synchronization.
  • Docker overlay does not use multicast
  • Containers in the Overlay network are in a virtual large layer 2 network
  • Regarding linux bridge + vxlan networking, please refer to Neutron Understanding (14): Neutron ML2 + Linux bridge + VxLAN networking
  • For Linux network namespace + NAT networking, please refer to Neutron Understanding (11): Use NAT to connect the Linux network namespace to the external network
  • The code on github is here https://github.com/docker/libnetwork/blob/master/drivers/overlay/
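
To actually see the VXLAN encapsulation mentioned in the list above, you can capture UDP port 4789 on the host’s physical NIC while traffic flows between containers on the overlay network. This is only a sketch; eth0 here means the host NIC, which may be named differently on your machine:

tcpdump -nn -i eth0 udp port 4789
# each inter-container packet appears as an outer UDP/4789 datagram between
# the two host IPs, carrying the inner container-to-container Ethernet frame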

Initial state of the vxlan device vx-000100-1de98 attached to ov-000100-1de98:


root@docker1:/home/sammy# ip -d link show dev vx-000100-1de98
8: vx-000100-1de98: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master ov-000100-1de98 state UNKNOWN mode DEFAULT group default
    link/ether 22:3c:3f:8f:94:f6 brd ff:ff:ff:ff:ff:ff promiscuity 1
    vxlan id 256 port 32768 61000 proxy l2miss l3miss ageing 300
root@docker1:/home/sammy# bridge fdb show dev vx-000100-1de98
22:3c:3f:8f:94:f6 vlan 0 permanent


An obvious problem here is that the fdb table of the vxlan device vx-000100-1de98 is incomplete, which causes pings from container 1 to container 2 to fail. The possible solutions are essentially the following:

  • Use a central database that stores the mapping between the IP addresses of all containers and the IP addresses of the nodes where they are located.
  • Use multicast
  • Use special protocols such as BGP to advertise the mapping relationship between the IP of the container and the IP of the node where it is located.

Docker uses a combination of the first and third methods to some extent: it saves the IP address mappings in a distributed key/value store such as Consul or etcd, and Docker nodes also advertise the mappings to each other directly through some protocol.

For testing, I restarted the docker1 node and found that the over1 container could not be started. The error was reported as follows:

docker: Error response from daemon: network sandbox join failed: could not get network sandbox (oper true): failed get network namespace "": no such file or directory.

According to https://github.com/docker/docker/issues/25215, this is a bug in Docker and the fix has just been rolled out. One workaround is to recreate the overlay network.

Returning to the problem of the containers being unable to ping each other, we still do not know the root cause (I do want to complain about Docker’s current rough edges). For the pings to work, at least the following conditions must be met (the corresponding commands are sketched after the lists):

On docker1,

  • Add an fdb entry for vxlan dev: 02:42:14:00:00:03 dst 192.168.1.20 self
  • Add an arp entry in the container: ip netns exec over1 arp -s 20.0.0.3 02:42:14:00:00:03

On docker 2,

  • Add an fdb entry for vxlan dev: 02:42:14:00:00:02 dst 192.168.1.21 self permanent
  • Add an arp entry in the container: ip netns exec over4 arp -s 20.0.0.2 02:42:14:00:00:02
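
A sketch of the corresponding commands (the MAC and IP values are the ones listed above; the vxlan device name vx-000100-1de98 and the netns names over1/over4 follow the earlier examples, and the device name on docker2 may differ):

# on docker1
bridge fdb add 02:42:14:00:00:03 dev vx-000100-1de98 dst 192.168.1.20 self
ip netns exec over1 arp -s 20.0.0.3 02:42:14:00:00:03

# on docker2
bridge fdb add 02:42:14:00:00:02 dev vx-000100-1de98 dst 192.168.1.21 self permanent
ip netns exec over4 arp -s 20.0.0.2 02:42:14:00:00:02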

4. Network performance comparison

4.1 Data in my test environment

Use the iperf tool to test the performance and compare:

Type                                                   TCP              UDP
Between two containers on the overlay network (A)      913 Mbits/sec    1.05 Mbits/sec
Between two containers on the bridge/NAT network (B)   1.73 Gbits/sec
Host to host (C)                                       2.06 Gbits/sec   1.05 Mbits/sec
Host to a bridge-mode container on another host (D)    1.88 Gbits/sec
Host to a container on the same host (E)               20.5 Gbits/sec
Host to a host-mode container on another host (F)      2.02 Gbits/sec   1.05 Mbits/sec
Overlay efficiency (A/C)                               44%              100% ?
Single-NAT efficiency (D/C)                            91%
Double-NAT efficiency (B/C)                            83%
Host-mode efficiency (F/C)                             98%              100%

The two hosts are two virtual machines on the same physical machine, so the absolute values are not particularly meaningful in themselves, but the relative values do have some reference value.
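
For reference, these numbers came from plain iperf runs along the following lines (the exact options are my assumption rather than something recorded above):

iperf -s                  # TCP server on one end
iperf -c <server-ip>      # TCP test from the other end
iperf -s -u               # UDP server
iperf -c <server-ip> -u   # UDP test; iperf's default UDP rate is about 1 Mbit/s,
                          # which is why the UDP column sits around 1.05 Mbits/sec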

4.2 Comparative data in online articles

The article Testing Docker multi-host network performance compares the performance of various network modes (its results table is not reproduced here). Its data does not differ much from the data in my table above.

4.3 Simple conclusions about Docker network mode selection

  • The performance loss of Bridge mode is about 10%
  • The performance penalty of native overlay mode is very high, even reaching 56%, so you need to be very cautious when using this mode in a production environment.
  • If you must use an overlay network, you can consider Calico, whose performance is close to that of bridge mode.
  • The performance data for Weave overlay mode is very questionable and should not be so bad.
