Linux virtual network devices: characteristics of tun/tap and veth

In the current cloud era, virtual machines and containers are everywhere, and the network management behind them is inseparable from virtual network devices. Therefore, understanding virtual network devices will help us better understand the network structure in the cloud era. Starting from this article, we will introduce virtual network devices under Linux.

The difference between virtual devices and physical devices

Two earlier articles, on the receiving process and the sending process of Linux network data packets, introduced how packets are sent and received. From them we know that the Linux kernel has a network device management layer that sits between the network device drivers and the protocol stack and is responsible for the data exchange between them. The driver does not need to know the details of the protocol stack, and the protocol stack does not need to know the details of the device driver.

A network device is like a pipe: it has two ends, and data received at either end is sent out of the other.

For example, a physical network card eth0 has two ends: the kernel protocol stack (which it talks to indirectly, through the kernel network device management module) and the external physical network. Data received from the physical network is forwarded to the kernel protocol stack, and data that applications send through the protocol stack is sent out over the physical network.

What about a virtual network device? First of all, it is also managed by the network device management subsystem of the kernel. To the Linux kernel network device management module, there is no difference between virtual devices and physical devices. Both are network devices and both can be configured with an IP. Data coming from the network device is forwarded to the protocol stack, and data coming from the protocol stack is sent out by the network device. How it is sent and where it goes is a matter for the device driver and is of no concern to the device management module. So one end of a virtual network device is also the protocol stack, and what is on the other end depends on the driver implementation of the virtual network device.

What is the other end of tun/tap?

Let’s start with a diagram:

+----------------------------------------------------------------+
|                                                                |
|  +--------------------+      +--------------------+            |
|  | User Application A |      | User Application B |<-----+     |
|  +--------------------+      +--------------------+      |     |
|               | 1                    | 5                 |     |
|...............|......................|...................|.....|
|               ↓                      ↓                   |     |
|         +----------+           +----------+              |     |
|         | socket A |           | socket B |              |     |
|         +----------+           +----------+              |     |
|                 | 2               | 6                    |     |
|.................|.................|......................|.....|
|                 ↓                 ↓                      |     |
|             +------------------------+                 4 |     |
|             | Network Protocol Stack |                   |     |
|             +------------------------+                   |     |
|                | 7                 | 3                   |     |
|................|...................|.....................|.....|
|                ↓                   ↓                     |     |
|        +----------------+    +----------------+          |     |
|        |      eth0      |    |      tun0      |          |     |
|        +----------------+    +----------------+          |     |
|    10.32.0.11  |                   |   192.168.3.11      |     |
|                | 8                 +---------------------+     |
|                |                                               |
+----------------|-----------------------------------------------+
                 ↓
         Physical Network

In the picture above there are two applications, A and B, both in user space, while the sockets, the protocol stack (the Network Protocol Stack box) and the network devices (eth0 and tun0) are all in kernel space. Sockets are in fact part of the protocol stack; they are drawn separately here only to make the picture more intuitive.

tun0 is a tun/tap virtual device. The figure shows how it differs from the physical device eth0. One end of each is connected to the protocol stack, but the other ends are different. The other end of eth0 is the physical network, which may be a switch, while the other end of tun0 is a user-space program. Data packets that the protocol stack sends to tun0 can be read by that application, and the application can write data directly to tun0.

It is assumed here that the IP configured on eth0 is 10.32.0.11, and the IP configured on tun0 is 192.168.3.11.

What is shown here is a typical application scenario of the tun/tap device. Data destined for the 192.168.3.0/24 network is tunnelled by program B: it is sent from 10.32.0.11 to 10.33.0.1 on the remote network, and 10.33.0.1 then forwards it to the corresponding device. This is how a VPN is implemented.

Let’s take a look at the flow of the data packet:

  1. Application A is an ordinary program that sends a data packet through socket A. Assume that the destination IP address of this data packet is 192.168.3.1

  2. The socket hands this data packet to the protocol stack.

  3. The protocol stack matches the destination IP address of the data packet against the local routing rules, determines that the packet should go out through tun0, and hands it to tun0.

  4. When tun0 receives the data packet, it finds that its other end has been opened by process B, so it passes the data packet to process B.

  5. After process B receives the data packet, it does some business-related processing, then constructs a new data packet, embeds the original data packet in it, and finally sends the new data packet through socket B. The source address of the new data packet is the address of eth0, and its destination IP address is another address, such as 10.33.0.1.

  6. Socket B hands the data packet to the protocol stack.

  7. Based on the local routing rules, the protocol stack determines that this data packet should be sent through eth0, so it hands the data packet to eth0.

  8. eth0 sends the packet out through the physical network

After 10.33.0.1 receives the data packet, it opens it, reads the original data packet inside, and forwards that to 192.168.3.1 on its local network. When the response from 192.168.3.1 arrives, it constructs a new response packet, encapsulates the original response packet inside it, and returns it to application B along the original path. Application B takes out the original response packet, which is finally returned to application A.
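The encapsulation in step 5 and the unwrapping done by 10.33.0.1 can be sketched in a few lines of Python. This is only a toy model, not how any particular VPN formats its packets: the Packet class and the encapsulate and decapsulate helpers are hypothetical names introduced for illustration, and the source address of the original packet (192.168.3.11, the address of tun0) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object  # bytes for application data, or another Packet when tunnelled


TUNNEL_LOCAL = "10.32.0.11"   # address of eth0 in the figure
TUNNEL_REMOTE = "10.33.0.1"   # remote tunnel endpoint named in the text


def encapsulate(inner: Packet) -> Packet:
    """Roughly what application B does in step 5: wrap the packet read from tun0."""
    return Packet(src=TUNNEL_LOCAL, dst=TUNNEL_REMOTE, payload=inner)


def decapsulate(outer: Packet) -> Packet:
    """Roughly what the remote endpoint does: recover the original packet."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not a tunnelled packet")
    return outer.payload


# The packet application A sent towards 192.168.3.1 (source address assumed).
original = Packet(src="192.168.3.11", dst="192.168.3.1", payload=b"hello")
outer = encapsulate(original)
```

The outer packet carries the eth0 address as its source and 10.33.0.1 as its destination, while the inner packet is left untouched, which is why the remote side can forward it to 192.168.3.1 as-is.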

We will not discuss here how the Tun/Tap device tun0 communicates with user-level process B. For the Linux kernel, there are many ways to exchange data between kernel space and user-space processes.

As the process above shows, which network device a data packet goes out through is decided entirely by the routing table. So if we want some network traffic to pass through application B, we need to configure the routing table so that this traffic goes through tun0.
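To make the role of the routing table concrete, here is a minimal sketch of a longest-prefix-match lookup. The ROUTES table is hypothetical: it mirrors the figure, but the default route and the /24 prefix length for the 10.32.0.0 network are assumptions, and pick_device is an illustrative helper rather than a kernel interface.

```python
import ipaddress

# Hypothetical routing table mirroring the figure: (destination network, device).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),       # default route (assumed)
    (ipaddress.ip_network("10.32.0.0/24"), "eth0"),    # prefix length assumed
    (ipaddress.ip_network("192.168.3.0/24"), "tun0"),  # traffic meant for application B
]


def pick_device(dst: str) -> str:
    """Return the device of the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, dev) for net, dev in ROUTES if addr in net]
    # The route with the longest prefix wins.
    _, dev = max(matches, key=lambda m: m[0].prefixlen)
    return dev
```

With this table, pick_device("192.168.3.1") selects tun0 (step 3 of the flow), while the encapsulated packet addressed to 10.33.0.1 falls through to the default route and leaves via eth0 (step 7).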

What are tun/tap devices used for?

As the process above shows, the purpose of a tun/tap device is to hand some of the data packets in the protocol stack to a user-space application, giving that program a chance to process them. Commonly needed functions such as data compression and encryption can therefore be implemented in application B. The most common use of tun/tap devices is VPN, including tunnels and application-layer IPsec. A well-known project is VTun, which is worth a look if you are interested.

The difference between tun and tap

Through a tun device, user-space programs can only read and write IP data packets, while through a tap device they can read and write link-layer data packets. This is similar to the difference between ordinary sockets and raw sockets: the format of the data packets being handled is different.
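In practice a program chooses between the two modes with a flag when it attaches to the device. The sketch below builds the request structure for that call. The constants come from the kernel header linux/if_tun.h, but the TUNSETIFF value shown is the x86/ARM encoding and the 40-byte structure size assumes 64-bit Linux; build_ifreq and open_device are hypothetical helper names. open_device needs root privileges and /dev/net/tun, so it is defined but not called here.

```python
import fcntl
import os
import struct

# Constants from <linux/if_tun.h> (TUNSETIFF value as encoded on x86/ARM).
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001    # layer 3: the program reads and writes IP packets
IFF_TAP = 0x0002    # layer 2: the program reads and writes Ethernet frames
IFF_NO_PI = 0x1000  # do not prepend the extra packet-information header


def build_ifreq(name: str, mode: str) -> bytes:
    """Pack an ifreq-like structure: 16-byte name, flags, padding to 40 bytes."""
    flags = {"tun": IFF_TUN, "tap": IFF_TAP}[mode] | IFF_NO_PI
    return struct.pack("16sH22x", name.encode(), flags)


def open_device(name: str, mode: str) -> int:
    """Attach to a tun or tap device and return a file descriptor (needs root)."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name, mode))
    return fd
```

A descriptor obtained with mode "tun" yields packets that start with an IP header, while one obtained with mode "tap" yields frames that start with an Ethernet header.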

Features of veth devices

  • Like other network devices, a veth device has one end connected to the kernel protocol stack.
  • veth devices always come in pairs, and the other ends of the two devices are connected to each other.
  • When one device receives a send request from the protocol stack, it delivers the data to the other device of the pair.
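These three properties can be captured in a small toy model. It is only a sketch of the behaviour, not of the kernel driver: VethEnd and create_pair are hypothetical names, and the received list stands in for delivery to the protocol stack.

```python
class VethEnd:
    """One end of a veth pair; `received` stands in for delivery to the protocol stack."""

    def __init__(self, name):
        self.name = name
        self.peer = None
        self.received = []

    def transmit(self, frame: bytes) -> None:
        # Whatever the protocol stack sends on this end is received by the peer.
        self.peer.received.append(frame)


def create_pair(name_a, name_b):
    """Create two ends that are wired to each other, as veth devices always are."""
    a, b = VethEnd(name_a), VethEnd(name_b)
    a.peer, b.peer = b, a
    return a, b


veth0, veth1 = create_pair("veth0", "veth1")
veth0.transmit(b"arp who-has 192.168.2.1")
```

After the call, the frame sits in veth1.received and nothing has arrived on veth0, which is the behaviour the packet captures later in this article show.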

The following relationship diagram clearly illustrates the characteristics of veth devices:

+----------------------------------------------------------------+
|                                                                |
|       +------------------------------------------------+       |
|       |             Network Protocol Stack             |       |
|       +------------------------------------------------+       |
|              ↑               ↑               ↑                 |
|..............|...............|...............|.................|
|              ↓               ↓               ↓                 |
|        +----------+    +-----------+   +-----------+           |
|        |   eth0   |    |   veth0   |   |   veth1   |           |
|        +----------+    +-----------+   +-----------+           |
|192.168.1.11  ↑               ↑               ↑                 |
|              |               +---------------+                 |
|              |         192.168.2.11     192.168.2.1            |
+--------------|-------------------------------------------------+
               ↓
       Physical Network

In the above figure, the IP we configured for the physical network card eth0 is 192.168.1.11, and the IPs of veth0 and veth1 are 192.168.2.11 and 192.168.2.1 respectively.

Let’s take a step-by-step look at the characteristics of veth devices through examples.

Configure IP for only one veth device

First add veth0 and veth1 through the ip link command, then configure the IP of veth0, and start both devices.

dev@debian:~$ sudo ip link add veth0 type veth peer name veth1
dev@debian:~$ sudo ip addr add 192.168.2.11/24 dev veth0
dev@debian:~$ sudo ip link set veth0 up
dev@debian:~$ sudo ip link set veth1 up

The reason why the IP is not configured for the veth1 device here is to see if veth0 will forward the data from the protocol stack to veth1 when veth1 does not have an IP.

Ping 192.168.2.1. Since veth1 has not been configured with an IP, it will definitely not work.

dev@debian:~$ ping -c 4 192.168.2.1
PING 192.168.2.1 (192.168.2.1) 56(84) bytes of data.
From 192.168.2.11 icmp_seq=1 Destination Host Unreachable
From 192.168.2.11 icmp_seq=2 Destination Host Unreachable
From 192.168.2.11 icmp_seq=3 Destination Host Unreachable
From 192.168.2.11 icmp_seq=4 Destination Host Unreachable

--- 192.168.2.1 ping statistics ---
4 packets transmitted, 0 received, + 4 errors, 100% packet loss, time 3015ms
pipe 3

But why does the ping fail, and at which step?

Let’s first look at the packet captures. As the output below shows, veth0 and veth1 both saw the same ARP request packets, but no ARP response packet appeared:

dev@debian:~$ sudo tcpdump -n -i veth0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:20:18.285230 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:19.282018 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:20.282038 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:21.300320 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:22.298783 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:23.298923 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28

dev@debian:~$ sudo tcpdump -n -i veth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:20:48.570459 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:49.570012 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:50.570023 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:51.570023 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:52.569988 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28
20:20:53.570833 ARP, Request who-has 192.168.2.1 tell 192.168.2.11, length 28

Why is this so? It becomes clear once you understand what happens behind the ping:

  1. The ping process constructs an ICMP echo request packet and sends it to the protocol stack through the socket.
  2. Based on the destination IP address and the system routing table, the protocol stack determines that the data packet destined for 192.168.2.1 should go out through veth0, the device configured with 192.168.2.11.
  3. Since this is the first access to 192.168.2.1, and the destination IP and the local IP are in the same network segment, the protocol stack first sends an ARP request asking for the MAC address of 192.168.2.1.
  4. The protocol stack hands the ARP packet to veth0 to be sent out.
  5. Since the other end of veth0 is connected to veth1, the ARP request packet is forwarded to veth1.
  6. After veth1 receives the ARP packet, it passes it to the protocol stack at its other end.
  7. The protocol stack checks its own device list, finds that the IP 192.168.2.1 does not exist locally, and discards the ARP request packet. This is why only ARP request packets can be seen, and no response packets.
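The decision in step 7 boils down to a membership test on the locally configured addresses. The following is a deliberately simplified model of that check, not kernel code; handle_arp_request is a hypothetical helper, and the two address sets describe the host before and after veth1 gets its IP.

```python
def handle_arp_request(target_ip, local_ips):
    """Answer an ARP request only if the requested IP is configured on this host."""
    return "reply" if target_ip in local_ips else "drop"


# Only veth0 has an address (eth0 and lo are left out for brevity).
before = {"192.168.2.11"}
# After also running: ip addr add 192.168.2.1/24 dev veth1
after = before | {"192.168.2.1"}
```

With the first set, a request for 192.168.2.1 is dropped, matching the captures above. With the second set the same request would be answered.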

Configure IPs for both veth devices

Now configure an IP for veth1 as well:

dev@debian:~$ sudo ip addr add 192.168.2.1/24 dev veth1

The ping to 192.168.2.1 now succeeds. Because 192.168.2.1 is a local IP, packets to it go through the lo device by default. To avoid that, the ping command is run here with the -I parameter, which makes the data packets go out through the specified device:

dev@debian:~$ ping -c 4 192.168.2.1 -I veth0
PING 192.168.2.1 (192.168.2.1) from 192.168.2.11 veth0: 56(84) bytes of data.
64 bytes from 192.168.2.1: icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from 192.168.2.1: icmp_seq=2 ttl=64 time=0.048 ms
64 bytes from 192.168.2.1: icmp_seq=3 ttl=64 time=0.055 ms
64 bytes from 192.168.2.1: icmp_seq=4 ttl=64 time=0.050 ms

--- 192.168.2.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3002ms
rtt min/avg/max/mdev = 0.032/0.046/0.055/0.009 ms

Note: On non-Debian systems the ping may fail at this point. This is mainly because some kernel settings (such as accept_local and rp_filter) cause veth1 not to return ARP response packets. This happens on Ubuntu, for example. The workaround is as follows:

root@ubuntu:~# echo 1 > /proc/sys/net/ipv4/conf/veth1/accept_local
root@ubuntu:~# echo 1 > /proc/sys/net/ipv4/conf/veth0/accept_local
root@ubuntu:~# echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter
root@ubuntu:~# echo 0 > /proc/sys/net/ipv4/conf/veth0/rp_filter
root@ubuntu:~# echo 0 > /proc/sys/net/ipv4/conf/veth1/rp_filter

Now look at the packet captures. ICMP echo request packets appear on both veth0 and veth1, but there are no response packets. Yet the output above shows that the ping process did receive the responses, so where did they go?

dev@debian:~$ sudo tcpdump -n -i veth0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:23:43.113062 IP 192.168.2.11 > 192.168.2.1: ICMP echo request, id 24169, seq 1, length 64
20:23:44.112078 IP 192.168.2.11 > 192.168.2.1: ICMP echo request, id 24169, seq 2, length 64
20:23:45.111091 IP 192.168.2.11 > 192.168.2.1: ICMP echo request, id 24169, seq 3, length 64
20:23:46.110082 IP 192.168.2.11 > 192.168.2.1: ICMP echo request, id 24169, seq 4, length 64


dev@debian:~$ sudo tcpdump -n -i veth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth1, link-type EN10MB (Ethernet), capture size 262144 bytes
20:24:12.221372 IP 192.168.2.11 > 192.168.2.1: ICMP echo request, id 24174, seq 1, length 64
20:24:13.222089 IP 192.168.2.11 > 192.168.2.1: ICMP echo request, id 24174, seq 2, length 64
20:24:14.224836 IP 192.168.2.11 > 192.168.2.1: ICMP echo request, id 24174, seq 3, length 64
20:24:15.223826 IP 192.168.2.11 > 192.168.2.1: ICMP echo request, id 24174, seq 4, length 64

You will understand if you look at the flow of the data packet:

  1. The ping process constructs an ICMP echo request packet and sends it to the protocol stack through the socket.
  2. Since the ping command specifies veth0 as the outgoing device, and the local ARP cache already has the relevant entry, no ARP request needs to be sent. The protocol stack hands the packet directly to veth0.
  3. Since the other end of veth0 is connected to veth1, the ICMP echo request packet is forwarded to veth1.
  4. After veth1 receives the ICMP echo request packet, it passes it to the protocol stack at its other end.
  5. The protocol stack checks its own device list, finds that 192.168.2.1 is a local IP address, and constructs an ICMP echo reply packet to send back.
  6. The protocol stack checks its routing table and finds that a data packet returning to 192.168.2.11 should go through the lo device, so it hands the reply packet to lo.
  7. When lo receives the reply packet from the protocol stack, it does nothing to it and simply hands it straight back to the protocol stack. (In other words, the protocol stack passes the packet to lo through its sending path, and lo hands the packet back through the receiving path.)
  8. After receiving the reply packet, the protocol stack finds that a socket is waiting for it and hands it to that socket.
  9. That socket is the one created by the ping process, so the ping process receives the reply packet.
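Step 6 is the surprising one: the reply leaves through lo rather than veth1 because addresses owned by the host are looked up before the ordinary routes. A simplified sketch of that ordering follows. LOCAL_ADDRESSES and MAIN_ROUTES are hypothetical stand-ins for the kernel's local and main routing tables, and eth0 is omitted for brevity.

```python
import ipaddress

# Addresses configured on this host in the example; the kernel checks these first.
LOCAL_ADDRESSES = {"127.0.0.1", "192.168.2.11", "192.168.2.1"}

# Simplified stand-in for the main routing table.
MAIN_ROUTES = [(ipaddress.ip_network("192.168.2.0/24"), "veth0")]


def output_device(dst):
    """Pick the outgoing device: local addresses go to lo, the rest follow MAIN_ROUTES."""
    if dst in LOCAL_ADDRESSES:
        return "lo"
    addr = ipaddress.ip_address(dst)
    for net, dev in MAIN_ROUTES:
        if addr in net:
            return dev
    return None  # no route
```

In this model a reply addressed to 192.168.2.11 is sent to lo, while a packet for a non-local address in the same subnet, such as 192.168.2.2, would go out through veth0.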

Capturing on the lo device confirms that the response packets do come back through lo:

dev@debian:~$ sudo tcpdump -n -i lo
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
20:25:49.590273 IP 192.168.2.1 > 192.168.2.11: ICMP echo reply, id 24177, seq 1, length 64
20:25:50.590018 IP 192.168.2.1 > 192.168.2.11: ICMP echo reply, id 24177, seq 2, length 64
20:25:51.590027 IP 192.168.2.1 > 192.168.2.11: ICMP echo reply, id 24177, seq 3, length 64
20:25:52.590030 IP 192.168.2.1 > 192.168.2.11: ICMP echo reply, id 24177, seq 4, length 64

Try to ping other IPs

Pinging other IPs in the 192.168.2.0/24 network segment failed, and pinging a public IP also failed:

dev@debian:~$ ping -c 1 -I veth0 192.168.2.2
PING 192.168.2.2 (192.168.2.2) from 192.168.2.11 veth0: 56(84) bytes of data.
From 192.168.2.11 icmp_seq=1 Destination Host Unreachable

--- 192.168.2.2 ping statistics ---
1 packets transmitted, 0 received, + 1 errors, 100% packet loss, time 0ms

dev@debian:~$ ping -c 1 -I veth0 baidu.com
PING baidu.com (111.13.101.208) from 192.168.2.11 veth0: 56(84) bytes of data.
From 192.168.2.11 icmp_seq=1 Destination Host Unreachable

--- baidu.com ping statistics ---
1 packets transmitted, 0 received, + 1 errors, 100% packet loss, time 0ms

Judging from the packet capture, this is the same as the first case above, where veth1 had no IP configured: nothing answers the ARP requests.

dev@debian:~$ sudo tcpdump -i veth1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth1, link-type EN10MB (Ethernet), capture size 262144 bytes
02:25:23.223947 ARP, Request who-has 192.168.2.2 tell 192.168.2.11, length 28
02:25:24.224352 ARP, Request who-has 192.168.2.2 tell 192.168.2.11, length 28
02:25:25.223471 ARP, Request who-has 192.168.2.2 tell 192.168.2.11, length 28
02:25:27.946539 ARP, Request who-has 123.125.114.144 tell 192.168.2.11, length 28
02:25:28.946633 ARP, Request who-has 123.125.114.144 tell 192.168.2.11, length 28
02:25:29.948055 ARP, Request who-has 123.125.114.144 tell 192.168.2.11, length 28

Conclusion

As the examples above show, data packets sent out of veth0 are forwarded to veth1. If the destination address is the IP of veth1, the protocol stack can process them. Otherwise they do not even get past the ARP stage, and IP forwarding is of no help. So without the help of other virtual devices, such data packets can only circulate within the local protocol stack and cannot reach eth0, which means they cannot be sent to the outside network.

The next article will introduce the network bridge under Linux, and then the veth device will be useful.
