1. Network virtualization – QEMU virtual network card

Foreword

Network virtualization was once a technology that only developers of kernel virtualization features paid attention to. However, with the spread of cloud computing and cloud-native concepts, cloud services are now deployed mainly as virtual machines and containers, both of which rely on network virtualization to provide high-performance networking. Virtual networks have therefore become the mainstream network form in cloud environments, and virtual machines and containers on the cloud place ever higher demands on the usability, functionality, and performance of network virtualization technology.

This article introduces the most classic network virtualization technology: the QEMU virtual network card.

What is KVM

KVM (Kernel-based Virtual Machine) is a virtualization technology built into the Linux kernel. It allows multiple virtual machines to run on one physical host, giving each an independent operating system and its own resources. KVM is a hardware-assisted virtualization technology: it uses the processor's virtualization extensions to deliver high performance.

Here are some key features and working principles of KVM:

Hardware virtualization: KVM relies on processor virtualization features such as Intel's VT-x and AMD's AMD-V. These let guest code run directly on the physical CPU under hardware control, giving virtualization near-native performance.

Linux kernel module: KVM is a Linux kernel module that adds virtualization functionality to the kernel. KVM virtual machines therefore run on Linux hosts and are tightly integrated with the kernel, which benefits both performance and manageability.

Virtual machine management: KVM works with a variety of management tools, most commonly libvirt, which provides a unified interface for creating, configuring, and monitoring KVM virtual machines. Administrators can use these tools to manage many virtual machine instances with ease.

Multi-operating system support: KVM supports a variety of different operating systems, including Linux, Windows, BSD, etc. This makes it a versatile virtualization solution that can meet the needs of various application scenarios.

Performance: As a hardware-assisted virtualization technology, KVM generally provides near-native performance, making it suitable for demanding applications and workloads.

Questions

1.1 When using a QEMU virtual network card, how does the guest system use its network card and network? Is the guest aware that the device is virtualized, does it need a different configuration than on a physical machine, and does the kernel need to be modified?

1.2 How does the guest system detect and load the network card? How does QEMU implement support for this?

1.3 How does the guest system send and receive packets through the network card? How does QEMU implement support for this?

1.4 How do the packets sent and received by the guest flow between QEMU and the host system? Through what links do guest and host, and guest and the external network, interact?

1.5 What are the advantages and disadvantages of using a QEMU virtual network card to provide virtual machine networking?

QEMU

QEMU is a widely used virtual machine monitor (VMM) on the Linux platform. It can emulate a variety of hardware architectures and devices and can run many different operating systems. In particular, QEMU supports full virtualization of specific network card devices (such as the e1000), so the guest operating system and kernel can use the stock network card driver, without any modification, to perform network operations.

QEMU-KVM architecture diagram (from wiki.qemu.org)
For the principles behind QEMU's IO device emulation, see QEMU devices and High-level introduction to virtualization's low-level. In short, through binary translation or through KVM and CPU hardware virtualization, QEMU interrupts the guest's execution flow whenever the guest operates on a device, that is, whenever it executes an instruction that accesses the device's address space or registers. QEMU then emulates the effect of that operation on the device and resumes guest execution.

QEMU virtual network card

As a network IO device, the network card is emulated in QEMU the same way as other IO devices. Thanks to this emulation, using a virtual network card in the guest is exactly the same as on a physical machine (performance aside), with no changes needed to the kernel driver or configuration.

The guest detects and loads the network card just as it does other emulated devices: discovery of the card comes from QEMU's emulation of the PCI/PCIe bus, while driver loading and packet transmission/reception come from emulation of the network card device itself.

QEMU emulates a variety of classic network cards, such as the e1000, rtl8139, and i82559c. The logic of these devices and their drivers is similar, so we will take the e1000's packet transmission and reception as an example and walk through the steps that implement rx/tx operations in the guest. Since mainstream x86 CPUs already support hardware virtualization (Intel VT-x, AMD AMD-V), only the QEMU-KVM scenario is discussed here.

TX:

1. The guest kernel prepares the TX packet contents in an skb.
2. The guest kernel writes the TX packet information (such as the DMA address and length of the data) into the descriptor ring.
3. The guest kernel writes the network card's TX queue register to notify the card to process TX packets.
4. The register access triggers a VM exit, and KVM handles the VM exit event.
5. KVM sees that it is an IO operation and returns to the user-mode QEMU process (QEMU's ioctl on the vcpu returns).
6. QEMU handles the IO operation and forwards the packet according to the descriptor contents.
7. QEMU updates the virtual network card's state and the DMA memory.
8. QEMU resumes guest execution (VM entry) through the KVM interface.

RX:

1. QEMU receives an RX packet from the backend.
2. QEMU writes the packet contents into the DMA address space specified by the guest.
3. QEMU writes the RX packet information into the descriptor ring.
4. QEMU updates the (virtual) RX queue register.
5. QEMU injects an interrupt into the guest.
6. The guest handles the interrupt and updates the descriptor ring and registers; these register accesses trap into QEMU again.
The above is only a simplified flow. The guest's packet-processing logic is the same as on a physical machine; see any description of network packet processing under Linux. For the principles of interrupt injection, refer to High-level introduction to virtualization's low-level.

Through the above flows, packets sent by the guest can be forwarded by QEMU to the host kernel or the network, and packets destined for the guest from the host or the network can be forwarded by QEMU into the guest's virtual network card.

QEMU Virtual Network

We have seen how QEMU emulates the guest's network card, but one question remains: how do packets actually flow between QEMU and the host kernel or network card? After all, the guest wants to communicate with peers other than QEMU itself. This part of the functionality is called the network backend.

QEMU supports two main network backend implementations: SLIRP (user-mode networking) and TAP.

SLIRP is a backend implemented entirely in user mode. In this mode, QEMU parses the packets the guest sends and receives and forwards them to other guests or to the host; SLIRP also supports NAT address translation. This requires SLIRP to implement its own TCP/IP protocol stack. The author has not read the SLIRP implementation, nor found an article detailing its internals, but according to available information, SLIRP does not talk to the host or the external network by forwarding raw frames through raw sockets. Instead, it translates the guest's packet-level behavior into ordinary socket operations: for example, when the guest sends a SYN packet, SLIRP issues a connect syscall toward the target address. This sounds odd, but SLIRP descends from very old protocol-emulation software, so it is understandable. The approach is necessarily limited, though: if the guest sends nonstandard packets, or packets that SLIRP's protocol stack cannot understand or process, the network simply will not work correctly.

The TAP mode is easier to understand. In this mode, QEMU creates a tap device for each guest and sends and receives packets directly on that tap device. Tap devices can in turn be attached to a bridge to implement more complex forwarding and routing.

In fact, once guest packets reach QEMU's user-mode logic, the backend can be implemented in many ways, for example with OVS or OVS-DPDK. This part has little to do with virtualization itself and more to do with packet processing and forwarding, so it will not be discussed further here.

Summary
This article analyzed QEMU's classic network virtualization implementation, emulation of physical network cards, together with QEMU's classic backend implementations, SLIRP and TAP. With this analysis you should be able to answer the five questions raised at the beginning of the article:

  1. When using a QEMU virtual network card, how does the guest system use its network card and network? Is the guest aware that the device is virtualized, does it need a different configuration than on a physical machine, and does the kernel need to be modified?

     The guest is not aware of the virtualized device at all: it uses the same network card drivers and configuration as on a physical machine, and no kernel modification is required.
    
  2. How does the guest system detect and load the network card? How does QEMU implement support for this?

     QEMU intercepts and emulates the guest's accesses to the PCI bus and the network card device, so the guest observes the same results as it would on a physical machine.
    
  3. How does the guest system send and receive packets through the network card? How does QEMU implement support for this?

     The guest's network card operations are likewise intercepted and emulated. For inbound traffic, QEMU writes packets into guest memory and injects an interrupt so the guest processes them as if they were delivered by real hardware.
    
  4. How do the packets sent and received by the guest flow between QEMU and the host system? Through what links do guest and host, and guest and the external network, interact?

     Packets are forwarded to the host or the external network through the SLIRP or TAP backend.
    
  5. What are the advantages and disadvantages of using a QEMU virtual network card to provide virtual machine networking?

     The advantage of the QEMU virtual network card approach is that it is transparent to the guest: the guest needs no modification and can drive the virtual card with its stock physical network card driver to complete network communication.
    

This implementation has two main disadvantages. First, the functions of real physical network cards must be reproduced in software. Because hardware network cards are complex and lack a common standard, QEMU's emulation code is correspondingly complex and varied, and the software-emulated behavior may diverge from the real hardware, causing the guest's network behavior to differ from the host's.

Another, more obvious drawback is performance. As analyzed above, every read or write of a network card register or IO address forces a switch from the guest into the QEMU process, and a single network operation may generate multiple such IO accesses. The network performance of a virtual machine using a QEMU virtual network card is therefore limited by the cost of context and guest/host switches, and is bound to fall far below physical machine levels.

Original link: https://blog.csdn.net/dillanzhou/article/details/120169734