KVM Virtualization Solution Series: KVM Architecture

Virtualization is the foundation of cloud computing. Before virtualization, a physical host could run only one operating system and one set of core business programs. With virtualization, one physical host can run multiple virtual machines, each with its own operating system and its own core business programs. The virtual machines share the CPU, memory, and I/O hardware resources of the physical host, but are logically isolated from each other.

1. Type 1 and Type 2 virtualization

A physical host virtualizes its hardware resources through a software layer called a virtual machine monitor (hypervisor). Depending on whether the hypervisor is installed directly on the hardware or on top of a host operating system, virtualization is divided into Type 1 and Type 2, as shown in Figure 1.


Figure 1. Type 1 virtualization and Type 2 virtualization

In Type 1 virtualization, no conventional operating system such as Linux or Windows is installed on the host machine. The hypervisor is installed directly on the hardware, the virtual machines run on the hypervisor, and the hypervisor directly controls both the hardware resources and the virtual machines. Typical Type 1 hypervisors are Xen and VMware ESXi.

In Type 2 virtualization, a conventional operating system such as Linux or Windows is installed on the host, and the hypervisor is installed on top of that operating system, which remains responsible for hardware management. Typical Type 2 hypervisors are KVM, VirtualBox, and VMware Workstation.

2. Introduction to KVM virtualization

KVM is a virtual machine monitor built on hardware-assisted virtualization, so this article assumes you already have a general understanding of hardware virtualization of the CPU, memory, and I/O. If you do not, plenty of material on these topics is available online, and it is not repeated here.

KVM stands for Kernel-based Virtual Machine. It is an open-source full virtualization solution built on hardware virtualization technology, and it is the mainstream Linux virtualization solution in the industry.

2.1. KVM function

KVM is a Linux kernel module, kvm.ko. Loading the kvm.ko module turns the Linux kernel into a virtual machine monitor (hypervisor), while the Linux kernel continues to manage the hardware, so KVM is a typical Type 2 virtualization. Its functional framework is shown in Figure 2. Note that Figure 2 is taken from Wikipedia; since Wikipedia is inconvenient to access from China, the link is not posted here.


Figure 2. KVM functional framework – original image

Figure 3 shows the same diagram with the English labels of Figure 2 translated into Chinese, for readers who are less comfortable with English.


Figure 3. KVM functional framework – Chinese

Before introducing the KVM functions, let's first clarify the terms "host machine", "guest machine", "physical machine", and "virtual machine"; an approximate understanding is enough. The physical machine on which the virtual machine monitor (hypervisor) runs is called the host machine. A virtual machine created on the hypervisor is called a guest machine. Roughly speaking, the host machine corresponds to a physical machine, and the guest machine corresponds to a virtual machine.

Now to the KVM functional framework. A guest is an abstraction on top of the host operating system, and that abstraction is a process: a KVM guest corresponds to a Linux process, each vCPU is a thread of that process, and I/O is handled by separate threads in the same thread group. Each guest on the host is therefore scheduled by the host kernel like an ordinary process, which means the standard Linux process scheduling mechanisms can be used to impose resource limits, priorities, and other controls on individual guests.
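This process/thread model is easy to observe from /proc. The sketch below is a minimal illustration that uses the current Python process in place of a real QEMU process (for a real guest you would list /proc/&lt;qemu-pid&gt;/task instead): it spawns two worker threads standing in for vCPU threads, then counts the thread entries the kernel exposes.

```python
import os
import threading
import time

def vcpu_worker(stop):
    # Stand-in for a vCPU thread; a real QEMU vCPU thread loops on ioctl(KVM_RUN).
    while not stop.is_set():
        time.sleep(0.01)

stop = threading.Event()
workers = [threading.Thread(target=vcpu_worker, args=(stop,)) for _ in range(2)]
for w in workers:
    w.start()

# Each thread of this process appears as a directory under /proc/self/task,
# just as each vCPU thread of a QEMU process appears under /proc/<pid>/task.
tids = os.listdir("/proc/self/task")
print(f"process {os.getpid()} has {len(tids)} threads")  # main thread + 2 workers, at least

stop.set()
for w in workers:
    w.join()
```

Because the threads live in one thread group, the host scheduler treats them exactly like the threads of any other process.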

The hardware devices seen by the guest are emulated by QEMU (VT-d pass-through devices excepted). When the guest operates on an emulated device, QEMU intercepts the operation and converts it into a driver operation on the actual physical device.

2.1.1. QEMU process

Under the KVM virtualization architecture, each guest is a QEMU process; a host runs as many QEMU processes as it has virtual machines. Each vCPU in a guest corresponds to an execution thread in its QEMU process. There is only one KVM kernel module per host, and all guests interact with this one module.

2.1.2. Live migration

KVM supports live migration, the ability to move a running guest between hosts without interrupting service. Live migration is transparent to the user: the guest stays powered on, network connections remain active, and user applications keep running while the guest moves to a new host.
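The usual mechanism behind this is iterative "pre-copy": all memory pages are copied while the guest keeps running, then the pages dirtied in the meantime are re-copied, round after round, until the remainder is small enough to transfer during a brief final pause. The following is a toy simulation only; the page count, the dirtying rate, and the stop threshold are invented for illustration.

```python
def precopy_migrate(num_pages=1024, stop_threshold=16):
    """Toy pre-copy loop: returns (rounds, pages copied while running, pages in final pause)."""
    dirty = num_pages                 # round 1: every page is "dirty" and must be sent
    rounds = copied_running = 0
    while dirty > stop_threshold:     # keep the guest running while the bulk is copied
        rounds += 1
        copied_running += dirty       # send the current dirty pages to the target host
        dirty //= 4                   # assume each (shorter) round re-dirties fewer pages
    # only now is the guest paused, for the final small dirty set ("stop-and-copy")
    return rounds, copied_running, dirty

rounds, copied, pause_pages = precopy_migrate()
print(f"{rounds} pre-copy rounds, {copied} pages sent live, {pause_pages} pages in final pause")
# → 3 pre-copy rounds, 1344 pages sent live, 16 pages in final pause
```

The key property is that almost all pages are transferred while the guest is still running, so the observable service interruption is only the final, small stop-and-copy phase.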

In addition to live migration, KVM supports persisting a guest’s current state (snapshot) to disk to allow storing and restoring it later.

2.1.3. Device Drivers

KVM supports hybrid virtualization, in which paravirtualized drivers are installed in the guest operating system so that the virtual machine uses optimized I/O interfaces instead of emulated devices, providing high-performance I/O for network and block devices.

The paravirtualized drivers used by KVM follow the VirtIO standard, developed by IBM and Red Hat together with the Linux community. VirtIO is a hypervisor-independent interface for building device drivers, which lets multiple hypervisors use the same set of drivers and improves interoperability with guests.

KVM also supports Intel's VT-d technology: by passing devices on the host's PCI bus through to the guest, the guest can use those devices efficiently with its native drivers, with little hypervisor intervention.

2.1.4. Performance and scalability

KVM also inherits the performance and scalability of Linux. KVM performs well on CPU, memory, network, and disk virtualization, in most cases reaching more than 95% of native performance. Its scalability is also very good: it supports guests with up to 288 vCPUs and 4 TB of RAM, and there is no software-imposed upper limit on the number of guests that can run simultaneously on a host.

2.2. KVM Architecture

The basic KVM virtualization architecture is relatively simple, as shown in Figure 4. KVM is a Linux kernel module, kvm.ko; loading the kvm.ko module turns the Linux kernel into a virtual machine monitor (hypervisor). At the same time, KVM uses QEMU to provide emulated hardware for the virtual machines.


Figure 4. KVM architecture

2.2.1. KVM kernel module

The KVM kernel module is the core of KVM virtualization. It consists of two parts in the kernel: a processor-architecture-independent part, the "kvm" module, and a processor-architecture-dependent part. If your host is an Intel machine, the "kvm_intel" module is loaded; if your host is an AMD machine, the "kvm_amd" module is loaded, as shown below:

root@ubuntu:~# lsmod | grep kvm
kvm_intel 294912 0 # The host is an Intel architecture, displaying the kvm_intel module
kvm 823296 1 kvm_intel # kvm module

In addition to supporting the most common x86 and x86_64 platforms represented by Intel and AMD, KVM also supports non-x86 architecture platforms such as PowerPC, S/390, and ARM.

The KVM module is responsible for managing and scheduling the vCPUs and memory of the virtual machines. Its main job is to initialize the CPU hardware, switch on virtualization mode, run the virtual machines in that mode, and provide supporting services while they run.

KVM itself does not implement device emulation. It creates a special device file, /dev/kvm, and the user-space application (QEMU) creates vCPUs and allocates virtual memory address space by issuing ioctl calls on the /dev/kvm interface. In other words, creating and running a virtual machine is a cooperation between the user-space application (QEMU) and the KVM module. In outline, QEMU drives KVM through /dev/kvm as follows:

open("/dev/kvm")                    # open the /dev/kvm device
ioctl(KVM_CREATE_VM)                # create a virtual machine object
ioctl(KVM_CREATE_VCPU)              # create a vCPU object for the virtual machine
for (;;) {
    ioctl(KVM_RUN)                  # run the vCPU until a VM exit occurs
    switch (exit_reason) {          # handle the exit reason in user space
    case KVM_EXIT_IO:               # the guest performed I/O: emulate it
    case KVM_EXIT_HLT:              # the guest halted
    }
}

As for the virtual machine's interaction with peripherals: real physical hardware is managed by the Linux kernel, while virtual devices are handed to QEMU for processing.

2.2.2. QEMU user state device emulation

QEMU is a pure-software virtualization technology and an open-source virtual machine project; it is not itself part of KVM. Even without KVM, QEMU can create and manage virtual machines through emulation alone, but because everything is done in software, performance is relatively low.

The KVM module by itself cannot act as a complete hypervisor and emulate a full virtual machine, and users cannot operate the Linux kernel directly, so other software is needed. QEMU is that software. While a virtual machine runs, QEMU enters the kernel by calling ioctl on the /dev/kvm interface provided by the KVM module, and the KVM module places the virtual machine in a special processor mode. When the virtual machine performs I/O (peripheral interaction), the KVM module hands control back to QEMU, which parses the request and emulates the device.

QEMU, in turn, uses the virtualization capability of the KVM module to give its virtual machines hardware acceleration, greatly improving their performance. In addition, virtual machine configuration and creation, the virtual devices a machine depends on, the user-facing runtime environment and interaction, and special techniques such as live migration are all implemented by QEMU itself.

Therefore, creating and running a KVM virtual machine is a process in which the user-mode QEMU program and the kernel-mode KVM module cooperate. The KVM module, the core of the whole virtualization environment, works in kernel space and handles CPU and memory scheduling; QEMU works in user space as an emulator and handles virtual machine I/O emulation.

In short, QEMU is a fully functional Hypervisor that undertakes the work of device emulation in the QEMU/KVM software stack.

2.2.3. Storage and virtual disk file formats

Storage and virtual disk files are I/O peripherals, and I/O peripherals are mainly handled by QEMU, so storage and virtual disk files are QEMU features.

KVM supports all storage supported by the Linux operating system, including local IDE, SCSI, and SATA disks, network-attached storage (NFS and SAMBA/CIFS), and iSCSI and SAN.

In KVM, the term image is commonly used for a virtual disk, and there are three main file formats.

  1. raw: the plain format, which maps the file system's storage units directly to the virtual machine and reads and writes them directly. It is simple to implement but does not support features such as compression, snapshots, encryption, and CoW.
  2. qcow2: the image format introduced by QEMU and the default format for KVM. The basic unit of data storage in a qcow2 file is the cluster; each cluster consists of several data sectors of 512 bytes each. Locating a cluster inside the image file takes two address lookups. qcow2 allocates only as much space as is actually needed and supports more host file system formats.
  3. qed: an improvement on qcow2. Its storage, addressing, lookup method, and data block size are the same as qcow2's; its purpose is to overcome some shortcomings of the qcow2 format and improve performance, but it is not yet mature enough.

If you need virtual machine snapshots, compression, or encryption, choose the qcow2 format; for large-scale data storage, the raw format is a reasonable choice. A qcow2 image can only be grown, never shrunk, while a raw image can be grown or shrunk.
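The "two address lookups" needed to locate a cluster in a qcow2 image are the L1 and L2 table lookups. As a rough sketch, assuming the default 64 KiB cluster size and 8-byte table entries (so one cluster holds 8192 L2 entries), the indices for a given guest disk offset work out like this:

```python
CLUSTER_BITS = 16                      # default qcow2 cluster size: 64 KiB
CLUSTER_SIZE = 1 << CLUSTER_BITS
L2_ENTRIES = CLUSTER_SIZE // 8         # 8-byte entries -> 8192 entries per L2 table

def qcow2_indices(guest_offset):
    """Split a guest disk offset into (L1 index, L2 index, offset inside the cluster)."""
    offset_in_cluster = guest_offset & (CLUSTER_SIZE - 1)
    l2_index = (guest_offset >> CLUSTER_BITS) % L2_ENTRIES
    l1_index = guest_offset >> (CLUSTER_BITS + 13)   # 13 = log2(8192 L2 entries)
    return l1_index, l2_index, offset_in_cluster

# e.g. guest offset 1 GiB + 5 bytes:
print(qcow2_indices((1 << 30) + 5))    # → (2, 0, 5)
```

The L1 entry points at an L2 table, the L2 entry points at the cluster's position in the host file, and the remaining bits select the byte inside that cluster; this indirection is what lets qcow2 allocate clusters only on demand.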

3. KVM management tool

An operable and maintainable KVM virtualization solution must solve two problems: implementing the virtualization technology itself, and managing clusters of virtual machines. What we have covered so far solves only the first. We therefore need a set of KVM management tools for the management, operation, and maintenance of KVM virtualization.

The KVM module and the QEMU components only solve the implementation of virtualization. To make the whole KVM virtualization environment easy to manage, the Libvirt service and the KVM management tools built on Libvirt are also needed.

3.1. Libvirt Architecture

Libvirt is a collection of software: a set of open-source APIs, daemons, and management tools designed to make platform virtualization easy to manage. It manages not only virtual machines but also virtual networks and storage. Libvirt started as a set of APIs for the Xen virtualization platform and now supports other platforms such as KVM, ESX, and QEMU. In the KVM solution, QEMU performs platform emulation and faces the lower-level operations, while Libvirt manages KVM and faces upper-layer management and operations. The overall Libvirt architecture is shown in Figure 5.


Figure 5. libvirt architecture and KVM management tools

Libvirt is today the most widely used virtual machine management tool and API, and the de facto standard virtualization interface. The management tools in Figure 5, such as virsh and OpenStack, are implemented on top of the libvirt API. Libvirt can manage not only KVM but also other virtualization solutions such as Xen, VMware, VirtualBox, and Hyper-V.

Libvirt consists of two parts: a service (whose daemon is named libvirtd) and an API. Running as a daemon on the host, libvirtd provides local and remote management of virtualization platforms and virtual machines; management tools built on Libvirt manage the whole virtualization environment through the libvirtd service. In other words, libvirtd is the bridge between management tools and virtualization platforms. The libvirt API is a set of standard libraries that present a unified programming interface over multiple virtualization platforms, meaning that tools developed against the libvirt interface can support many virtualization platforms.

3.2. KVM virtualization management tool

So far, KVM has a complete set of open-source management tools, from the virsh command-line tool up to the OpenStack cloud management platform. These tools sit at different capability levels, as shown in Figure 6.


Figure 6. KVM virtualization management tool

Elementary level: the "KVM + virsh" solution describes each virtual machine with an ".xml" configuration file under /etc/libvirt/qemu, manages the machines with the virsh command line, and connects via VNC/SPICE on the configured port for console access. This is essentially a single-host management model; stand-alone KVM lacks advanced features such as data centers and clusters, so from a commercial perspective it cannot be considered a mature virtualization solution.
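A minimal domain ".xml" of the kind virsh consumes might look like the sketch below. This is a hedged illustration only: the guest name, disk path, and sizes are invented, and a real definition usually carries many more devices.

```xml
<domain type='kvm'>
  <name>demo-vm</name>                      <!-- illustrative guest name -->
  <memory unit='MiB'>2048</memory>
  <vcpu>2</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
  </os>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/demo-vm.qcow2'/>  <!-- illustrative path -->
      <target dev='vda' bus='virtio'/>
    </disk>
    <interface type='network'>
      <source network='default'/>
      <model type='virtio'/>
    </interface>
    <graphics type='vnc' port='-1' autoport='yes'/>
  </devices>
</domain>
```

Such a file would typically be registered with "virsh define demo-vm.xml" and started with "virsh start demo-vm".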

Intermediate level: the "KVM + virt-manager" solution creates, edits, and manages virtual machines directly through the desktop graphical tool virt-manager. Connect to the KVM host with a desktop VNC/SPICE client, run the "virt-manager" command, and the virtual machine manager window opens for creating, editing, and managing virtual machines. This is still a single-host management model; stand-alone KVM lacks advanced features such as data centers and clusters, so from a commercial perspective it cannot be considered a mature virtualization solution.

Advanced level: the "KVM + Web management tool" solution manages small and medium virtual machine clusters through lightweight Web GUI tools such as Proxmox VE, WebVirtMgr, Kimchi, and oVirt. These tools are easy to use and easy to understand, especially the turnkey-style Proxmox VE, which is popular with newcomers. This approach can manage small and medium virtual machine clusters, supports advanced features such as data centers and clusters, and offers a very friendly management interface; from a commercial perspective it already counts as a mature virtualization solution.

Expert level: the "KVM + cloud management platform" solution manages the compute, storage, and network resource pools of one or more data centers through cloud management platforms such as OpenStack and ZStack, and scales to large and very large KVM host deployments. Cloud management platforms are, however, complex to install and use: OpenStack in particular usually needs a DevOps team to run well. ZStack is simpler, but still more complex than the lightweight Web management tools.

Although KVM's management tools still lag the commercial virtualization tools of the established giant VMware in functionality and ease of use, the entire KVM tool stack is API-based and open source, which gives it clear advantages for customization and secondary development.

3.2.1. virsh

virsh is the most common command-line tool for managing KVM virtualization. For a system administrator operating and maintaining a single host, the virsh command line may be the best choice. virsh is written in C on top of the libvirt API, and its source code lives in the open-source Libvirt project.

3.2.2. virt-manager

virt-manager is graphical virtual machine management software whose interaction with the virtualization layer is likewise done through the libvirt API. Besides the basic lifecycle functions (create, start, stop, snapshot, live migration, and so on), virt-manager provides performance and resource usage monitoring, and it has built-in VNC and SPICE clients for graphical connections to guests. virt-manager is very popular on RHEL, CentOS, Fedora, and other operating systems. When the number of managed machines is small, virt-manager is a good choice, and its graphical ease of use has made it the first management tool many beginners learn.

3.2.3. Proxmox VE

Proxmox VE is very simple to use, turnkey in style, and very popular with beginners. All management operations go through the embedded Web GUI; no separate management tool or database-backed management server node needs to be installed. Its multi-master cluster architecture lets you manage the entire cluster from any node. The centralized Web interface, built on the ExtJS JavaScript framework, exposes every function through the GUI and also lets you browse each node's history and syslog, such as virtual machine backup and restore logs, live migration logs, and HA activity logs.

3.2.4. WebVirtMgr

WebVirtMgr is a KVM management platform built on libvirt that provides unified management of hosts and virtual machines. Unlike the virtual machine manager that ships with KVM, it makes KVM management more visual, and it suits small KVM deployments; for a cluster of roughly 10-200 virtual machines, WebVirtMgr is a good choice.

3.2.5. Kimchi

Kimchi is a KVM management tool based on HTML5. It is designed as a web tool to use KVM and create virtual machines as easily as possible. It manages KVM virtual machines through libvirt.

3.2.6. oVirt

oVirt is the open-source version of Red Hat's virtualization management platform RHEV. By using oVirt to manage KVM virtual machines and networks, an enterprise can quickly build a private cloud environment. oVirt is Web-based, with a very friendly management interface, and suits small and medium scale; for a cluster with up to a few thousand virtual machines, oVirt is a good choice.

3.2.7. OpenStack

OpenStack is an open-source Infrastructure-as-a-Service (IaaS) cloud computing platform that can be used to build public and private cloud infrastructure. OpenStack is currently the most widely used and most capable cloud management platform in the industry. Beyond rich virtual machine management, it provides many other important services: object storage, block storage, networking, images, authentication, orchestration, a dashboard, and more. OpenStack still manages the underlying virtualization through the libvirt API.

3.2.8. ZStack

ZStack is an open-source IaaS project founded in China in 2015, with a core system developed in Java. ZStack's main selling points are simple deployment and upgrades; scalability (it can manage tens of thousands of physical nodes and supports highly concurrent API access); speed (virtual machines start very quickly); NFV (network function virtualization) as the default network model; full API coverage (with a Web UI as well); and a plug-in architecture (adding or removing a feature does not affect core functionality).

This section has briefly introduced several common KVM management tools in the KVM software stack; later chapters cover them in more detail. For a longer list of KVM management tools, see the KVM official website:
Common KVM management tools: http://www.linux-kvm.org/page/Management_Tools

References:
"KVM in Practice: Principles, Advanced Usage and Performance Tuning", written by Ren Yongjie and Cheng Zhou;
"OpenStack Cloud Computing in Practice", edited by Zhong Xiaoping and Xu Ning;