Linux Open Source Storage Talk (4): SPDK, the Storage Performance Software Acceleration Library

In the more than 30 years since Linux was first released in 1991, storage media, interfaces, and protocols have gone through several generations, and storage capacity and transfer rates have grown at close to a Moore's Law pace. The pursuit of higher performance, however, is endless. There are two ways to keep meeting that demand: faster hardware, and lower software processing overhead. Faster drives depend on the continued efforts of hardware vendors. Software processing here refers to the storage software stack, that is, the Linux Storage Stack, shown in the figure below.

e1c9ce9cb6ff46c9a818c00a7dc03b2e.png

The Linux Storage Stack mainly consists of the VFS, the block layer, and the device drivers. Because it is designed for generality, it serves different hardware devices through a generic block device driver. That driver cannot necessarily exploit the full performance of a high-performance drive, so the software stack itself can become the bottleneck of high-performance storage.

Improving software processing performance helps meet the high-performance requirements of upper-layer applications as a whole. SPDK is a storage performance software acceleration library developed by Intel to address the problem of high-performance storage access.

SPDK

SPDK is a software acceleration library initiated by Intel to accelerate applications that use NVMe SSDs as back-end storage. SPDK is not a general-purpose drop-in solution. It implements a complete I/O stack on top of user-mode software drivers, so kernel file systems such as ext4 and Btrfs cannot be used with it, and it does not support the POSIX interface. Instead, it provides BlobFS/Blobstore and uses asynchronous reads and writes similar to AIO.

The core of SPDK is a user-mode, asynchronous, polled, lock-free NVMe driver that also provides zero copy and high concurrency. The user-mode driver makes the most of the performance of NVMe SSDs: it greatly reduces NVMe command latency and raises the IOPS a single CPU core can deliver, which makes for cost-effective solutions. One example is the SPDK vhost target, which passes an NVMe SSD through to a QEMU virtual machine with only a small performance loss. A solution like this is naturally attractive for high-performance computing.

Application scenarios of SPDK

Scenarios where SPDK fits well include:

1. Back-end storage applications that provide a block device interface, such as an iSCSI target or an NVMe-oF target.

2. Virtualized I/O acceleration. This mainly refers to Linux systems where QEMU/KVM acts as the hypervisor. SPDK uses the vhost protocol to implement efficient user-mode vhost targets over a shared-memory channel, such as vhost SCSI/blk/NVMe targets. These accelerate the virtio SCSI/blk and kernel-native NVMe I/O drivers inside the virtual machine and shorten the I/O stack in the host OS.

3. Database engine acceleration. By implementing RocksDB's abstract file class, SPDK's BlobFS/Blobstore can be integrated with RocksDB to accelerate the RocksDB engine on NVMe SSDs. In essence, this bypasses the kernel file system.

SPDK NVMe driver

Linux kernel NVMe driver

The kernel-mode NVMe driver is designed for generality: it implements a general-purpose block device driver and is deeply integrated with other kernel modules. This requires synchronization mechanisms such as semaphores, locks, and critical sections to keep operations mutually exclusive. The design offers good compatibility and maintainability.

How does the kernel driver interact with user-mode applications? Once the kernel has loaded the driver module successfully, the device is exposed as a block device or a character device, and the driver defines its access interfaces, including management interfaces and data interfaces. These interfaces are tied, directly or indirectly, to the file system subsystem and exposed to user-mode programs, which issue control or read/write operations through system calls.

Interaction between user-mode applications and kernel drivers always involves context switches between user mode and kernel mode, plus the overhead of the system calls themselves. See the Linux Storage Stack figure above.

SPDK NVMe driver

Unlike the kernel-mode NVMe driver, the SPDK NVMe driver is designed from a performance perspective to reduce software overhead, including context switches and system calls. SPDK relies on mechanisms such as UIO (Userspace I/O) and VFIO (Virtual Function I/O) to bypass the kernel I/O stack and to access and control high-speed NVMe SSDs directly from user space.

UIO solves two core problems. The first is how to access device memory: Linux maps the physical device's memory into user space. The second is how to handle interrupts: an interrupt has to be handled in the kernel, so UIO uses a small kernel module that implements only the most basic interrupt service routine. The user-mode driver and the kernel module interact through the /dev/uioX device, and sysfs exposes the device's memory maps, kernel driver, and other related information. The UIO architecture is shown in the figure below.

d58808a7425e43b4828ae3e35191955f.png
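The two UIO mechanisms described above can be sketched from the user-space side. This is only a rough illustration in Python, not how SPDK itself does it (SPDK is written in C). It assumes the standard UIO layout of /sys/class/uio/uioX/maps/mapN/{addr,size}, the convention that region N is selected by an mmap offset of N times the page size, and that a 4-byte read on /dev/uioX blocks until the next interrupt. The function names are made up for this example.

```python
import mmap
import os
import struct


def _read_attr(path):
    """Read one sysfs attribute file and strip the trailing newline."""
    with open(path) as f:
        return f.read().strip()


def read_uio_maps(uio_sysfs_dir):
    """List (map name, physical address, size) for a UIO device.

    uio_sysfs_dir is a directory such as /sys/class/uio/uio0.
    """
    maps_dir = os.path.join(uio_sysfs_dir, "maps")
    regions = []
    for entry in sorted(os.listdir(maps_dir)):
        if not entry.startswith("map"):
            continue
        base = os.path.join(maps_dir, entry)
        # sysfs prints these values as hex strings such as 0xfe000000;
        # base 0 lets int() honor the 0x prefix.
        addr = int(_read_attr(os.path.join(base, "addr")), 0)
        size = int(_read_attr(os.path.join(base, "size")), 0)
        regions.append((entry, addr, size))
    return regions


def map_uio_region(dev_path, index, size):
    """Map memory region `index` of a UIO device into this process."""
    fd = os.open(dev_path, os.O_RDWR | os.O_SYNC)
    try:
        # UIO selects the region through the offset: index * page size.
        return mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE,
                         offset=index * mmap.PAGESIZE)
    finally:
        os.close(fd)  # the mapping keeps its own duplicate of the descriptor


def wait_for_interrupt(fd):
    """Block until the device raises an interrupt; return the interrupt count."""
    return struct.unpack("I", os.read(fd, 4))[0]
```

On a machine with a bound UIO device, read_uio_maps("/sys/class/uio/uio0") would list the regions that map_uio_region can then map. Root privileges and real hardware are needed for the last two functions.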

Compared with UIO, VFIO provides the same two basic functions and adds a security dimension. Through the IOMMU (analogous to a conventional MMU; it requires hardware support such as Intel VT-d), it supports DMA remapping and related techniques, and it exposes device I/O, interrupts, and DMA to user space so that a complete device driver can be implemented there.

Building on the UIO and VFIO kernel modules, SPDK also replaces interrupt handling with asynchronous polling, binds threads to CPU cores to avoid cross-core cache synchronization and keep resources lock-free, and uses huge pages (hugetlb) to reduce the performance degradation caused by page faults.

SPDK user mode application framework

To make better use of the performance of the underlying NVMe SSDs, SPDK provides a programming framework in addition to its user-mode drivers.

e800e0ba73a344d9a68b3dd52db75459.png

In general, the SPDK application framework can be divided into four parts: 1. CPU core and thread management; 2. efficient communication between threads; 3. the I/O processing model; 4. the lock-free mechanism of the data path.

SPDK's principle is to complete the most work with the fewest CPU cores and threads. It uses DPDK's EAL library to bind threads to CPU cores through CPU affinity, which limits CPU usage, and it runs one reactor thread on each core. SPDK also provides a poller mechanism. A poller is essentially a wrapper around a user-defined function. It is registered through spdk_poller_register(), and the reactor's while loop checks each poller's state and calls it.
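The reactor and poller relationship can be modeled in a few lines. The sketch below is a toy model in Python, not SPDK code: the Reactor and Poller classes and the poller_register() method are invented for illustration, and the period is counted in loop iterations, whereas the real spdk_poller_register() takes a period in microseconds.

```python
class Poller:
    """A user-defined function plus the bookkeeping the reactor needs."""

    def __init__(self, fn, period_ticks):
        self.fn = fn
        self.period_ticks = period_ticks  # 0 means "run on every iteration"
        self.next_run = 0


class Reactor:
    """Toy model of one reactor: a loop that keeps calling its pollers."""

    def __init__(self):
        self.pollers = []
        self.tick = 0

    def poller_register(self, fn, period_ticks=0):
        poller = Poller(fn, period_ticks)
        self.pollers.append(poller)
        return poller

    def poller_unregister(self, poller):
        self.pollers.remove(poller)

    def run(self, max_ticks):
        # A real reactor spins until the application stops; the toy version
        # is bounded so that it terminates.
        while self.tick < max_ticks:
            for poller in list(self.pollers):
                if self.tick >= poller.next_run:
                    poller.fn()
                    poller.next_run = self.tick + poller.period_ticks
            self.tick += 1
```

A typical poller would check a completion queue and return immediately if nothing has arrived, which is what lets the reactor replace interrupts with polling. Nothing in the loop blocks or sleeps.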

For communication between threads, SPDK abandons the traditional, inefficient locking approach. Each thread operates only on the resources it manages. To reach another thread's resources, SPDK provides an event mechanism: the data structure behind each reactor maintains a ring of events. The ring follows a multi-producer, single-consumer model. The ring still needs its own internal synchronization, but that is much cheaper than synchronizing threads with conventional locks.
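The event model can be illustrated with a small thread-based sketch. It is a toy model in Python under stated assumptions, not SPDK's implementation: a collections.deque stands in for the event ring (its append and popleft are thread-safe in CPython), and the Reactor class, send_event(), and run_demo() are names made up for this example. The point to notice is that the counter is only ever modified by the reactor thread that owns it, so the user code needs no lock.

```python
import threading
from collections import deque


class Reactor:
    """Toy reactor that owns a counter and an event ring."""

    def __init__(self):
        self.events = deque()  # stand-in for the per-reactor event ring
        self.counter = 0       # state touched only by this reactor's thread
        self.running = True

    def send_event(self, fn, arg):
        """Called from any thread (multiple producers)."""
        self.events.append((fn, arg))

    def run(self):
        """Called from exactly one thread (single consumer)."""
        while self.running:
            try:
                fn, arg = self.events.popleft()
            except IndexError:
                continue  # ring is empty: keep polling
            fn(self, arg)


def add(reactor, amount):
    reactor.counter += amount


def stop(reactor, _arg):
    reactor.running = False


def run_demo(num_producers=4, events_each=1000):
    reactor = Reactor()
    consumer = threading.Thread(target=reactor.run)
    consumer.start()

    def produce():
        for _ in range(events_each):
            reactor.send_event(add, 1)

    producers = [threading.Thread(target=produce) for _ in range(num_producers)]
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    reactor.send_event(stop, None)  # queued behind every add event
    consumer.join()
    return reactor.counter
```

Because the stop event is enqueued only after all producers have finished, every add event is processed before the reactor exits, and run_demo() returns num_producers * events_each.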

SPDK user mode block device layer

44b62ae6b5334e7a9a1f9e2631d327ad.png

The lowest layer of SPDK is the driver layer, which manages physical and virtual devices, whether they are local or attached over the network.

The middle layer is the generic block layer. It supports different backends, presents a unified interface to the layer above, and provides storage services such as logical volumes and flow control. This layer also supports Blobs (Binary Large Objects) and BlobFS, a simple user-mode file system.

The top layer is the protocol layer, which includes NVMe, SCSI, and other protocols and makes it easier to integrate with upper-layer applications.
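The idea of a unified block interface over different backends, with services such as logical volumes stacked on top, can be sketched as follows. This is a conceptual model in Python only. The class names BlockDevice, MemoryBackend, and LogicalVolume are invented for this example and do not correspond to SPDK's C API.

```python
from abc import ABC, abstractmethod


class BlockDevice(ABC):
    """Unified interface that every backend and every stacked service offers."""

    block_size = 512

    @property
    @abstractmethod
    def num_blocks(self):
        ...

    @abstractmethod
    def read(self, lba, count):
        ...

    @abstractmethod
    def write(self, lba, data):
        ...

    def _check_range(self, lba, count):
        if lba < 0 or count < 0 or lba + count > self.num_blocks:
            raise ValueError("I/O outside the device")


class MemoryBackend(BlockDevice):
    """Driver-layer stand-in: a RAM-backed device."""

    def __init__(self, num_blocks):
        self._data = bytearray(num_blocks * self.block_size)

    @property
    def num_blocks(self):
        return len(self._data) // self.block_size

    def read(self, lba, count):
        self._check_range(lba, count)
        start = lba * self.block_size
        return bytes(self._data[start:start + count * self.block_size])

    def write(self, lba, data):
        if len(data) % self.block_size:
            raise ValueError("data must be a whole number of blocks")
        self._check_range(lba, len(data) // self.block_size)
        start = lba * self.block_size
        self._data[start:start + len(data)] = data


class LogicalVolume(BlockDevice):
    """Block-layer service: a window onto part of another block device."""

    def __init__(self, base, first_lba, num_blocks):
        base._check_range(first_lba, num_blocks)
        self._base = base
        self._first_lba = first_lba
        self._num_blocks = num_blocks

    @property
    def num_blocks(self):
        return self._num_blocks

    def read(self, lba, count):
        self._check_range(lba, count)
        return self._base.read(self._first_lba + lba, count)

    def write(self, lba, data):
        self._check_range(lba, len(data) // self.block_size)
        self._base.write(self._first_lba + lba, data)
```

A protocol-layer target would only ever see the BlockDevice interface, so it works the same way whether it is handed a MemoryBackend or a LogicalVolume carved out of one.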

The optimization ideas used in the SPDK application framework are applied in the generic block layer as well. Memory allocation, I/O resource pools, buffer pools, and so on take into account not only the total amount allocated globally but also the resources reserved exclusively for each CPU core.
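One common way to combine a global allocation with per-core exclusive resources is a shared pool fronted by a small cache per core. The sketch below models that idea in Python. It is an assumption about the general technique, not a description of SPDK's actual pool code, and BufferPool and its fields are names invented for this example.

```python
class BufferPool:
    """Global pool of buffer ids with a small private cache per core."""

    def __init__(self, total, num_cores, cache_size):
        self.global_free = list(range(total))
        self.cache = [[] for _ in range(num_cores)]
        self.cache_size = cache_size
        self.global_accesses = 0  # how often the shared pool was touched

    def get(self, core):
        local = self.cache[core]
        if not local:
            # Local cache is empty: refill it from the global pool in a batch.
            # In a multi-threaded system this is the only step that would
            # need synchronization.
            self.global_accesses += 1
            take = min(self.cache_size, len(self.global_free))
            for _ in range(take):
                local.append(self.global_free.pop())
        return local.pop() if local else None

    def put(self, core, buf):
        local = self.cache[core]
        if len(local) < self.cache_size:
            local.append(buf)  # fast path: stays on this core
        else:
            self.global_accesses += 1
            self.global_free.append(buf)
```

With a cache size of 4, a core that repeatedly takes and returns a handful of buffers touches the global pool once and then works entirely out of its own cache, which is what keeps the data path free of contention.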

Build SPDK environment

Download SPDK from https://github.com/spdk/spdk.git, then compile and install it:

# download the SPDK source code
root@nvme:~# cd /data/github/
root@nvme:/data/github# git clone https://github.com/spdk/spdk.git
root@nvme:/data/github# cd spdk
root@nvme:/data/github/spdk# git submodule update --init

# install dependencies
root@nvme:/data/github/spdk# ./scripts/pkgdep.sh

# configure with debug
root@nvme:/data/github/spdk# ./configure --enable-debug

# build and install
root@nvme:/data/github/spdk# make && make install

Reprints are welcome; please credit the source.