Several Ways to Communicate Between Linux User Mode and Kernel Mode

Because the CPU enforces different privilege levels, communication between Linux user mode and kernel mode is not as simple as ordinary inter-process communication. This article looks at the methods available for communication between Linux user mode and kernel mode.

In everyday code, we usually reach kernel space from user space through system calls. This is the most commonly used way of communicating between user mode and kernel mode. (For background on Linux user mode and kernel mode, please refer to xx)
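
As a small illustration of the system call path, the sketch below compares the libc wrapper getpid() with the same request issued through the generic syscall() entry point. It assumes Linux with glibc; the helper name pid_matches_raw_syscall is hypothetical and exists only for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* makes syscall() visible under strict -std= modes */
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the libc wrapper and the raw system
   call report the same process ID, 0 otherwise. */
int pid_matches_raw_syscall(void)
{
    pid_t from_wrapper = getpid();        /* libc wrapper around the system call */
    long from_raw = syscall(SYS_getpid);  /* same request, issued by number */

    return from_raw == (long)from_wrapper;
}
```

Both calls end up in the same kernel routine, which is why the two values should agree.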

In addition, there are four other ways:

  • procfs (/proc)
  • sysctl (/proc/sys)
  • sysfs (/sys)
  • netlink socket

procfs (/proc)

procfs is short for Process File System. It is essentially a pseudo file system. Why is it called a pseudo file system? Because it takes up no space on external storage and only a small amount of memory. It is usually mounted at /proc.

Each file in this directory is backed by kernel data rather than by data on disk. Through this directory, the kernel presents its internal information in the form of files. In effect, /proc is a bridge between user mode and kernel mode: when a user-mode program reads or writes files under /proc, it is reading or writing kernel information and configuration parameters.

For example, the commonly used /proc/cpuinfo, /proc/meminfo, and /proc/net provide information about the CPU, memory, and network, respectively. There are many other entries as well, as follows:

root@ubuntu:~# ls /proc/
1 1143 1345 1447 2 2292 29 331 393 44 63 70 76 acpi diskstats irq locks sched_debug sysvipc zoneinfo
10 1145 1357 148 20 23 290 332 396 442 64 7019 77 asound dma kallsyms mdstat schedstat thread-self
1042 1149 1361 149 2084 2425 291 34 398 45 65 7029 8 buddyinfo driver kcore meminfo scsi timer_list
1044 1150 1363 15 2087 25 3 3455 413 46 66 7079 83 bus execdomains keys misc self timer_stats
1046 1151 1371 16 2090 256 30 35 418 47 6600 7080 884 cgroups fb key-users modules slabinfo tty
1048 1153 1372 17 21 26 302 36 419 5 67 71 9 cmdline filesystems kmsg mounts softirqs uptime
11 1190 1390 18 22 27 31 37 420 518 6749 72 96 consoles fs kpagecgroup mtrr stat version
1126 12 143 182 2214 28 32 373 421 524 68 73 97 cpuinfo interrupts kpagecount net swaps version_signature
1137 1252 1434 184 2215 280 327 38 422 525 69 74 98 crypto iomem kpageflags pagetypeinfo sys vmallocinfo
1141 13 144 190 2262 281 33 39 425 5940 7 75 985 devices ioports loadavg partitions sysrq-trigger vmstat

As you can see, many entries are named with numbers. These are directories for the processes currently running on the system, and each number is a process ID (PID). Each directory contains information about that process, including its status, file descriptors, memory mappings, and so on. Let's take a look:

root@ubuntu:~# ls /proc/1/
attr/ cmdline environ io mem ns/ pagemap schedstat stat timers
autogroup comm exe limits mountinfo numa_maps personality sessionid statm uid_map
auxv coredump_filter fd/ loginuid mounts oom_adj projid_map setgroups status wchan
cgroup cpuset fdinfo/ map_files/ mountstats oom_score root/ smaps syscall
clear_refs cwd/ gid_map maps net/ oom_score_adj sched stack task/
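
The numeric entries can also be enumerated from a program. The following is a minimal sketch in C, assuming procfs is mounted at /proc; the helper names is_all_digits and count_pid_dirs are hypothetical and used only for illustration.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if name is non-empty and consists only of
   decimal digits, 0 otherwise. */
int is_all_digits(const char *name)
{
    if (*name == '\0')
        return 0;
    for (; *name != '\0'; name++) {
        if (!isdigit((unsigned char)*name))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: prints every numeric entry (one per process) found
   in proc_path and returns how many there were, or -1 if the directory
   cannot be opened. */
int count_pid_dirs(const char *proc_path)
{
    DIR *dir = opendir(proc_path);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        if (is_all_digits(entry->d_name)) {
            printf("pid %s\n", entry->d_name);
            count++;
        }
    }
    closedir(dir);
    return count;
}
```

Calling count_pid_dirs("/proc") should report at least one entry, since the calling process itself has a directory there.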

To sum up, the kernel exposes its own information and configuration through individual files. Some of these files are read-only, some are writable, and some change dynamically, such as the process directories. Kernel code registers each /proc file in advance, together with a set of handler functions. When an application reads a /proc file, the kernel calls those handlers, which copy the corresponding kernel data into user space, so the application obtains kernel information simply by reading the file.

[Figure: rough diagram of this read path]
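
From the user-mode side, the read path described above looks like ordinary file I/O. The sketch below reads the first line of a /proc file with the standard C library; the helper name read_first_line is hypothetical, and /proc/version is used only as a convenient example file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the first line of the file at path into buf,
   without the trailing newline. Returns 0 on success, -1 on failure. */
int read_first_line(const char *path, char *buf, size_t size)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)size, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0'; /* strip the newline, if any */
    return 0;
}
```

For instance, read_first_line("/proc/version", line, sizeof(line)) fills line with the kernel version banner. Nothing is read from disk: the kernel generates the text at the moment of the read.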

sysctl

sysctl is familiar to most of us as a Linux command; man sysctl describes its function and usage. It is mainly used to modify kernel parameters at runtime, that is, while the kernel is running.

Under the hood, it uses file reads and writes to communicate between user mode and kernel mode, through /proc/sys, a subdirectory of /proc. The difference from the rest of procfs is that procfs mainly exports read-only data, while most of the entries exposed through sysctl are writable.

For example, we commonly use cat /proc/sys/net/ipv4/ip_forward to check whether the kernel network layer is allowed to forward IP packets, and echo 1 > /proc/sys/net/ipv4/ip_forward or sysctl -w net.ipv4.ip_forward=1 to enable IP forwarding.

Linux also provides the file /etc/sysctl.conf for setting many parameters at once. The settings in it can be loaded with sysctl -p.
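
The same reads and writes can be done from a program. The sketch below maps a dotted sysctl name to its file under /proc/sys and then reads or writes an integer value. All three helper names are hypothetical. The name mapping is simplified, since it turns every dot into a slash. Writing normally requires root privileges.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turns "net.ipv4.ip_forward" into
   "/proc/sys/net/ipv4/ip_forward". Returns 0 on success, -1 if the result
   does not fit into path. */
int sysctl_name_to_path(const char *name, char *path, size_t size)
{
    int written = snprintf(path, size, "/proc/sys/%s", name);
    char *p;

    if (written < 0 || (size_t)written >= size)
        return -1;
    for (p = path + strlen("/proc/sys/"); *p != '\0'; p++) {
        if (*p == '.')
            *p = '/';
    }
    return 0;
}

/* Hypothetical helper: reads an integer parameter. Returns 0 or -1. */
int read_sysctl_long(const char *name, long *value)
{
    char path[256];
    FILE *fp;
    int matched;

    if (sysctl_name_to_path(name, path, sizeof(path)) != 0)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    matched = fscanf(fp, "%ld", value);
    fclose(fp);
    return matched == 1 ? 0 : -1;
}

/* Hypothetical helper: writes an integer parameter. Returns 0 or -1. */
int write_sysctl_long(const char *name, long value)
{
    char path[256];
    FILE *fp;
    int written;

    if (sysctl_name_to_path(name, path, sizeof(path)) != 0)
        return -1;
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    written = fprintf(fp, "%ld\n", value);
    /* The buffered data reaches the kernel at fclose, so check it too. */
    if (fclose(fp) != 0 || written < 0)
        return -1;
    return 0;
}
```

With these helpers, write_sysctl_long("net.ipv4.ip_forward", 1) is roughly what sysctl -w net.ipv4.ip_forward=1 does, and it returns -1 when the caller lacks permission.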

sysfs

sysfs is a virtual file system introduced in Linux 2.6. It is mounted at /sys and likewise uses files to carry communication between user mode and the kernel. Unlike procfs, sysfs takes over the device and driver information that originally lived in procfs and presents it to users in the form of a “device tree”.

Through sysfs, user space can not only read device and driver information from the kernel, but also configure devices and drivers.

Let’s see what’s under /sys:

# ls /sys
block bus class dev devices firmware fs hypervisor kernel module power

As you can see, these entries are closely tied to the computer's devices and drivers. You can explore them further on your own, so I won't go into more detail here.
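
As one concrete example, sysfs attributes are typically small files holding a single value. The sketch below walks /sys/class/net and reads the mtu attribute of each network interface. It assumes sysfs is mounted at /sys; the helper name print_iface_mtus is hypothetical.

```c
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: for every interface directory under class_net_dir
   (normally /sys/class/net), reads the "mtu" attribute file and prints it.
   Returns the number of interfaces read, or -1 if the directory cannot be
   opened. */
int print_iface_mtus(const char *class_net_dir)
{
    DIR *dir = opendir(class_net_dir);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        char path[512];
        FILE *fp;
        long mtu;

        if (entry->d_name[0] == '.')
            continue; /* skip "." and ".." */
        snprintf(path, sizeof(path), "%s/%s/mtu", class_net_dir, entry->d_name);
        fp = fopen(path, "r");
        if (fp == NULL)
            continue; /* not an interface directory, or no mtu attribute */
        if (fscanf(fp, "%ld", &mtu) == 1) {
            printf("%s: mtu %ld\n", entry->d_name, mtu);
            count++;
        }
        fclose(fp);
    }
    closedir(dir);
    return count;
}
```

Calling print_iface_mtus("/sys/class/net") prints one line per interface. Writing a new value to the same attribute file, with sufficient privileges, is how a device setting is changed through sysfs.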

netlink

Netlink is one of the most commonly used ways for Linux user mode and kernel mode to communicate. Its socket interface has been available since the Linux 2.2 kernel series. It is essentially a socket, so the standard socket API also applies to it. For example, to create a netlink socket, you can call the socket function as follows:

#include <asm/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

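/* socket_type is SOCK_RAW or SOCK_DGRAM; netlink_family selects the kernel
   subsystem to talk to, for example NETLINK_ROUTE. */
netlink_socket = socket(AF_NETLINK, socket_type, netlink_family);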

Netlink's flexibility allows it to serve as the messaging channel between the kernel and a variety of user processes, for example for the routing subsystem, the firewall (Netfilter), IPsec security policies, and so on.
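
To make the message exchange concrete, here is a minimal sketch that asks the routing subsystem to dump all network links over a NETLINK_ROUTE socket and counts the replies. The helper name count_links_via_netlink is hypothetical. The 32 KB receive buffer is an assumption that is usually large enough for one datagram of a dump. Error handling is kept to the bare minimum.

```c
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sends an RTM_GETLINK dump request to the kernel and
   returns the number of RTM_NEWLINK replies (one per network interface),
   or -1 on any error. */
int count_links_via_netlink(void)
{
    struct {
        struct nlmsghdr hdr;
        struct ifinfomsg ifi;
    } request;
    union {
        struct nlmsghdr hdr; /* forces suitable alignment */
        char bytes[32768];   /* assumed large enough for one datagram */
    } reply;
    struct sockaddr_nl kernel_addr;
    int links = 0;
    int done = 0;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;

    memset(&request, 0, sizeof(request));
    request.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    request.hdr.nlmsg_type = RTM_GETLINK;
    request.hdr.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    request.hdr.nlmsg_seq = 1;
    request.ifi.ifi_family = AF_UNSPEC;

    memset(&kernel_addr, 0, sizeof(kernel_addr));
    kernel_addr.nl_family = AF_NETLINK; /* nl_pid 0 addresses the kernel */

    if (sendto(fd, &request, request.hdr.nlmsg_len, 0,
               (struct sockaddr *)&kernel_addr, sizeof(kernel_addr)) < 0) {
        close(fd);
        return -1;
    }

    while (!done) {
        int len = (int)recv(fd, reply.bytes, sizeof(reply.bytes), 0);
        struct nlmsghdr *msg;

        if (len <= 0) {
            close(fd);
            return -1;
        }
        /* One datagram may carry several netlink messages. */
        for (msg = &reply.hdr; NLMSG_OK(msg, len); msg = NLMSG_NEXT(msg, len)) {
            if (msg->nlmsg_type == NLMSG_DONE) {
                done = 1;
                break;
            }
            if (msg->nlmsg_type == NLMSG_ERROR) {
                close(fd);
                return -1;
            }
            if (msg->nlmsg_type == RTM_NEWLINK)
                links++;
        }
    }
    close(fd);
    return links;
}
```

On a typical system the result is at least 1, because the loopback interface is always reported. Apart from the netlink address structure and message headers, the calls are the ordinary socket, sendto, recv, and close.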

Further note:

The net-tools utilities use procfs (/proc) and ioctl system calls to read and change the kernel's network configuration, while iproute2 communicates with the kernel through the netlink socket interface. The former is now deprecated, and the latter has become the standard.

Summary

Besides system calls, there are four main ways for Linux user mode and kernel mode to communicate, of which netlink and procfs are the most common.

References:
https://www.ibm.com/developerworks/cn/linux/l-kerns-usrs/index.html
https://fasionchan.com/blog/2017/06/16/procfs-wei-wen-jian-xi-tong-yuan-li/
https://en.wikipedia.org/wiki/Netlink
Reposted from: https://www.cnblogs.com/bakari/p/10966303.html