Linux inter-process communication (IPC): semaphores (sem) and mmap file memory mapping (inter-process file sharing)

1. Semaphore

A semaphore is a primitive used to provide a means of synchronization between different processes or between different threads from a given process.

1.1. Selection of Posix semaphore

1) When a semaphore is shared only among the threads of a single process, a memory-based (unnamed) semaphore can be used.

2) When unrelated processes need to share a semaphore, they usually use a named semaphore.

1.2. Persistence of memory-based semaphores

1) If a memory-based semaphore is shared by the threads of a single process (the pshared argument of sem_init is 0), the semaphore has process persistence: it disappears when the process terminates.

2) If a memory-based semaphore is shared between different processes (the pshared argument of sem_init is 1), the semaphore must be stored in a shared memory area, and it continues to exist as long as that shared memory area exists.

2. Semaphore classification

3. Features

1) A mutex must always be unlocked by the thread that locked it, but a semaphore post does not have to be performed by the same thread that performed the wait operation.

2) A semaphore has a state (its count value) associated with it, so a post operation on a semaphore is always remembered. By contrast, when a condition variable is signaled and no thread is waiting on it, the signal is lost (that is, if a thread calls pthread_cond_signal while no thread is blocked in pthread_cond_wait, the signal sent to that condition variable is lost).

3) When a process holding a semaphore lock terminates without releasing it, the kernel does not automatically post the semaphore. This differs from record locks, which the kernel releases automatically when the process holding them terminates without releasing them.
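Feature 2 can be checked with a small sketch. The helper name post_is_remembered is hypothetical; the code posts a semaphore while nobody is waiting and then shows that a later non-blocking wait still succeeds, while a second one fails because the count is back to 0.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper: returns 0 if the post was remembered as expected. */
int post_is_remembered(void)
{
    sem_t sem;
    if (sem_init(&sem, 0, 0) == -1)     /* thread-shared, initial value 0 */
        return -1;

    /* Post while no thread is waiting: the count goes from 0 to 1. */
    sem_post(&sem);

    /* A later wait finds the count at 1 and succeeds without blocking. */
    int first = sem_trywait(&sem);

    /* The count is 0 again, so a second non-blocking wait fails. */
    int second = sem_trywait(&sem);
    int second_errno = errno;

    sem_destroy(&sem);
    return (first == 0 && second == -1 && second_errno == EAGAIN) ? 0 : -1;
}
```

With a condition variable the equivalent early pthread_cond_signal would have no effect on a thread that starts waiting afterwards.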

4. Posix semaphore function

sem_open() function
Create a new named semaphore or open an existing named semaphore.

sem_close() function
Close the named semaphore opened by sem_open().

sem_unlink() function
Remove the named semaphore from the system.

sem_wait() function
Wait for the semaphore, and if the value is greater than 0, decrement it by 1 and return immediately. If the value is equal to 0, the calling thread is put to sleep until the value becomes greater than 0, at which time it is decremented by 1 and the function returns.

sem_trywait() function
Behaves like sem_wait(), except that when the value of the specified semaphore is 0 the calling thread is not put to sleep. Instead, the call fails with the error EAGAIN.

sem_post() function
Increments the specified semaphore by 1 and, if any threads are waiting for the semaphore to become positive, wakes one of them up.

int sem_getvalue(sem_t *sem, int *sval);
Parameter sem: the semaphore to query.
Parameter sval: receives the current value of the specified semaphore. If the semaphore is currently locked, the value stored is either 0 or a negative number whose absolute value is the number of threads waiting for the semaphore to be unlocked (Linux stores 0).

sem_init() function
Memory-based semaphore initialization.

sem_destroy() function
Destroy a memory-based semaphore.
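The named-semaphore functions above fit together roughly as in the sketch below. The helper name named_sem_demo and the semaphore name "/demo_sem_<pid>" are assumptions made for illustration; error checks on the wait and post calls are omitted for brevity.

```c
#include <fcntl.h>      /* O_CREAT, O_EXCL */
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: returns the final semaphore value, or -1 on error. */
int named_sem_demo(void)
{
    /* Illustrative name; named semaphores must start with '/'. */
    char name[64];
    snprintf(name, sizeof name, "/demo_sem_%ld", (long)getpid());

    /* Create the semaphore with initial value 1 (usable as a lock). */
    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, 1);
    if (sem == SEM_FAILED)
        return -1;

    int during = -1, after = -1;

    sem_wait(sem);                  /* 1 -> 0: lock taken */
    sem_getvalue(sem, &during);     /* stores 0 on Linux */
    sem_post(sem);                  /* 0 -> 1: lock released */
    sem_getvalue(sem, &after);

    sem_close(sem);                 /* close this process's handle */
    sem_unlink(name);               /* remove the name from the system */

    return (during == 0) ? after : -1;
}
```

An unrelated process could open the same semaphore with sem_open(name, 0) as long as sem_unlink() has not yet been called.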

5. System V semaphore

Function description

semget() function
Create a semaphore set or access an existing semaphore set.

semop() function
Operate on semaphores.
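A minimal sketch of the two calls follows. It also uses semctl(), which the text above does not describe, to set and read the value and to remove the set; the helper name sysv_sem_demo is hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller has to define union semun itself (see semctl(2)). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical helper: returns 0 if lock and unlock behaved as expected. */
int sysv_sem_demo(void)
{
    /* Create a private set containing one semaphore. */
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun arg;
    arg.val = 1;                                /* initial value 1 */
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    struct sembuf lock   = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
    struct sembuf unlock = { .sem_num = 0, .sem_op =  1, .sem_flg = 0 };

    if (semop(semid, &lock, 1) == -1) {         /* 1 -> 0 */
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    int held = semctl(semid, 0, GETVAL);

    semop(semid, &unlock, 1);                   /* 0 -> 1 */
    int released = semctl(semid, 0, GETVAL);

    semctl(semid, 0, IPC_RMID);                 /* remove the set */
    return (held == 0 && released == 1) ? 0 : -1;
}
```

Unlike POSIX semaphores, a System V set stays in the kernel until it is removed explicitly, so the IPC_RMID call matters.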

2. mmap

1. Introduction to mmap

mmap is a method of mapping files into memory: it maps a file or other object into the address space of a process, establishing a one-to-one correspondence between a range of the file on disk and a range of virtual addresses in the process. Once the mapping is established, the process can read and write this memory through pointers, and the system automatically writes dirty pages back to the corresponding file on disk; that is, file operations are completed without calling system calls such as read and write. Conversely, modifications that the kernel makes to this area are directly visible in user space, which enables file sharing between different processes.
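The idea of modifying a file through a pointer can be sketched as follows. The helper name mmap_write_demo and the temporary file under /tmp are assumptions for illustration; the file is changed through the mapping and then read back with read() to confirm the change.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the write through the mapping
 * is visible when the file is read back. */
int mmap_write_demo(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);               /* the file lives on until fd is closed */

    const char original[] = "hello mmap";
    size_t len = sizeof original - 1;
    if (write(fd, original, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(p, "HELLO", 5);      /* modify the file with a plain memory write */
    msync(p, len, MS_SYNC);     /* flush the dirty page to the file */
    munmap(p, len);

    char buf[16] = {0};
    lseek(fd, 0, SEEK_SET);
    ssize_t n = read(fd, buf, sizeof buf - 1);
    close(fd);

    return (n == (ssize_t)len && strcmp(buf, "HELLO mmap") == 0) ? 0 : -1;
}
```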

2. The mmap() function

mmap() maps memory in units of PAGE_SIZE (pages); memory can only be mapped in whole pages. To map a range that is not an integer multiple of PAGE_SIZE, first align it: round the starting offset down to a page boundary and treat the length as rounded up to a multiple of PAGE_SIZE.
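The alignment arithmetic can be sketched with two small helpers. The function names are hypothetical; the page size is assumed to be obtained at run time with sysconf(_SC_PAGESIZE) rather than hard-coded.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Page size of the running system, as reported by sysconf(). */
long system_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Round a file offset down to the start of the page that contains it. */
off_t page_floor(off_t offset, long page_size)
{
    return offset - (offset % page_size);
}

/* Round a length up to a whole number of pages. */
size_t page_ceil(size_t length, long page_size)
{
    size_t ps = (size_t)page_size;
    return (length + ps - 1) / ps * ps;
}
```

To map `length` bytes starting at an unaligned `offset`, one approach is to pass page_floor(offset, ps) as the offset, add the difference (offset minus the aligned offset) to the length, and add the same difference to the returned pointer to reach the requested byte.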
The mmap operation provides a mechanism that allows user programs to directly access device memory. This mechanism is more efficient than copying data between user space and kernel space. It is commonly used in applications requiring high performance.
Stream-oriented devices cannot perform mmap. The implementation of mmap is related to the hardware.

Header file:

 #include <sys/mman.h>

Function declaration:

void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);
int munmap(void *start, size_t length);
  • start: The starting address of the mapping area. When set to 0, it means that the system determines the starting address of the mapping area.

  • length: The length of the mapping area.

  • prot: The desired memory protection, which must not conflict with the mode in which the file was opened. It is either PROT_NONE or a combination of the other values below, joined with the bitwise OR operator (“|”).

    PROT_EXEC //Page content can be executed
    PROT_READ //Page content can be read
    PROT_WRITE //Page can be written
    PROT_NONE //Page is not accessible

  • flags: Specifies the type of the mapped object, the mapping options, and whether the mapped pages can be shared. Its value is a combination of one or more of the following bits.

    MAP_FIXED //Use the specified mapping start address, which must fall on a page boundary. If the memory area specified by the start and length parameters overlaps an existing mapping, the overlapping part of the existing mapping is discarded. If the specified address cannot be used, the operation fails.
    MAP_SHARED //Share the mapping with all other processes that map this object. Writing to the shared area is equivalent to writing to the file, but the file is not guaranteed to be updated until msync() or munmap() is called.
    MAP_PRIVATE //Create a private copy-on-write mapping. Writes to the memory area do not affect the original file. This flag and MAP_SHARED are mutually exclusive; exactly one of them must be used.
    MAP_DENYWRITE //This flag is ignored.
    MAP_EXECUTABLE //Same as above
    MAP_NORESERVE //Do not reserve swap space for this mapping. When swap space is reserved, modifications to the mapped area are guaranteed to succeed. When swap space is not reserved and memory runs short, a modification to the mapped area raises SIGSEGV.
    MAP_LOCKED //Lock the page in the mapping area to prevent the page from being swapped out of memory.
    MAP_GROWSDOWN // Used for the stack to tell the kernel VM system that the mapping area can be expanded downward.
    MAP_ANONYMOUS //Anonymous mapping, the mapping area is not associated with any file.
    MAP_ANON //An alias for MAP_ANONYMOUS; deprecated.
    MAP_FILE //Compatibility flag, ignored.
    MAP_32BIT //Place the mapping area in the lower 2GB of the process address space. It will be ignored when MAP_FIXED is specified. Currently this flag is only supported on x86-64 platforms.
    MAP_POPULATE //Populate the page tables for the mapping in advance; for a file mapping this reads the file ahead. Subsequent accesses to the mapped area are then not blocked by page faults.
    MAP_NONBLOCK //Only meaningful when used with MAP_POPULATE. No read-ahead is performed, only page table entries are created for pages that already exist in memory.

  • fd: Valid file descriptor. It is generally returned by the open() function, and its value can also be set to -1. In this case, MAP_ANON in the flags parameter needs to be specified, indicating that anonymous mapping is performed.

  • offset: the offset within the mapped object at which the mapping starts. It must be a multiple of the page size.
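The difference between MAP_SHARED and MAP_PRIVATE can be illustrated with a short sketch. The helper name private_mapping_demo and the temporary file are assumptions; the code writes through a MAP_PRIVATE mapping and then checks that the file itself is unchanged.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if a write through a MAP_PRIVATE
 * mapping is visible in memory but does not reach the file. */
int private_mapping_demo(void)
{
    char path[] = "/tmp/mmap_private_XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);

    if (write(fd, "abcd", 4) != 4) {
        close(fd);
        return -1;
    }

    /* start = NULL lets the system choose the address; offset 0 is page aligned. */
    char *p = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 'X';                         /* copy-on-write: private page only */
    int seen_in_memory = (p[0] == 'X');
    munmap(p, 4);

    char buf[4];
    lseek(fd, 0, SEEK_SET);
    ssize_t n = read(fd, buf, sizeof buf);
    close(fd);

    return (seen_in_memory && n == 4 && memcmp(buf, "abcd", 4) == 0) ? 0 : -1;
}
```

Replacing MAP_PRIVATE with MAP_SHARED in the same code would make the change reach the file.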

3. Return value:

On success, mmap() returns a pointer to the mapped area, and munmap() returns 0.
On failure, mmap() returns MAP_FAILED [its value is (void *)-1], munmap() returns -1, and errno is set to one of the following values:
EACCES: Access error
EAGAIN: The file is locked, or too much memory is locked
EBADF: fd is not a valid file descriptor
EINVAL: One or more parameters are invalid
ENFILE: The system limit for open files has been reached
ENODEV: The file system where the specified file is located does not support memory mapping
ENOMEM: Out of memory, or the process has exceeded the maximum number of memory mappings
EPERM: Insufficient privileges, operation not allowed
ETXTBSY: MAP_DENYWRITE was specified, but the file is open for writing
Accessing a mapped area can also raise the following signals:
SIGSEGV: Trying to write to a read-only area
SIGBUS: Trying to access a part of the mapping that lies beyond the end of the mapped file
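A typical failure check compares the result with MAP_FAILED (not NULL) and then inspects errno. The sketch below provokes EBADF on purpose by passing fd -1 without MAP_ANONYMOUS; the helper name mmap_bad_fd_errno is hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: returns the errno of a deliberately failing mmap(),
 * or 0 if the call unexpectedly succeeded. */
int mmap_bad_fd_errno(void)
{
    /* fd = -1 without MAP_ANONYMOUS is not a valid file descriptor. */
    void *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, -1, 0);
    if (p == MAP_FAILED) {
        int saved = errno;      /* save errno before calling anything else */
        fprintf(stderr, "mmap failed: %s\n", strerror(saved));
        return saved;
    }

    munmap(p, 4096);
    return 0;
}
```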

4. Advantages

1. File reads avoid the extra copy between the page cache and a user-space buffer: the process accesses the cached pages directly, which reduces the number of data copies, replaces I/O reads and writes with memory reads and writes, and improves file reading efficiency.

2. Achieve efficient interaction between user space and kernel space. The respective modification operations of the two spaces can be directly reflected in the mapped area, thereby being captured in time by the other space.

3. Provide a way for processes to share memory and communicate with each other. Whether they are parent and child or unrelated processes, they can map the same file into their own user space, or anonymously map the same area. Inter-process communication and sharing are then achieved through each process's changes to the mapped area. In addition, if process A and process B both map area C, then when A reads C for the first time, the file page is copied from disk into memory through a page fault; when B later reads the same page of C, a page fault also occurs, but the page does not need to be copied from disk again, because the file data already held in memory can be used directly.

4. Can be used to achieve efficient large-scale data transmission. Insufficient memory space is an aspect that restricts big data operations. The solution is often to use hard disk space to assist operations and supplement the lack of memory. However, it will further cause a large number of file I/O operations, which greatly affects efficiency. This problem can be solved well through mmap mapping. In other words, mmap can play its role whenever disk space needs to be used instead of memory.