Linux inter-process communication: pipes and memory mapping

1. Concepts related to inter-process communication

1. What is inter-process communication

In Linux, each process has its own independent virtual address space, so the user address space of one process is invisible to every other process. A global variable in one process cannot be accessed from another, which means processes cannot read each other's data directly. To exchange data they must go through the kernel: the kernel provides a buffer, process 1 copies data from its user space into that kernel buffer, and process 2 then reads the data out of it. This kernel-provided mechanism is called inter-process communication (IPC).


2. Methods of inter-process communication

To transfer data between processes, you use mechanisms provided by the operating system, such as files, pipes, signals, shared memory, message queues, sockets, and named pipes (FIFOs). Some older mechanisms have fallen out of use because of design flaws. The ones in common use today include:

Pipes (easiest to use)
Signals (lowest overhead)
Shared mapping areas (work between unrelated processes)
Local sockets (most stable)

2. Pipes (pipe)

1. The concept of a pipe

A pipe, also called an anonymous pipe, is the most basic IPC mechanism. It transfers data between related processes, that is, processes that share a common ancestor, such as a parent and its child. A pipe is created by calling the pipe function.

Pipe features:
- A pipe is essentially a kernel buffer.
- It is referenced through two file descriptors, one for the read end and one for the write end.
- Data flows into the pipe at the write end and out of the pipe at the read end.
- The pipe is destroyed automatically once every process has closed its descriptors for it, which happens at the latest when the processes using it terminate.
- Both the read end and the write end are blocking by default.

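2. How a pipe works

A pipe is essentially a kernel buffer, implemented internally as a ring queue.
The "pipe size" reported by ulimit -a (8 blocks of 512 bytes, that is 4 KB) is PIPE_BUF, the largest write that is guaranteed to be atomic. It is not the capacity of the buffer: on Linux since kernel 2.6.11 the default capacity is 64 KB (16 pages of 4 KB).
The capacity is not fixed. On Linux a process can change it with fcntl(fd, F_SETPIPE_SZ, size), up to the limit in /proc/sys/fs/pipe-max-size.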

3. Limitations of pipes

Once data has been read, it is removed from the pipe and cannot be read again.
Data flows in one direction only. Two-way communication requires two pipes.
Pipes can only be used between related processes (processes with a common ancestor).

4. Creating a pipe: the pipe function

Function description: creates a pipe
Function prototype: int pipe(int fd[2]);
Function parameters: fd is an output parameter. On success, fd[0] holds the file descriptor for the read end of the pipe and fd[1] holds the file descriptor for the write end.
Function return value:
- Success: returns 0
- Failure: returns -1 and sets errno

Note:

All reads and writes on the pipe go through these two file descriptors: read from fd[0] and write to fd[1]. Reading and writing a pipe is in essence operating on the kernel buffer.

5. Parent-child communication through a pipe

Right after the pipe is created, the creating process (the parent) holds both the read end and the write end. How does this become communication between parent and child?
After creating a pipe with pipe(), a process usually calls fork() to create a child and then communicates with the child through the pipe. The child inherits copies of the parent's file descriptors, so both processes refer to the same pipe. This is why any two processes that share the ancestor that created the pipe can communicate through it, while unrelated processes cannot: they have no way to obtain the two file descriptors returned by pipe().
Steps for pipe communication between parent and child:
- Step 1: The parent process creates the pipe.
- Step 2: The parent process forks the child process.
- Step 3: The parent process closes fd[0] (the read end), and the child process closes fd[1] (the write end).

Summary of creation steps:

1. The parent process calls the pipe function to create a pipe and obtains two file descriptors fd[0] and fd[1], which point to the read end and write end of the pipe respectively.

2. When the parent process calls fork() to create a child process, the child process will also have two file descriptors pointing to the same pipe.

3. The parent process closes the read end of the pipe, and the child process closes the write end of the pipe. In this way, the parent process writes data on the writing end of the pipe, and the child process reads the data on the reading end of the pipe, thereby realizing communication between the parent and child processes.

Example: parent-child communication through a pipe

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    // Create the pipe: fd[0] is the read end, fd[1] is the write end
    int fd[2];
    int ret = pipe(fd);
    if (ret < 0)
    {
        perror("pipe error");
        return -1;
    }

    // Create the child process
    pid_t pid = fork();
    if (pid < 0)
    {
        perror("fork error");
        return -1;
    }
    else if (pid > 0) // parent process: the writer
    {
        // Close the unused read end
        close(fd[0]);
        sleep(5); // the child's read() blocks until data arrives
        if (write(fd[1], "hello world", strlen("hello world")) < 0)
        {
            perror("write error");
        }
        close(fd[1]);

        wait(NULL);
    }
    else // child process: the reader
    {
        // Close the unused write end
        close(fd[1]);

        char buf[64];
        memset(buf, 0x00, sizeof(buf));
        // Leave room for the terminating '\0'
        int n = read(fd[0], buf, sizeof(buf) - 1);
        printf("read over, n==[%d], buf==[%s]\n", n, buf);
        close(fd[0]);
    }

    return 0;
}

Example: implementing ps aux | grep bash with a pipe between parent and child

This example uses the execlp function and the dup2 function.

//Use pipe to complete ps aux | grep bash operation
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    // Create the pipe
    int fd[2];
    int ret = pipe(fd);
    if (ret < 0)
    {
        perror("pipe error");
        return -1;
    }

    // Create the child process
    pid_t pid = fork();
    if (pid < 0)
    {
        perror("fork error");
        return -1;
    }
    else if (pid > 0) // parent process: becomes "ps aux"
    {
        // Close the unused read end
        close(fd[0]);

        // Redirect standard output to the write end of the pipe
        dup2(fd[1], STDOUT_FILENO);
        close(fd[1]); // the copy on STDOUT_FILENO is enough

        execlp("ps", "ps", "aux", (char *)NULL);

        // Only reached if execlp fails
        perror("execlp error");
        return -1;
    }
    else // child process: becomes "grep bash"
    {
        // Close the unused write end
        close(fd[1]);

        // Redirect standard input to the read end of the pipe
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]); // the copy on STDIN_FILENO is enough

        execlp("grep", "grep", "--color=auto", "bash", (char *)NULL);

        // Only reached if execlp fails
        perror("execlp error");
        return -1;
    }

    return 0;
}


6. Pipe reading and writing behavior

Read operation
- The pipe contains data
  - read returns normally with the number of bytes read
- The pipe is empty
  - All write ends are closed
    - read returns 0 immediately, just like reaching the end of a file
  - At least one write end is still open
    - read blocks
Write operation
- All read ends are closed
  - The pipe is broken: the kernel sends SIGPIPE to the writing process, and the default action terminates it. If SIGPIPE is ignored or handled, write returns -1 with errno set to EPIPE.
- At least one read end is still open
  - The buffer is full
    - write blocks
  - The buffer is not full
    - write continues

7. Making a pipe non-blocking

By default, both ends of a pipe are blocking. To make the read end (or the write end) non-blocking, use the following three steps: get the current flags, add O_NONBLOCK, and set the flags again.

int flags = fcntl(fd[0], F_GETFL, 0);
flags |= O_NONBLOCK;
fcntl(fd[0], F_SETFL, flags);

With the read end set to non-blocking:
- The write end is open and the pipe is empty: read returns -1 and sets errno to EAGAIN.
- The write end is open and the pipe contains data: read returns the number of bytes actually read.
- The write end is closed and the pipe contains data: read returns the number of bytes actually read.
- The write end is closed and the pipe is empty: read returns 0.

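8. Checking the pipe buffer size

Command:

ulimit -a

Function:

long fpathconf(int fd, int name);

printf("PIPE_BUF == [%ld]\n", fpathconf(fd[0], _PC_PIPE_BUF));

printf("PIPE_BUF == [%ld]\n", fpathconf(fd[1], _PC_PIPE_BUF));

Note that both the "pipe size" line of ulimit -a and fpathconf with _PC_PIPE_BUF report PIPE_BUF, the maximum number of bytes that a single write is guaranteed to deliver atomically (4096 on Linux). The total capacity of the pipe buffer is larger: 64 KB by default on current Linux kernels.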

Exercise: Set the read end of the pipe to non-blocking and check the pipe buffer size

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <fcntl.h>

int main(void)
{
    // Create the pipe
    int fd[2];
    int ret = pipe(fd);
    if (ret < 0)
    {
        perror("pipe error");
        return -1;
    }
    // _PC_PIPE_BUF reports PIPE_BUF (the atomic write limit), not the buffer capacity
    printf("PIPE_BUF==[%ld]\n", fpathconf(fd[0], _PC_PIPE_BUF));
    printf("PIPE_BUF==[%ld]\n", fpathconf(fd[1], _PC_PIPE_BUF));

    // To test the "pipe contains data" case, enable this write:
    //write(fd[1], "hello world", strlen("hello world"));

    // Close the write end
    close(fd[1]);

    // Set the read end of the pipe to non-blocking
    int flag = fcntl(fd[0], F_GETFL);
    flag |= O_NONBLOCK;
    fcntl(fd[0], F_SETFL, flag);

    char buf[64];
    memset(buf, 0x00, sizeof(buf));
    // The write end is closed and the pipe is empty, so read returns 0
    int n = read(fd[0], buf, sizeof(buf) - 1);
    printf("read over, n==[%d], buf==[%s]\n", n, buf);
    close(fd[0]);

    return 0;
}

3. FIFO

1. The concept of FIFO

A FIFO is often called a named pipe to distinguish it from an anonymous pipe. An anonymous pipe works only between related processes, but a FIFO lets unrelated processes exchange data as well.

A FIFO is one of the basic file types in Linux (ls -l shows its type as p). A FIFO file occupies no data blocks on disk and its size is 0; it only names a channel inside the kernel. Processes open this file and read or write it, which actually reads and writes a kernel buffer, and this is how they communicate.

2. Creating a FIFO

Method 1: the mkfifo command
- Command format: mkfifo name
- For example: mkfifo myfifo

Method 2: the mkfifo() function
- Function prototype: int mkfifo(const char *pathname, mode_t mode);

Once a FIFO has been created, open it with the open function. The usual file I/O functions work on a FIFO: close(), read(), write(), unlink(), and so on.

A FIFO is strictly first in, first out: reads always return the oldest data, and writes always append to the end. Because it behaves like a queue, data cannot be inserted in the middle, and positioning operations such as lseek() are not supported.

3. Use FIFO to complete communication between two processes


Idea:
- Process A:
  1. Create a FIFO file: myfifo
  2. Call the open function to open the myfifo file
  3. Call the write function to write a string such as: “hello world” (in fact, the data is written to the kernel buffer, and the myfifo file size is 0)
  4. Call the close function to close the myfifo file
- Process B:
  1. Call the open function to open the myfifo file
  2. Call the read function to read the file content (actually reading data from the kernel buffer)
  3. Print the content read
  4. Call the close function to close the myfifo file

Implementation:

Process A:

//fifo completes the test of communication between two processes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
    // Create the FIFO file if it does not exist yet
    int ret = access("./myfifo", F_OK);
    if (ret != 0)
    {
        ret = mkfifo("./myfifo", 0777);
        if (ret < 0)
        {
            perror("mkfifo error");
            return -1;
        }
    }

    // Open the FIFO. On Linux, O_RDWR keeps open() from blocking while
    // no reader exists; POSIX leaves O_RDWR on a FIFO undefined.
    int fd = open("./myfifo", O_RDWR);
    if (fd < 0)
    {
        perror("open error");
        return -1;
    }

    // Write one message per second to the FIFO
    int i = 0;
    char buf[64];
    while (1)
    {
        memset(buf, 0x00, sizeof(buf));
        snprintf(buf, sizeof(buf), "%d:%s", i, "hello world");
        if (write(fd, buf, strlen(buf)) < 0)
        {
            perror("write error");
            break;
        }
        sleep(1);

        i++;
    }

    // Close the FIFO
    close(fd);

    return 0;
}

Process B:

//fifo completes the test of communication between two processes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
    // Check whether the myfifo file exists; if not, create it
    int ret = access("./myfifo", F_OK);
    if (ret != 0)
    {
        ret = mkfifo("./myfifo", 0777); // the leading 0 means octal
        if (ret < 0)
        {
            perror("mkfifo error");
            return -1;
        }
    }

    // Open the FIFO. On Linux, O_RDWR keeps open() from blocking while
    // no writer exists; POSIX leaves O_RDWR on a FIFO undefined.
    int fd = open("./myfifo", O_RDWR);
    if (fd < 0)
    {
        perror("open error");
        return -1;
    }

    // Read from the FIFO; read blocks until data arrives
    int n;
    char buf[64];
    while (1)
    {
        memset(buf, 0x00, sizeof(buf));
        // Leave room for the terminating '\0'
        n = read(fd, buf, sizeof(buf) - 1);
        if (n <= 0)
        {
            break;
        }
        printf("n==[%d], buf==[%s]\n", n, buf);
    }

    // Close the FIFO
    close(fd);

    return 0;
}

Note:

If only process A creates the myfifo file and process B is started first, process B fails: myfifo does not exist yet, so open cannot find the file.

Solution:

In both process A and process B, first check whether myfifo exists. If it does not, create it, and then open it.

//Check whether the myfifo file exists; if not, create it
int ret = access("./myfifo", F_OK);
if(ret!=0)
{
    ret = mkfifo("./myfifo", 0777); //The leading 0 means octal
    if(ret<0)
    {
        perror("mkfifo error");
        return -1;
    }
}

4. Memory mapping

1. The concept of a memory mapping

Memory-mapped I/O maps a disk file into a region of the process's address space. Reading from that region returns the corresponding bytes of the file, and writing to it writes the data to the file. I/O can therefore be done through pointers, without calling read or write.

To use a memory mapping, first ask the kernel to map the specified file into memory. This is done with the mmap function.


2. mmap function

Function:
- Creates a memory mapping
Function prototype:
- void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
Function parameters:
- addr: the preferred starting address of the mapping; usually NULL, which lets the kernel choose
- length: the number of bytes to map; must be greater than 0
- prot: the protection of the mapping; the most common values are:
  - Read: PROT_READ
  - Write: PROT_WRITE
  - Read and write: PROT_READ | PROT_WRITE
- flags: the type of the mapping, which can be:
  - MAP_SHARED: writes to the mapping are carried through to the file and are visible to other processes that map the same file.
  - MAP_PRIVATE: writes to the mapping trigger copy-on-write; the changes stay private to the process and are not written back to the file.
- fd: the file descriptor returned by open for the file to be mapped
- offset: the offset into the file at which the mapping starts; it must be a multiple of the page size (typically 4 KB) and is usually 0, meaning the start of the file
Function return value:
- Success: returns the starting address of the mapping
- Failure: returns MAP_FAILED and sets errno

3. munmap function

Function:
- Releases a mapping created by the mmap function
Function prototype:
- int munmap(void *addr, size_t length);
Function parameters:
- addr: the starting address of the mapping, as returned by a successful mmap call
- length: the size of the mapping, that is, the second argument passed to mmap
Function return value:
- Success: returns 0
- Failure: returns -1 and sets errno

4. Things to note when using the mmap function

Creating a mapping implies a read of the mapped file, because the file content is loaded into the mapping. The file must therefore be opened with read access.
With MAP_SHARED, the protection of the mapping must not exceed the access mode the file was opened with. With MAP_PRIVATE this does not matter, because writes only change the private in-memory copy.
The mapping is independent of the file descriptor. Once the mapping has been created successfully, the file can be closed immediately.
Special note: a file of size 0 cannot be mapped, so the file used for mapping must have an actual size. Bus errors (SIGBUS) during mmap use are usually caused by accessing a part of the mapping that lies beyond the end of the file.
The address passed to munmap must be the address returned by mmap. Do not pass a pointer that has been incremented.
The file offset must be 0 or a multiple of the page size (typically 4 KB).
mmap can fail for many reasons. Always check the return value and make sure the mapping was created before continuing.

5. Summary of mmap function usage

The first parameter: NULL, so that the kernel chooses the address
The second parameter: the number of bytes to map, which must be > 0
The third parameter: PROT_READ, PROT_WRITE, or PROT_READ | PROT_WRITE
The fourth parameter: MAP_SHARED or MAP_PRIVATE
The fifth parameter: the file descriptor of the opened file
The sixth parameter: 0 or a multiple of the page size (typically 4 KB)

6. Questions related to mmap function

Can a mapping be created on a new file that was just created with O_CREAT in open?

Answer: Not directly. A new file has size 0, and the mapped file must have a size greater than 0. Extend the file first, for example with write or ftruncate, and then create the mapping.

What happens if the file is opened with O_RDONLY and mmap is called with PROT_READ | PROT_WRITE?

Answer: With MAP_SHARED, the mapping asks for more access than the file was opened with, so mmap fails with a permission error (EACCES).

After the mapping has been created, the file descriptor is closed. Does this affect the mapping?

Answer: No. Once the mapping is established, the file descriptor can be closed.

What happens if the file offset is 1000?

Answer: 1000 is not a multiple of the page size (typically 4 KB), so mmap fails with an invalid argument error (EINVAL).

What happens if mem is accessed out of bounds?

Answer: The access is illegal. Touching an address outside the mapped pages normally kills the process with a signal (SIGSEGV or SIGBUS).

If mem++ is executed, can munmap(mem, ...) succeed?

Answer: No. After mem++ the pointer is no longer the starting address of the mapping, and munmap must be given the address that mmap returned.

Under what circumstances does the mmap call fail?

Answer: For example, when the length is 0, when the mapping requests more access than the file was opened with, or when the offset is not 0 or a multiple of the page size.

What happens if the return value of mmap is not checked?

Answer: mmap may fail and return MAP_FAILED, and every later access through that pointer then goes wrong. Whenever a function returns a pointer, check it before use.

7. Practice

Exercise 1: Use mmap for communication between unrelated processes.

Idea:

Both processes open the same file and call mmap with MAP_SHARED to map it, so that the two processes share one mapping.

Writer:

//Use mmap function to complete communication between two unrelated processes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>

int main(void)
{
    // Open the file to be mapped; it must already exist and hold at least 10 bytes
    int fd = open("./test.log", O_RDWR);
    if (fd < 0)
    {
        perror("open error");
        return -1;
    }

    // Use the file size as the mapping length
    off_t len = lseek(fd, 0, SEEK_END);
    if (len < 10)
    {
        fprintf(stderr, "test.log must be at least 10 bytes long\n");
        close(fd);
        return -1;
    }

    // Create the shared mapping
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED)
    {
        perror("mmap error");
        return -1;
    }
    close(fd); // the mapping stays valid after the descriptor is closed

    // Write through the mapping; MAP_SHARED carries the change to the file
    memcpy(addr, "0123456789", 10);

    munmap(addr, len);
    return 0;
}

Reader:

//Use mmap function to complete communication between two unrelated processes
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>

int main(void)
{
    // Open the same file that the writer mapped
    int fd = open("./test.log", O_RDWR);
    if (fd < 0)
    {
        perror("open error");
        return -1;
    }

    // Use the file size as the mapping length
    off_t len = lseek(fd, 0, SEEK_END);
    if (len < 10)
    {
        fprintf(stderr, "test.log must be at least 10 bytes long\n");
        close(fd);
        return -1;
    }

    // Create the shared mapping
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED)
    {
        perror("mmap error");
        return -1;
    }
    close(fd); // the mapping stays valid after the descriptor is closed

    // Copy the first 10 bytes out of the mapping and print them
    char buf[64];
    memset(buf, 0x00, sizeof(buf));
    memcpy(buf, addr, 10);
    printf("buf=[%s]\n", buf);

    munmap(addr, len);
    return 0;
}

Exercise 2: Use mmap for communication between parent and child processes

Idea:

Call the mmap function to create a shared mapping; it returns the starting address addr of the mapping.
Call the fork function to create a child process; the child inherits the mapping at the same address.
The parent and child processes communicate through the mapping at addr.
Call the munmap function to release the mapping.

//Use mmap function to complete parent-child inter-process communication
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>

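int main(void)
{
    // Open the file to be mapped; it must hold at least strlen("hello world") bytes
    int fd = open("./test.log", O_RDWR);
    if (fd < 0)
    {
        perror("open error");
        return -1;
    }

    off_t len = lseek(fd, 0, SEEK_END);
    if (len < (off_t)strlen("hello world"))
    {
        fprintf(stderr, "test.log is too small\n");
        close(fd);
        return -1;
    }

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    // With MAP_PRIVATE instead, the child would not see the parent's write:
    //void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED)
    {
        perror("mmap error");
        return -1;
    }
    close(fd); // the mapping stays valid after the descriptor is closed

    // Create the child process; it inherits the mapping
    pid_t pid = fork();
    if (pid < 0)
    {
        perror("fork error");
        return -1;
    }
    else if (pid > 0) // parent process: writes into the mapping
    {
        memcpy(addr, "hello world", strlen("hello world"));
        wait(NULL);
    }
    else // child process: reads from the mapping
    {
        sleep(1); // crude synchronization: give the parent time to write
        char *p = (char *)addr;
        // Print only the bytes the parent wrote; the file is not '\0'-terminated
        printf("[%.*s]\n", (int)strlen("hello world"), p);
    }

    munmap(addr, len);
    return 0;
}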

8. Anonymous mapping

An anonymous mapping is not backed by any file. Its contents are initialized to zero. It can only be used for communication between related processes, because an unrelated process has no way to refer to it.

Use the mmap function with MAP_ANONYMOUS to create an anonymous mapping; fd is -1 and offset is 0:

mmap(NULL,4096,PROT_READ|PROT_WRITE,MAP_SHARED|MAP_ANONYMOUS,-1,0);

Exercise: Use anonymous mapping to complete parent-child inter-process communication

//Use mmap anonymous mapping to complete parent-child inter-process communication
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>

int main(void)
{
    // Create a shared anonymous mapping: no file, fd is -1, offset is 0
    void *addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
    {
        perror("mmap error");
        return -1;
    }

    // Create the child process; it inherits the mapping
    pid_t pid = fork();
    if (pid < 0)
    {
        perror("fork error");
        return -1;
    }
    else if (pid > 0) // parent process: writes into the mapping
    {
        memcpy(addr, "hello world", strlen("hello world"));
        wait(NULL);
    }
    else // child process: reads from the mapping
    {
        sleep(1); // crude synchronization: give the parent time to write
        char *p = (char *)addr;
        // The mapping starts zero-filled, so the string is '\0'-terminated
        printf("[%s]\n", p);
    }

    munmap(addr, 4096);
    return 0;
}