007 Linux fork() function

Foreword

This article will introduce you to the fork function in the form of questions.

Article highlights

Regarding the fork function, this article focuses on solving the following questions

Question 1:
Why is the code before fork executed only by the parent process, while the code after fork is executed by both the parent and the child?
Question 2:
1. Since the parent and child processes execute the same code after fork, what is the point of the child process?
2. Why does fork return the pid of the child process to the parent and 0 to the child?
Question 3:
1. Why does fork have two return values?
2. How can the same variable hold different values?
3. After fork, which runs first, the parent or the child?

Introducing the fork() function

The Linux kernel is written mainly in C, so a program creates a process by calling a C function, and that function in turn makes a system call into the kernel. fork is that function, and it is the basis of multi-process programming on Linux. Calling fork creates a child process, and the parent and child can then perform different tasks, which makes concurrent, multi-task programs possible.

fork analysis one

Use the man command (man 2 fork) to view the documentation for fork. The following test program prints the process IDs before and after calling fork:

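#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main()
{
    /* Before fork: only the original (parent) process exists. */
    printf("I am a process, pid:%d ppid:%d\n", getpid(), getppid());

    /* Fork once, outside the loop. Calling fork() in every iteration
       would double the number of processes each second. */
    fork();

    /* After fork: both the parent and the child run this loop. */
    while (1)
    {
        printf("I am a process, pid:%d ppid:%d\n", getpid(), getppid());
        sleep(1);
    }
    return 0;
}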

Observation results

You can observe that the child process has been created (pid is 5584)

Question: Why is the code before fork executed only by the parent process, while the code after fork is executed by both the parent and the child (the last two lines of the observed output come from the parent and the child respectively)? And why doesn't the child process execute the parent's code from the beginning?

The reason: the child process is created with the parent process as a template, and most of the parent's attributes are copied to the child.
fork creates a child process, so the system has one more process. The operating system uses the parent as a template to build a PCB (process control block) for the child, but at this point the child does not get a separate copy of the code and data. Instead, it shares the parent's code and data.
Therefore, after fork, the parent and child execute the same code. The child does not run the parent's code from the beginning. It starts from the instruction right after the fork call, because the attributes copied from the parent include the register state, which records the position of the current instruction and holds temporary data.
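The point above can be checked with a small sketch. The helper name steps_seen_by_child is hypothetical and not part of the original article. It increments a counter once before fork and lets the child report the value it sees through its exit status. If the child restarted from the top of the function, the result would not be what the parent had at the moment of the fork.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the value of `steps` as seen by the child,
   or -1 if fork or waitpid fails. */
int steps_seen_by_child(void)
{
    int steps = 0;
    steps++;                      /* runs once, before fork, in the parent only */

    pid_t id = fork();
    if (id < 0) {
        perror("fork");
        return -1;
    }
    if (id == 0) {
        /* Child: execution continues here, right after fork returns.
           `steps` already holds the value the parent had at the fork. */
        _exit(steps);
    }

    /* Parent: collect the child's exit status. */
    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling steps_seen_by_child() from main and printing the result shows that the child inherited the state built up before fork instead of rebuilding it from scratch.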

fork analysis two

Observe that the second and third lines are the execution results of the parent process and the child process respectively

Question 2:
1. Since the parent and child processes will execute the same code after fork, what is the significance of the child process?
2. Why do the two return values of fork return the pid of the child process to the parent process and 0 to the child process?

Introduction: When fork succeeds, it returns two different values: 0 to the child process, and the pid of the child to the parent process. If fork fails, it returns -1 to the parent and no child is created.
Why create a child process: We want the child to cooperate with the parent to complete work that a single process cannot handle well on its own.
By checking the return value of fork, the parent and child can execute different code, so the child can do a different job from the parent. For example, we can play a game and listen to music at the same time because the two tasks run in different processes.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    printf("I am a parent process, my pid is: %d\n", getpid());

    pid_t id = fork();

    if (id == 0) /* code executed by the child process */
    {
        while (1)
        {
            printf("I am a child process: pid:%d ppid:%d ret:%d, I am performing a download task\n",
                   getpid(), getppid(), id);
            sleep(1);
        }
    }
    else if (id > 0) /* code executed by the parent process */
    {
        while (1)
        {
            printf("I am the parent process: pid:%d ppid:%d ret:%d, I am performing a playback task\n",
                   getpid(), getppid(), id);
            sleep(1);
        }
    }
    else /* id < 0: fork failed, no child was created */
    {
        perror("fork");
        return 1;
    }
    return 0;
}

pid_t id = fork();
Observation results: When fork succeeds, there will be two different return values, 0 is returned to the child process, and the pid of the child process is returned to the parent process.

A parent process can create many child processes, but a child process has exactly one parent. fork therefore returns the child's pid to the parent so that the parent can identify and manage each of its children. The child only needs 0, because it can always obtain its own pid with getpid() and its parent's pid with getppid().
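As a rough illustration of why the parent needs the pid, here is a minimal sketch in which the parent stores every pid that fork returns and later uses waitpid to collect each child. The helper spawn_and_reap and the limit MAX_CHILDREN are hypothetical names chosen for this example. Each child simply exits with its index as a stand-in for real work.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16   /* arbitrary limit for this sketch */

/* Hypothetical helper: forks n children, child i exits with status i,
   and the parent reaps each child by the pid that fork returned.
   Returns the sum of the exit statuses, or -1 on error. */
int spawn_and_reap(int n)
{
    pid_t pids[MAX_CHILDREN];

    if (n < 0 || n > MAX_CHILDREN)
        return -1;

    for (int i = 0; i < n; i++) {
        pid_t id = fork();
        if (id < 0) {
            perror("fork");
            for (int j = 0; j < i; j++)   /* reap the children created so far */
                waitpid(pids[j], NULL, 0);
            return -1;
        }
        if (id == 0)
            _exit(i);                     /* child: finish its "task" and leave */
        pids[i] = id;                     /* parent: remember this child's pid */
    }

    int sum = 0;
    for (int i = 0; i < n; i++) {
        int status = 0;
        if (waitpid(pids[i], &status, 0) == pids[i] && WIFEXITED(status))
            sum += WEXITSTATUS(status);
    }
    return sum;
}
```

With three children the exit statuses are 0, 1 and 2, so spawn_and_reap(3) returns 3. Without the pids returned by fork, the parent would have no direct way to wait for one specific child.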

fork analysis three

1. Why does fork have two return values?
2. How to understand that the same variable will have different values
3. After fork, which process of father and son runs first

1. After the child process is created, it shares the parent's code and data. The return statement inside fork is also code, and by the time it runs the child already exists, so both the parent and the child execute it. That is why fork appears to have two return values: it returns once in each process.

pid_t id = fork();
Print id address

It is observed that the value of id differs between the parent and the child, but its address is the same.
How can the same variable at the same address hold different contents in the parent process and the child process?

2. Processes are independent. Each process has its own PCB, so the processes do not affect each other while running. The code is read-only, so sharing it is safe. The data, however, may be modified, so the parent and child may need different contents.
So how does the system give each process a private copy of the data? The answer is copy-on-write: shared data is copied only when one of the processes writes to it, and the writer then works on its own physical copy. Assigning fork's return value to a variable is a write, so copy-on-write happens there as well, and the variable id ends up with a different value in each process. The printed address is the same because it is a virtual address, which each process maps to its own physical memory.
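The behavior described above can be observed with a short sketch. The helper same_address_different_value and the struct report are hypothetical names used only for this example. The child modifies a variable and sends its value and address to the parent through a pipe, and the parent compares them with its own copy.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* What the child reports back to the parent (hypothetical type). */
struct report {
    int value;        /* value of the variable in the child */
    uintptr_t addr;   /* address of the variable in the child */
};

/* Hypothetical helper: returns 1 if the variable has the same address in
   both processes but different values, 0 otherwise, and -1 on error. */
int same_address_different_value(void)
{
    int value = 100;
    int fd[2];

    if (pipe(fd) < 0)
        return -1;

    pid_t id = fork();
    if (id < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (id == 0) {
        close(fd[0]);
        value = 200;                       /* write: the child gets a private copy */
        struct report r = { value, (uintptr_t)&value };
        ssize_t n = write(fd[1], &r, sizeof r);
        _exit(n == (ssize_t)sizeof r ? 0 : 1);
    }

    close(fd[1]);
    struct report r = { 0, 0 };
    ssize_t n = read(fd[0], &r, sizeof r);
    close(fd[0]);
    waitpid(id, NULL, 0);
    if (n != (ssize_t)sizeof r)
        return -1;

    /* Same virtual address, child saw 200, parent still sees 100. */
    return r.addr == (uintptr_t)&value && r.value == 200 && value == 100;
}
```

The child's write does not change the parent's copy, even though both processes print the same address for the variable.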

3. After the fork, which of the parent and child processes will run first?
The scheduler selects a process from the run queue and gives it the CPU, and whichever process is scheduled first runs first. The order in which the parent and child run after fork is therefore not determined. It depends on the scheduler and on the scheduling information stored in each process's PCB, such as the priority.
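Because the order is not guaranteed, a program that needs a particular order has to enforce it itself. The sketch below shows one common way, under the assumption that the parent should act only after the child has finished: the parent calls waitpid before doing its own work. The helper child_then_parent is a hypothetical name. Both processes write one character into a pipe so that the resulting order can be read back.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes 'C', the parent waits for the child
   and then writes 'P'. On success, out holds the two characters in the order
   they were written plus a terminating '\0', and 0 is returned. */
int child_then_parent(char out[3])
{
    int fd[2];

    if (pipe(fd) < 0)
        return -1;

    pid_t id = fork();
    if (id < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (id == 0) {
        close(fd[0]);
        ssize_t n = write(fd[1], "C", 1);
        _exit(n == 1 ? 0 : 1);
    }

    /* Parent: block until the child has exited, so the order no longer
       depends on which process the scheduler happens to pick first. */
    int ok = waitpid(id, NULL, 0) == id;
    ok = ok && write(fd[1], "P", 1) == 1;
    close(fd[1]);

    ssize_t n = ok ? read(fd[0], out, 2) : -1;
    close(fd[0]);
    if (n != 2)
        return -1;

    out[2] = '\0';
    return 0;
}
```

With the waitpid call in place, the child's character always comes first. Removing it would leave the order up to the scheduler.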

Summary

That’s it for today’s sharing. Later articles will cover process status, priority, the process address space, and related topics. If there are omissions or errors in this article, please point them out!