Linux – process control: create, terminate, wait, replace

Process creation

fork

#include <unistd.h>
pid_t fork(void);

What does the operating system do?

After fork is called, the kernel does the following:

  1. Allocates new memory blocks and kernel data structures for the child process
  2. Copies part of the parent's data structure contents into the child
  3. Adds the child to the system process list
  4. Returns from fork, after which the scheduler can run either process

process = kernel data structures + code and data

Creating a child process therefore means creating the child's kernel data structures (task_struct, mm_struct, page tables, and so on), letting the child share the parent's code, and giving it its own copy of the data.

After fork, the operating system creates a new process (through routines such as copy_process and copy_mem). The child inherits the parent's resources (code segment, data segment, stack, and so on) as well as its register values. From then on, copy-on-write keeps the data of the two processes independent: a page is copied only when one of the processes writes to it.
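The data independence described above can be observed with a small experiment. The following is only a sketch: the function name value_seen_by_parent_after_fork is made up for this illustration, and the variable lives on the stack purely for convenience.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, lets the child overwrite its copy of `value`,
 * and returns the value the parent still sees (or -1 if fork/waitpid failed). */
int value_seen_by_parent_after_fork(void)
{
    int value = 100;

    fflush(NULL);                 /* do not let the child inherit unflushed output */
    pid_t id = fork();
    if (id < 0)
        return -1;

    if (id == 0) {
        /* Child: the write goes to the child's own copy of the page. */
        value = 200;
        printf("child  pid=%d sees value=%d\n", (int)getpid(), value);
        fflush(stdout);
        _exit(0);
    }

    /* Parent: wait first so the two lines always print in the same order. */
    if (waitpid(id, NULL, 0) < 0)
        return -1;
    printf("parent pid=%d sees value=%d\n", (int)getpid(), value);
    return value;
}
```

Called from main, the child line reports value=200 while the parent line still reports value=100, because the child's write never touches the parent's memory.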

do_fork

int sys_fork(struct pt_regs *regs)
{
    return do_fork(SIGCHLD, regs->sp, regs, 0, NULL, NULL);
}

asmlinkage int sparc_do_fork(unsigned long clone_flags,
                             unsigned long stack_start,
                             struct pt_regs *regs,
                             unsigned long stack_size)
{
    unsigned long parent_tid_ptr, child_tid_ptr;
    unsigned long orig_i1 = regs->u_regs[UREG_I1];
    long ret;
    parent_tid_ptr = regs->u_regs[UREG_I2];
    child_tid_ptr = regs->u_regs[UREG_I4];
    ret = do_fork(clone_flags, stack_start,
              regs, stack_size,
              (int __user *) parent_tid_ptr,
              (int __user *) child_tid_ptr);
    /* If we get an error and potentially restart the system
     * call, we're screwed because copy_thread() clobbered
     * the parent's %o1. So detect that case and restore it
     * here.
     */
    if ((unsigned long)ret >= -ERESTART_RESTARTBLOCK)
        regs->u_regs[UREG_I1] = orig_i1;
    return ret;
}

After fork returns, the program counters (PC) of both processes point to the instruction that follows the fork call, although parent and child actually share all of the code. When the child runs, it reads its PC to find the address of the next instruction to execute.
The child can of course send its execution flow back to an earlier point in the code, for example with goto.

An interesting phenomenon

#include <stdio.h>
#include <unistd.h>
#include <cstdlib>

int main()
{
  printf("I am the parent process, I am going to execute fork\n");
  pid_t id = fork();

  // Both processes run this line once after fork; the child returns here on every goto
again: printf("This is the code after fork; the child comes back here each time it jumps\n");

  if(id == 0)
  {
    printf("I am a child process pid = %d, ppid = %d\n", getpid(), getppid());
    for(int i = 0; i < 3; i ++ )
      sleep(1);

    printf("The child process is about to jump\n");
    goto again; // id is still 0, so the child loops forever
    exit(1);    // never reached
  }
  while(1)
  {
    printf("I am the parent process pid = %d, ppid = %d\n", getpid(), getppid());
    for(int i = 0; i < 3; i ++ )
    {
      sleep(1);
    }
    break;
  }
  return 0;
}


After the parent finishes, the shell prompt comes back, yet the program keeps printing every few seconds, and Ctrl + C no longer seems to work. Why?
The explanation is simple. ./a.out starts the parent process, which runs to the end and exits, but the child is still running. Once the parent is gone, the shell resumes serving commands and takes back the terminal, so Ctrl + C is no longer delivered to the child, which keeps printing to the screen. To stop it, send the child a signal with the kill command. The terminated child does not linger as a zombie: because its parent has already exited, it was adopted by init, which reaps it so that its resources are reclaimed.

Process termination

Process exit

A process can exit in three scenarios: the code runs to completion and the result is correct; the code runs to completion and the result is incorrect; the code does not finish because the process terminates abnormally.
The command echo $? prints the exit code of the most recently executed process. The exit code is an 8-bit unsigned number.

Why does main return a value, who receives it, and why is it 0?
The return value reports the program's running status to the operating system or to other programs, such as the parent process.
A return value of 0 indicates success: there is only one way to succeed and no reason needs to be reported, whereas failure can have countless causes, each of which can be given its own nonzero code.
For example, in STM32 development, the initialization function of the MPU6050 DMP library can fail for various reasons, which are distinguished by different positive integers.

How to terminate the process?

#include <unistd.h>
void _exit(int status);

#include <stdlib.h>
void exit(int status);

Normal termination

  1. return from the main function
  2. Call exit or _exit
    Difference: exit flushes the stdio buffers before terminating the process; _exit terminates immediately without flushing.
    exit first runs the handlers registered with atexit, then flushes and closes all open stdio streams, and finally calls _exit.

Although status is an int, only its low 8 bits (status & 0xFF) are passed on to the parent, so exit codes range from 0 to 255. That is why echo $? prints 255 after exit(-1).
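The truncation to 8 bits can be checked from a parent process as well as from the shell. A minimal sketch, where observed_exit_code is a name invented for this example:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that calls exit(code) and returns the
 * exit code the parent observes, or -1 on error. */
int observed_exit_code(int code)
{
    fflush(NULL);                 /* nothing buffered for the child to flush again */
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0)
        exit(code);               /* only the low 8 bits of code survive */

    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Here observed_exit_code(-1) yields 255 and observed_exit_code(256) yields 0, matching what echo $? would show.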

Abnormal termination

The process is terminated by a signal, for example SIGINT sent by Ctrl + C or SIGKILL sent by kill -9.

Process waiting

Why wait?

If a child process exits and its parent does not reap it, the child becomes a zombie process. A zombie cannot be killed, not even with kill -9, because it is already dead, and the kernel data structures it still occupies amount to a memory leak. The zombie goes away only when the parent reaps it, or when the parent itself ends and the child is adopted and reaped by init. Therefore the parent should wait for the child, both to reclaim the child's resources and to obtain its exit information, such as the exit code.

How does the process wait?

#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int* status);
pid_t waitpid(pid_t pid, int* status, int options);

These two functions retrieve the child's exit status, which the kernel keeps in the child's task_struct after the child exits. The exit status is what remains in that data structure once the process is gone.

wait

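  • wait blocks until any child process exits. Its parameter is an output parameter that receives the child's exit information (pass NULL if it is not needed). The return value is the pid of the child that was reaped, or -1 on error.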
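Because wait collects whichever child exits first, a simple loop is enough to reap several children. The sketch below uses a hypothetical helper, spawn_and_reap, whose children exit immediately.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts up to n children that exit immediately, then
 * reaps every child with wait(). Returns how many children were reaped. */
int spawn_and_reap(int n)
{
    fflush(NULL);
    for (int i = 0; i < n; i++) {
        pid_t id = fork();
        if (id < 0)
            break;                /* could not create more children */
        if (id == 0)
            _exit(i);             /* child: exit with its index as the code */
    }

    int reaped = 0;
    int status = 0;
    /* wait() returns -1 (errno ECHILD) once no children are left. */
    while (wait(&status) > 0) {
        if (WIFEXITED(status))
            printf("reaped a child with exit code %d\n", WEXITSTATUS(status));
        reaped++;
    }
    return reaped;
}
```

The order in which the exit codes are printed is not fixed, since wait returns children in the order they happen to finish.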

waitpid

  • pid: -1 means any child process; a value greater than 0 is the pid of the specific child to wait for.
  • status: Output parameter that receives the child's exit status; may be NULL.
  • options: 0 means blocking wait, WNOHANG means non-blocking wait.
  • Return value: greater than 0 is the pid of the child that was reaped; 0 (only with WNOHANG) means the call succeeded but the child has not exited yet; -1 means an error, for example when there is no such child.
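The pid argument and the return values listed above can be exercised as follows. This is a sketch under simple assumptions; spawn_child, exit_code_of, and no_children_left are names made up for the illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts a child that exits at once with `code`.
 * Returns the child's pid, or -1 if fork failed. */
pid_t spawn_child(int code)
{
    fflush(NULL);
    pid_t id = fork();
    if (id == 0)
        _exit(code);
    return id;
}

/* Blocking wait for one specific child (pid > 0, options == 0).
 * Returns its exit code, or -1 on error. */
int exit_code_of(pid_t id)
{
    int status = 0;
    if (waitpid(id, &status, 0) != id || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Returns 1 when waitpid(-1, ...) fails with ECHILD, i.e. there is no
 * child left to wait for. */
int no_children_left(void)
{
    return waitpid(-1, NULL, WNOHANG) == -1 && errno == ECHILD;
}
```

Passing a specific pid lets the parent collect its children in any order it chooses, regardless of which one exited first.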

Process blocking

When scanf or cin is called and no input is available, the process blocks until input arrives. Its task_struct changes from the R state to the S state and is moved from the run queue to a wait queue.

Non-blocking wait

With a non-blocking wait, if the event is not ready the call returns immediately; the process does other work and checks again later. Calling the non-blocking wait interface repeatedly in this way is called polling.
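A polling loop built on WNOHANG might look like the sketch below. The helper name poll_for_child and the one-second intervals are assumptions chosen for the example; a real program would do useful work instead of sleeping between checks.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child sleeps for one second and exits with `code`
 * while the parent polls with WNOHANG. Stores the number of "not ready yet"
 * polls in *not_ready and returns the child's exit code, or -1 on error. */
int poll_for_child(int code, int *not_ready)
{
    *not_ready = 0;

    fflush(NULL);
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        sleep(1);                 /* pretend to be busy */
        _exit(code);
    }

    for (;;) {
        int status = 0;
        pid_t ret = waitpid(id, &status, WNOHANG);
        if (ret < 0)
            return -1;            /* waitpid itself failed */
        if (ret == 0) {
            /* Child still running: the parent is free to do other work. */
            (*not_ready)++;
            printf("child not ready, doing something else\n");
            sleep(1);
            continue;
        }
        /* ret == id: the child has exited and has been reaped. */
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
}
```

The first poll happens right after fork, while the child is still sleeping, so at least one "not ready" result is expected before the exit code comes back.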

The result of waiting

The composition of the exit code

Only the low 16 bits of status are used. On Linux, bits 0-6 hold the number of the signal that terminated the process, bit 7 is the core dump flag (set when a process killed by a signal produced a core dump), and bits 8-15 hold the exit code.

printf("Exit signal = %d, exit code = %d\n", status & 0x7F, (status >> 8) & 0xFF);

For example, when a process is killed with kill -9 followed by its pid, the low seven bits of its status hold signal number 9.
If a process was terminated by a signal, it ended abnormally and its exit code is meaningless, so only the signal number matters.
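Instead of shifting and masking by hand, the macros from <sys/wait.h> (WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG) can decode status portably. The sketch below prints both views side by side; child_status and describe_status are hypothetical helpers, and the manual bit layout is the Linux one.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs a child that either sends itself `sig`
 * (when sig > 0) or exits with `code`. Returns the raw status from
 * waitpid, or -1 on error. */
int child_status(int sig, int code)
{
    fflush(NULL);
    pid_t id = fork();
    if (id < 0)
        return -1;
    if (id == 0) {
        if (sig > 0)
            raise(sig);           /* e.g. SIGKILL ends the child right here */
        _exit(code);
    }

    int status = 0;
    return waitpid(id, &status, 0) < 0 ? -1 : status;
}

/* Prints the hand-decoded fields next to the results of the macros. */
void describe_status(int status)
{
    printf("manual: signal=%d core=%d exit code=%d\n",
           status & 0x7F, (status >> 7) & 1, (status >> 8) & 0xFF);
    if (WIFEXITED(status))
        printf("macros: exited normally with code %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("macros: terminated by signal %d\n", WTERMSIG(status));
}
```

For instance, describe_status(child_status(SIGKILL, 0)) reports a termination by signal, while describe_status(child_status(0, 3)) reports a normal exit with code 3.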

Process replacement

What is process replacement?

After the fork described earlier, parent and child share the same code. What if the child should run a different program?

  1. Process replacement does not create a new process: it only replaces the process's code and data with those of the specified executable. The process's PCB stays in place, so it is still the same process and the pid does not change.
  2. If the replacement succeeds, the new program starts running and the code after the exec call is never executed, because the replacement overwrites the original program image. If the replacement fails, the exec call returns and the code after it runs.
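The two points above lead to the usual fork, exec, wait pattern, sketched below. The helper name run_in_child is invented for this example, and the use of exit code 127 to signal a failed exec is a convention chosen here, not something the exec functions impose.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: replaces a child process with `shell_path -c command`
 * and returns the exit code the parent collects. 127 means the replacement
 * failed; -1 means fork or waitpid failed. */
int run_in_child(const char *shell_path, const char *command)
{
    fflush(NULL);
    pid_t id = fork();
    if (id < 0)
        return -1;

    if (id == 0) {
        execl(shell_path, "sh", "-c", command, (char *)NULL);
        /* Reached only when execl failed: the old program is still in place. */
        perror("execl");
        _exit(127);
    }

    /* The parent keeps running its own code and waits for the child. */
    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Assuming /bin/sh exists, run_in_child("/bin/sh", "exit 7") returns 7, whereas a nonexistent path makes execl return and the child falls through to _exit(127).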

Principle

The new program is loaded from disk into memory, and the process's page table is rebuilt so that its code segment and data segment map to the new program instead of the old one. When this happens in a child created by fork, the shared mappings are broken in the same spirit as copy-on-write, so the code and data of parent and child become completely separate.

Interface for process replacement

#include <unistd.h>
int execl (const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char * const envp[]);

int execv (const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execve(const char *path, char *const argv[], char *const envp[]);

execl

  • path: Path of the program to run, for example /usr/bin/ls; a relative path also works.
  • arg: The variable arguments are the program's command-line arguments, starting with argv[0], exactly as they would be typed in the shell (ls -a -l). The list must end with NULL, which marks the end of the arguments.
  • Return value: usually there is nothing to check. On success execl does not return; if it does return (always with -1), the replacement failed and the code after the call runs.
  • Call method: execl("/usr/bin/ls", "ls", "-a", "-l", (char *)NULL);

execlp

(p stands for the PATH environment variable)

  • file: Name of the program to run. If it contains no slash, it is searched for in the directories listed in PATH; otherwise it is used as a path, which may be relative.
  • arg: Variable arguments, same as above.
  • Return value: usually no need to check, same as above.
  • Call method: execlp("./hello", "./hello", (char *)NULL);

execv and execvp

The difference from execl is that the variable argument list becomes a NULL-terminated array of strings. execvp additionally searches PATH, just like execlp.

char* commands[] = {(char*)"./hello", NULL};
std::cout << "I want to replace" << std::endl; // flush now: output still buffered at exec time is lost
execv("./hello",commands);
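The same idea in plain C with execvp, which also performs the PATH search, could look like this sketch. run_argv is a hypothetical helper, and 127 is again just the code chosen here to report a failed replacement.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program named by argv[0] in a child process,
 * passing the whole NULL-terminated argv array. Returns the child's exit
 * code, 127 if the replacement failed, or -1 on fork/waitpid errors. */
int run_argv(char *const argv[])
{
    fflush(NULL);
    pid_t id = fork();
    if (id < 0)
        return -1;

    if (id == 0) {
        /* If argv[0] contains no slash, execvp searches the PATH directories. */
        execvp(argv[0], argv);
        _exit(127);               /* only reached when execvp failed */
    }

    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Building the arguments as an array is convenient when their number is only known at run time, for example after splitting a command line typed by the user.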

execle and execve

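execle and execve let you pass the new program's environment variables explicitly through envp, instead of having it inherit the caller's environment.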
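A minimal sketch of passing a custom environment with execle follows. The helper name run_with_env and the variable GREETING are made up for the example, and /bin/sh is assumed to exist.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `/bin/sh -c command` in a child whose environment
 * consists only of the entries in envp. Returns the child's exit code,
 * 127 if the replacement failed, or -1 on fork/waitpid errors. */
int run_with_env(const char *command, char *const envp[])
{
    fflush(NULL);
    pid_t id = fork();
    if (id < 0)
        return -1;

    if (id == 0) {
        /* envp comes after the NULL that ends the argument list. */
        execle("/bin/sh", "sh", "-c", command, (char *)NULL, envp);
        _exit(127);               /* only reached when execle failed */
    }

    int status = 0;
    if (waitpid(id, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With envp set to {"GREETING=hello", NULL}, the command test "$GREETING" = hello succeeds in the child, while variables that exist only in the caller's environment are not visible to it.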