Golang os package: process creation and termination, running external commands

The os package and its subpackage os/exec provide methods for creating processes.

Generally, the os/exec package should be used first. Because the os/exec package relies on the key process creation APIs in the os package, for ease of understanding, we first discuss the process-related APIs in the os package. part.

Creation of process

In Unix, creating a process is achieved through the system call fork (and some of its variants, such as vfork, clone). In the Go language, the system call used to create a process under Linux is clone.

Many times, the system calls fork, execve, wait and exit appear together. Here is a brief introduction to these four system calls and their typical usage.

fork: allows one process (parent process) to create a new process (child process). The specific method is that the new child process is almost a replica of the parent process: the child process obtains a copy of the stack, data segment, heap and execution text segment of the parent process. Think of this as splitting the parent process in two.
exit(status): Terminate a process and return all resources (memory, file descriptors, etc.) occupied by the process to the kernel for reallocation. The parameter status is an integer variable representing the exit status of the process. The parent process can use the system call wait() to obtain this status.
wait( & amp;status) has two purposes: first, if the child process has not called exit() to terminate, then wait will suspend the parent process until the child process terminates; Second, the termination status of the child process is returned through the status parameter of wait.
execve(pathname, argv, envp) loads a new program (pathname is pathname, parameter list is argv, environment variable list is envp) into the memory of the current process. This will discard the existing program text segment and recreate the stack, data segment, and heap for the new program. This action is often referred to as executing a new program.

In the Go language, there is no direct encapsulation of the fork system call. Instead, fork and execve are combined into one to provide syscall.ForkExec. If you want to just call fork, you have to implement it yourself through syscall.Syscall(syscall.SYS_FORK, 0, 0, 0).

Process and related methods

os.Process stores information about the process created by StartProcess.

type Process struct {<!-- -->
Pid int
handle uintptr // handle is accessed atomically on Windows
isdone uint32 // process has been successfully waited on, non zero if true
}

Generally, an instance of Process is created through StartProcess, and the function is declared as follows:

func StartProcess(name string, argv []string, attr *ProcAttr) (*Process, error)

It starts a new process using the provided program name, command line arguments, and attributes. StartProcess is a low-level interface. The os/exec package provides a high-level interface. Generally, you should try to use the os/exec package. If an error occurs, the error type will be *PathError.

The parameter attr is a pointer of type ProcAttr, which is used to provide some attributes for StartProcess to create a new process. The definition is as follows:

type ProcAttr struct {<!-- -->
    // If Dir is non-empty, the child process will enter the directory before creating the Process instance. (That is, set to the current working directory of the child process)
    Dir string
    // If Env is non-empty, it will be used as an environment variable for the new process. Must be in the format of the Environ return value.
    // If Env is nil, the return value of the Environ function will be used.
    Env[]string
    // Files specifies open file objects inherited by the new process.
    //The first three bindings are standard input, standard output, and standard error output.
    // Implementations that depend on the underlying operating system may support additional file objects.
    // nil is equivalent to a file object that is closed when the process starts.
    Files[]*File
    // Operating system specific creation properties.
    // Note that setting this field means that your program may execute abnormally or even fail to compile on some operating systems. This can be done by setting it for a specific system.
    //Look at the definition of syscall.SysProcAttr to know the related attributes used to control the process.
    Sys *syscall.SysProcAttr
}

FindProcess can find a running process by pid. The Process object returned by this function can be used to obtain information about the underlying operating system process. On Unix systems, this function always succeeds, even if the process corresponding to pid does not exist.

func FindProcess(pid int) (*Process, error)

Process provides four methods: Kill, Signal, Wait and Release. Among them, Kill and Signal are related to signals, and Kill actually calls Signal and sends SIGKILL Signal forces the process to exit. Regarding signals, subsequent chapters will explain it specifically.

The Release method is used to release the resources associated with the Process object so that they can be reused in the future. This method only needs to be called when it is certain that Wait has not been called. On Unix, the internal implementation of this method simply sets the Process‘s pid to -1.

Let’s focus on the Wait method.

func (p *Process) Wait() (*ProcessState, error)

In the design of multi-process applications, the parent process needs to know when a child process changes state – the child terminates or is stopped by a signal. The Wait method is a technique for monitoring child processes.

The Wait method blocks until the process exits, then returns a ProcessState describing the status of the process and possible errors. The Wait method releases all resources bound to Process. In most operating systems, Process must be a child of the current process, otherwise an error will be returned.

Take a look at the internal structure of ProcessState:

type ProcessState struct {<!-- -->
pid int // The process's id.
   status syscall.WaitStatus // System-dependent status info.
   rusage *syscall.Rusage
}

ProcessState saves information about a process reported by the Wait function. status records the status reason, which can be determined through the method defined by the syscal.WaitStatus type:

Exited(): Whether to exit normally, such as calling os.Exit;
Signaled(): Whether to terminate after receiving an unhandled signal;
CoreDump(): Whether it terminates after receiving an unhandled signal and generates a coredump file, such as SIGABRT;
Stopped(): Whether stopped due to signal (SIGSTOP);
Continued(): Whether to resume due to receipt of signal SIGCONT;

syscal.WaitStatus also provides other methods, such as getting exit status, signal, stop signal and interrupt (Trap) reason.

Because the internal implementation of Wait under Linux uses the wait4 system call, therefore, ProcessState contains rusage, Various resource information used for statistical processes. Under normal circumstances, the information defined in syscall.Rusage is not used. If you need to use it in practice, you can check the Linux system call getrusage for relevant instructions (getrusage (2)).

The internal fields of the ProcessState structure are private. We can obtain some basic information through the methods it provides, such as: whether the process exits, Pid, whether the process exits normally, process CPU time, user time, etc. .

Implement functions similar to the time command in Linux:

package main

import (
"fmt"
"os"
"os/exec"
"path/filepath"
"time"
)

func main() {<!-- -->
if len(os.Args) < 2 {<!-- -->
fmt.Printf("Usage: %s [command]\
", os.Args[0])
os.Exit(1)
}

cmdName := os.Args[1]
if filepath.Base(os.Args[1]) == os.Args[1] {<!-- -->
if lp, err := exec.LookPath(os.Args[1]); err != nil {<!-- -->
fmt.Println("look path error:", err)
os.Exit(1)
} else {<!-- -->
cmdName=lp
}
}

procAttr := & amp;os.ProcAttr{<!-- -->
Files: []*os.File{<!-- -->os.Stdin, os.Stdout, os.Stderr},
}

cwd, err := os.Getwd()
if err != nil {<!-- -->
fmt.Println("look path error:", err)
os.Exit(1)
}

start := time.Now()
process, err := os.StartProcess(cmdName, []string{<!-- -->cwd}, procAttr)
if err != nil {<!-- -->
fmt.Println("start process error:", err)
os.Exit(2)
}

processState, err := process.Wait()
if err != nil {<!-- -->
fmt.Println("wait error:", err)
os.Exit(3)
}

fmt.Println()
fmt.Println("real", time.Now().Sub(start))
fmt.Println("user", processState.UserTime())
fmt.Println("system", processState.SystemTime())
}

// go build main.go & amp; & amp; ./main ls
//Output:
//
// real 4.994739ms
// user 1.177ms
// system 2.279ms

Run external commands

Running external commands can be done through the os package, as in the previous example. However, the Go standard library has encapsulated a more useful package for us: os/exec, which should be used first to run external commands. It wraps the os.StartProcess function for Easily redirect standard input and output, pipe I/O, and make other adjustments.

Find executable programs

The exec.LookPath function searches for executable programs in the directory specified by PATH. If there is / in file, only Search in the current directory. This function returns the full path or a relative path relative to the current path.

func LookPath(file string) (string, error)

If the executable file is not found in PATH, exec.ErrNotFound is returned.

Cmd and related methods

The Cmd structure represents an external command that is being prepared or being executed after calling Run, Output or CombinedOutput , Cmd instances cannot be reused.

type Cmd struct {<!-- -->
    // Path is the path of the command to be executed.
    // This field cannot be empty (it is also the only field that cannot be empty). If it is a relative path, it will be relative to the Dir field.
    // When initialized through Command, LookPath will be called to obtain the complete path when needed.
    Path string
    
    // Args stores the parameters of the command. The first value is the command to be executed (Args[0]); if it is an empty slice or nil, use {Path} to run.
    // Generally, Path and Args should be set by the Command function.
    Args[]string
    
    // Env specifies the environment variable of the process. If it is nil, the environment variable of the current process is used, that is, os.Environ(), which is generally the environment variable of the current system.
    Env[]string
    
    // Dir specifies the working directory of the command. If it is an empty string, it will be executed in the current working directory of the caller's process.
    Dir string
    
    // Stdin specifies the standard input of the process. If it is nil, the process will read from the empty device (os.DevNull)
    // If Stdin is an instance of *os.File, the process's standard input will point directly to this file
    // Otherwise, the data will be read from Stdin in a separate goroutine, and then the data will be passed to the command through the pipe (that is, after reading the data from Stdin, it will be written to the pipe, and the command can read the data from the pipe ). The Wait method will block until the goroutine stops copying data (because of EOF or other errors, or an error on the write end of the pipe).
    Stdinio.Reader
    
    // Stdout and Stderr specify the standard output and standard error output of the process.
    // If either one is nil, the Run method will associate the corresponding file descriptor to the empty device (os.DevNull)
    // If two fields are the same, at most one thread can write at the same time.
    Stdoutio.Writer
    Stderr io.Writer
    
    // ExtraFiles specifies additional open files inherited by the new process, excluding standard input, standard output, and standard error output.
    // If this field is non-nil, element i will become file descriptor 3 + i.
    //
    // BUG: On OS X 10.6, a child process might inherit an unexpected file descriptor.
    // http://golang.org/issue/2603
    ExtraFiles []*os.File
    
    // SysProcAttr provides optional, operating system-specific sys attributes.
    // The Run method will pass it to the os.StartProcess function as the Sys field of os.ProcAttr.
    SysProcAttr *syscall.SysProcAttr
    
    // Process is a low-level process that only executes once.
    Process *os.Process
    
    // ProcessState contains information about an existing process and is only available after calling Wait or Run.
    ProcessState *os.ProcessState
}

Command

Generally, a Cmd instance should be generated through the exec.Command function:

func Command(name string, arg ...string) *Cmd

This function returns a *Cmd that executes the program specified by name using the given arguments. The returned *Cmd only sets the two fields Path and Args.

If name does not contain a path separator, LookPath will be used to obtain the full path; otherwise, name will be used directly. Arguments arg should not contain the command name.

After getting the *Cmd instance, there are generally two ways to write it:

Call Start(), then call Wait(), and then block until the command execution is completed;
Calling Run() will internally call Start() first, and then call Wait();

Start

func (c *Cmd) Start() error

Starts executing the command contained in c, but does not wait for the command to complete before returning. The Wait method will return the exit status code of the command and release related resources after the command is executed. Internally call os.StartProcess and execute forkExec.

Wait

func (c *Cmd) Wait() error

Wait will block until the execution of the command is completed. The command must be executed through Start first.

If the command is executed successfully, there is no problem with stdin, stdout, and stderr data transmission, and the return status code is 0, and the return value of the method is nil; if the command is not executed or fails, an *ExitError type will be returned. error; otherwise the error returned may indicate an I/O problem.

If c.Stdin is not of type *os.File, Wait will wait until the data is copied from c.Stdin to the process’s standard input.

The Wait method will release related resources after the command returns.

Output

In addition to Run() being a simple way of writing Start + Wait, Output() is also Run( ), plus getting the output of external commands.

func (c *Cmd) Output() ([]byte, error)

It requires that c.Stdout must be nil, and internally assigns bytes.Buffer to c.Stdout. After Run() returns successfully, the result of Buffer will be returned (stdout.Bytes()).

CombinedOutput

Output() only returns the result of Stdout, while CombinedOutput combines Stdout and Stderr The output, that is, Stdout and Stderr are both assigned the same bytes.Buffer.

StdoutPipe, StderrPipe and StdinPipe

In addition to the Output and CombinedOutput introduced above to directly obtain the command output results, you can also return io.ReadCloser through StdoutPipe to get the output; the corresponding StderrPipe gets the error message; and StdinPipe can write data to the command.

func (c *Cmd) StdoutPipe() (io.ReadCloser, error)

The StdoutPipe method returns a pipe associated with the command’s standard output after the command Start is executed. The Wait method will close the pipe after the command ends, so there is generally no need to manually close the pipe. But if an error occurs when calling Wait before all the data has been read from the pipe, it must be closed manually.

func (c *Cmd) StderrPipe() (io.ReadCloser, error)

The StderrPipe method returns a pipe associated with the command’s standard error output after the command Start is executed. The Wait method will close the pipe after the command ends. Generally, there is no need to manually close the pipe. But if an error occurs when calling Wait before all the data has been read from the pipe, it must be closed manually.

func (c *Cmd) StdinPipe() (io.WriteCloser, error)

The StdinPipe method returns a pipe associated with the command’s standard input after the command Start is executed. The Wait method will close the pipe after the command ends. The caller can call the Close method to forcefully close the pipe when necessary. For example, the standard input has been closed before the command execution is completed. At this time, the caller needs to explicitly close the pipe.

Because after Wait, the pipe will be closed, so to use these methods, you can only use the combination of Start + Wait, not Run.

Example of executing external commands

As mentioned earlier, there are two ways to run commands after passing the Cmd instance. Sometimes, we not only simply run commands, but also want to control the input and output of commands. Through the API introduction above, there are several ways to control input and output:

After getting the Cmd instance, directly assign values to its fields Stdin, Stdout and Stderr;
Obtain output via Output or CombinedOutput;
Obtain a pipe through a method with the Pipe suffix for input or output;

Direct assignment of `Stdin`, `Stdout` and `Stderr`

func FillStd(name string, arg ...string) ([]byte, error) {<!-- -->
cmd := exec.Command(name, arg...)
var out = new(bytes.Buffer)

cmd.Stdout = out
cmd.Stderr = out

err := cmd.Run()
if err != nil {<!-- -->
return nil, err
}

return out.Bytes(), nil
}

Use `Output`

func UseOutput(name string, arg ...string) ([]byte, error) {<!-- -->
return exec.Command(name, arg...).Output()
}

Use Pipe

func UsePipe(name string, arg ...string) ([]byte, error) {<!-- -->
cmd := exec.Command(name, arg...)
stdout, err := cmd.StdoutPipe()
if err != nil {<!-- -->
return nil, err
}

if err = cmd.Start(); err != nil {<!-- -->
return nil, err
}

var out = make([]byte, 0, 1024)
for {<!-- -->
tmp := make([]byte, 128)
n, err := stdout.Read(tmp)
out = append(out, tmp[:n]...)
if err != nil {<!-- -->
break
}
}

if err = cmd.Wait(); err != nil {<!-- -->
return nil, err
}

return out, nil
}

Process terminated

The os.Exit() function will terminate the current process, and the corresponding system call is not _exit, but exit_group.

func Exit(code int)

Exit causes the current process to exit with the given status code code. Generally speaking, a status code of 0 indicates success, and a status code other than 0 indicates an error. The process will terminate immediately and the defer function will not be executed.