[Linux System Standard Process Resources] Analysis of the resources required to create a basic process in Linux, and how thread resource usage differs

Contents

Analysis of resources required to create a basic process in Linux
- 1. Memory resources
- 2. CPU time
- 3. File descriptor
- 4. Process ID and parent process ID
- 5. Environment variables and command line parameters
- 6. Kernel data structure
- 7. Memory usage evaluation
  - 7.1 Basic memory requirements
  - 7.2 Specific values of memory usage
  - 7.3 Process ID (PID)
  - 7.4 File Descriptor
  - 7.5 Summary
The difference in resource consumption of threads in processes
Conclusion

Analysis of resources required to create a basic process in Linux

Resource Type	Description	Insights
Memory resources	Stores code and data	Memory is the basis for the existence of the process
CPU time	Executes code	CPU time determines the execution speed of the process
File descriptor	Identifies open files and network connections	Files are a way for processes to interact with the outside world
Process ID	Uniquely identifies a process	The process ID is the "ID card" of the process
Environment variables	Store configuration information	Environment variables affect the behavior of the process

1. Memory resources

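When Linux creates a process, it must set up an address space that holds the code, data, heap and stack of the process. With fork, the child initially shares the parent's pages copy-on-write, so physical memory is only copied when one side writes to a page. The address space typically includes these areas: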

Code Segment
Data Segment
Heap
Stack

Code example:

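#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork(); // Create a new process; returns 0 in the child
    if (pid < 0) {
        return 1; // fork failed, no child was created
    }
    return 0; // Reached by both the parent and the child
}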

2. CPU time

After a process is created, it requires CPU time to execute its code. This involves process scheduling, where the operating system decides which process should use the CPU.

3. File descriptor

Each process has a set of file descriptors that identify open files and network connections.

Code example:

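#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int fd = open("file.txt", O_RDONLY); // Open a file; returns -1 on failure
    if (fd < 0) {
        return 1;
    }
    close(fd); // Release the descriptor when done
    return 0;
}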

4. Process ID and parent process ID

Each process has a unique process ID (PID) and a parent process ID (PPID).

5. Environment variables and command line parameters

The process also needs to store environment variables and command line arguments.

6. Kernel data structure

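The operating system kernel maintains data structures that track the status of each process. In Linux, every process (task) is described by a struct task_struct, which records its state, scheduling information, and pointers to its memory and open-file tables.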

Source code angle:
In the Linux source code, process creation is mainly implemented through the fork system call, specifically in the do_fork function of the kernel/fork.c file (later kernels renamed it, first to _do_fork and then to kernel_clone).

7. Memory usage evaluation

7.1 Basic memory requirements

In a Linux system, the memory footprint of a basic process mainly consists of the following parts:

Code Segment: Stores the machine code of the program.
Data Segment: Stores global variables and static variables.
Heap: Dynamically allocated memory space.
Stack: Stores local variables and function call information.
Kernel Stack: Stores the execution information of the process in the kernel state.
Page Table: Stores mapping information from virtual addresses to physical addresses.

7.2 Specific values of memory usage

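The specific memory usage will vary depending on the system, kernel version and compiler. As a rough lower-bound estimate, the memory footprint of an empty C program on a 64-bit Linux system is approximately as follows:

Code segment: ~4KB (at least one page)
Data segment: about 1KB (rounded up to a whole 4KB page once resident)
Heap: usually empty initially and allocated on demand
Stack: about 8KB resident at start (grows on demand up to the stack limit, 8MB by default on most distributions)
Kernel stack: ~8KB (16KB on x86-64 since Linux 3.15)
Page table: about 4KB per table page (a 64-bit process needs several levels, so at least a few pages)

Total: approximately 25KB with the figures above. A dynamically linked program also maps the C library and the dynamic loader, so the resident memory actually measured is typically much higher.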

Source code angle:
In the Linux source code, the memory layout and allocation of the process are mainly defined in the mm (Memory Management) module. For details, see the mm/mmap.c and mm/page_alloc.c files.

7.3 Process ID (PID)

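The process ID is an integer that uniquely identifies a running process. On Linux, the pid_t type is a signed 4-byte (32-bit) integer, so the PID value itself occupies 4 bytes. The kernel also keeps a small struct pid for each PID in use, so the real bookkeeping cost is somewhat higher than the 4 bytes of the number.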

7.4 File Descriptor

A file descriptor is a non-negative integer used to identify an open file or other I/O resource. In user space it is an int (4 bytes). In the kernel it is an index into the process's file descriptor table, an array in which each entry is a pointer to a struct file describing the open file.

Assuming that a process has N files open, the table needs roughly N pointer-sized entries, that is about N * 8 bytes on a 64-bit system, not counting the struct file objects themselves.

7.5 Summary

Resource type	Memory usage	Description
Code segment, data segment, heap and stack	Several KB to several MB	Store program code and data
Process ID (PID)	4 bytes	Uniquely identifies a process
File descriptor	About N * 8 bytes of table entries on 64-bit	Identifies open files and network connections
Kernel data structure	Several KB (task_struct plus kernel stack)	Stores process status, scheduling information, etc.

This table summarizes the memory resources required to create a basic process in Linux. In general, at least a few tens of KB of memory is required. This does not include other types of resources such as CPU time, environment variables, etc.

The difference in resource consumption of threads in processes

The main differences between threads and processes are as follows in terms of memory and resource usage:

Shared resources: Threads of the same process share one address space, including code segments, data segments, the heap, and open files, so none of these has to be set up again for each thread as it is for each process.
Stack space: Each thread has its own stack, which is its main memory overhead.
Kernel data structures: On Linux every thread has its own task_struct and kernel stack, but the threads of one process share the memory descriptor (mm_struct), the file table (files_struct), and the signal handlers, so creating a thread allocates less than creating a process.
File descriptors: Threads share the same file descriptor table, while each process has its own table (a forked child starts with a copy of its parent's).
Register state: Each thread has its own saved register state (program counter, stack pointer, general-purpose registers), which the kernel saves and restores on every context switch. This state is kept per thread and is small compared with a stack.

Overall, threads generally save more memory and other resources than processes, which is why threads are often more popular in applications that require high concurrency.

Resource Type	Process	Thread	Description
Code segment, data segment	Independent	Shared	Store the code and data of the program
Heap	Independent	Shared	Dynamically allocated memory space
Stack	Independent	Independent	Stores local variables and function call information
Process/thread ID	4 bytes	4 bytes	Uniquely identifies a process or thread
File descriptor	Independent	Shared	Identifies open files and network connections
Kernel data structures	Independent	Partly shared (own task_struct and kernel stack; shared mm_struct and files_struct)	Store process/thread status, scheduling information, etc.
Register status	Independent	Independent	Stores CPU register status

From this table you can see:

Threads share code segments, data segments, and the heap with the other threads of the same process, which reduces memory usage.
Threads have their own independent stack and register state, but are still generally more lightweight than processes.
Threads share one file descriptor table, while each process has its own.


Conclusion

In our programming learning journey, understanding fundamentals such as these is an important step toward a higher level. However, mastering new skills and ideas always requires time and persistence. From a psychological point of view, learning is often accompanied by constant trial and error and adjustment, much like our brain gradually optimizing its "algorithm" for solving problems.

This is why, when we encounter mistakes, we should view them as opportunities to learn and improve rather than as something to dwell on. By understanding and solving these problems, we can not only fix the current code, but also improve our programming skills and avoid making the same mistakes in future projects.
