How to Analyze Linux Process Resources Using the psutil Module

For installation instructions and the official documentation of the psutil module, see: https://psutil.readthedocs.io/en/latest/

Meaning of each process statistic

pid: PID of the process

Returns the PID of the process.

cpu_num: CPU core running the current process

Returns the CPU core on which the process is currently running. The return value is less than or equal to the number of logical CPU cores.

cpu_percent: Process CPU utilization

Returns the CPU utilization of the process as a percentage. The value can exceed 100% if the process runs multiple threads on different CPUs.

  • When interval > 0.0, the call blocks for interval seconds and measures the CPU utilization over that period;
  • When interval == 0.0 or interval is None, the call returns immediately:
    • On the first call it returns a meaningless 0.0;
    • On later calls it returns the CPU utilization since the previous call (leave at least 0.1 seconds between calls).
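
The two calling modes described above can be sketched as follows. This is a minimal example that inspects the current Python process; the 0.2-second window and the busy loop are arbitrary choices made only for illustration.

```python
import time

import psutil

proc = psutil.Process()  # no argument: the current Python process

# Non-blocking mode: the first call only records a reference point.
first_call = proc.cpu_percent(interval=None)

# Burn some CPU for about 0.2 s so the next measurement has something to count.
deadline = time.perf_counter() + 0.2
while time.perf_counter() < deadline:
    pass

# Second non-blocking call: utilization since the previous call.
since_last_call = proc.cpu_percent(interval=None)

# Blocking mode: sleeps for `interval` seconds and measures that window.
blocking = proc.cpu_percent(interval=0.2)

print(f"first call:      {first_call}")
print(f"since last call: {since_last_call}")
print(f"blocking 0.2 s:  {blocking}")
```

The non-blocking form suits a monitoring loop that already sleeps between samples, because it adds no extra waiting of its own.
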
cpu_times: cumulative process CPU time

Returns the cumulative CPU time of the process in seconds, where:

  • user: CPU time spent in user mode;
  • system: CPU time spent in kernel mode;
  • children_user: user-mode CPU time of all child processes;
  • children_system: kernel-mode CPU time of all child processes;
  • iowait: time spent waiting for blocking I/O to complete (not included in the user and system times).
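
As a rough illustration of the fields above, the sketch below reads the CPU times of the current Python process. Some fields are platform dependent (iowait is Linux only), so they are read with getattr as a precaution; the variable names are only illustrative.

```python
import psutil

proc = psutil.Process()  # the current Python process
times = proc.cpu_times()

# user and system are available on every platform.
own_cpu_seconds = times.user + times.system

# children_* and iowait are read defensively in case a platform lacks them.
children_cpu_seconds = (
    getattr(times, "children_user", 0.0) + getattr(times, "children_system", 0.0)
)
iowait_seconds = getattr(times, "iowait", None)  # None where unavailable

print(f"own CPU time:      {own_cpu_seconds:.2f} s")
print(f"children CPU time: {children_cpu_seconds:.2f} s")
print(f"iowait:            {iowait_seconds}")
```
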
create_time: process creation time

Returns the process creation time as a Unix timestamp (seconds since the epoch). The return value is cached after the first call.
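
Because the value is a plain timestamp, it usually needs converting before display. A minimal sketch, assuming local time and an arbitrary output format:

```python
from datetime import datetime

import psutil

proc = psutil.Process()  # the current Python process

created_ts = proc.create_time()                  # float, seconds since the epoch
created_at = datetime.fromtimestamp(created_ts)  # converted to local time

print(created_at.strftime("%Y-%m-%d %H:%M:%S"))
```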

name: process name

Returns the process name.

ppid: PID of the parent process

Returns the PID of the parent process.

status: the status of the process

The possible process statuses are listed below. The explanations in the table were generated by ChatGPT:

| Process status | Status code | Explanation |
|---|---|---|
| STATUS_RUNNING | running | The process is running or preparing to run. |
| STATUS_SLEEPING | sleeping | The process is in a sleeping state, waiting for an event to occur. |
| STATUS_DISK_SLEEP | disk-sleep | The process is waiting for disk I/O operations to complete. |
| STATUS_STOPPED | stopped | The process has stopped, but has not been terminated. |
| STATUS_TRACING_STOP | tracing-stop | The process has been stopped for tracing and debugging purposes. |
| STATUS_ZOMBIE | zombie | The process has exited, but its parent process has not reclaimed its resources. |
| STATUS_DEAD | dead | The process has exited and its resources have been reclaimed. |
| STATUS_WAKE_KILL | wake-kill | The process is waiting to be killed. |
| STATUS_WAKING | waking | The process is waking from sleep. |
| STATUS_IDLE | idle | The process is idle. |
| STATUS_LOCKED | locked | The process is locked and cannot run or receive signals. |
| STATUS_WAITING | waiting | The process is waiting for an event to occur. |
| STATUS_SUSPENDED | suspended | The process is suspended and no longer accepts any input or output. |
| STATUS_PARKED | parked | The process temporarily sleeps and wakes up again after certain conditions are met. |
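
The status codes are plain strings, so they can be compared directly with the psutil.STATUS_* constants. The sketch below counts the processes on the system by status; the Counter-based tally is just one convenient way to do it.

```python
from collections import Counter

import psutil

# Ask process_iter to prefetch only the "status" field of each process.
status_counts = Counter(
    proc.info["status"] for proc in psutil.process_iter(["status"])
)

for status, count in status_counts.most_common():
    # str() guards against a None status when access to a process is denied.
    print(f"{str(status):<12} {count}")

# The constants are the same strings as the status codes in the table.
zombie_count = status_counts[psutil.STATUS_ZOMBIE]
print(f"zombies: {zombie_count}")
```
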
terminal: the terminal associated with the process

Returns the terminal associated with the current process.

num_ctx_switches: cumulative number of context switches

Returns the cumulative number of voluntary and involuntary context switches.

  • voluntary context switches: the process gives up the CPU because it needs a resource that is unavailable (for example, one already in use) and has to wait;
  • involuntary context switches: the process's time slice runs out, or it is preempted by a higher-priority thread.
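
Since the counters are cumulative, the useful figure is normally the difference between two readings. A minimal sketch on the current process, using short sleeps as an arbitrary way to give up the CPU between the readings:

```python
import time

import psutil

proc = psutil.Process()  # the current Python process
before = proc.num_ctx_switches()

# Sleeping hands the CPU back to the scheduler, which typically shows up
# as additional voluntary context switches.
for _ in range(5):
    time.sleep(0.01)

after = proc.num_ctx_switches()
voluntary_delta = after.voluntary - before.voluntary
involuntary_delta = after.involuntary - before.involuntary

print(f"voluntary:   +{voluntary_delta}")
print(f"involuntary: +{involuntary_delta}")
```
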
num_threads: the number of threads used by the current process

Returns the number of threads used by the current process.

username: owner of the process

Returns the username of the owner of the current process.

memory_full_info: memory usage information

Returns memory usage information. The meaning of each field is given in the indicator table at the end of this article.
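
A short sketch of reading a few of these fields for the current process follows. psutil reports the values in bytes; to_mib is a hypothetical helper introduced only for display, and pss is read with getattr because it is not available on every platform.

```python
import psutil

proc = psutil.Process()  # the current Python process
mem = proc.memory_full_info()


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (illustrative helper)."""
    return n_bytes / (1024 * 1024)


print(f"rss: {to_mib(mem.rss):.1f} MiB")
print(f"vms: {to_mib(mem.vms):.1f} MiB")
print(f"uss: {to_mib(mem.uss):.1f} MiB")

pss = getattr(mem, "pss", None)  # Linux only
if pss is not None:
    print(f"pss: {to_mib(pss):.1f} MiB")
```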

memory_maps: the mapped memory area of the process

Returns the mapped memory region of the process.

  • If grouped=True, mapped regions with the same path are grouped together and their memory fields are summed;
  • If grouped=False, each mapped region is reported as a separate entry.
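
The difference between the two modes can be seen with a small sketch on the current process. It assumes a platform where memory_maps is available (such as Linux); listing the five largest entries by rss is an arbitrary choice.

```python
import psutil

proc = psutil.Process()  # the current Python process

# grouped=True: one entry per path, with the memory fields summed.
grouped = proc.memory_maps(grouped=True)

# grouped=False: one entry per mapped region.
regions = proc.memory_maps(grouped=False)

print(f"{len(grouped)} grouped entries, {len(regions)} individual regions")

# Show the five paths that account for the most resident memory.
top = sorted(grouped, key=lambda m: m.rss, reverse=True)[:5]
for m in top:
    print(f"{m.rss / 1024:>10.0f} KiB  {m.path}")
```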

Utility functions

How to obtain a process instance
from typing import List

import psutil


def get_processes_by_condition(username: str, command: str) -> List[psutil.Process]:
    """Get all processes on the Linux system that are owned by username and whose
    start command contains command (sorted by PID in ascending order)

    Parameters
    ----------
    username : str
        Name of the process owner
    command : str
        Command keyword (a contiguous substring of the process start command)

    Returns
    -------
    proc_lst : List[psutil.Process]
        List of process instances
    """
    proc_lst: List[psutil.Process] = []
    # process_iter yields processes in PID order and prefetches the requested fields
    for proc in psutil.process_iter(["username", "cmdline"]):
        info = proc.info
        # cmdline is None when access to it is denied, so fall back to an empty list
        cmdline = " ".join(info["cmdline"] or [])
        if info["username"] == username and command in cmdline:
            proc_lst.append(proc)
    return proc_lst
Process resource monitoring class
class MonitorProcesses:
    """Process resource monitoring class"""

    def __init__(self, username: str,
                 command: str,
                 res_dir: str,
                 encoding: str = "UTF-8",
                 interval: float = 0.1):
        """
        Monitor all processes on the Linux system that are owned by username and whose start command contains command

        Parameters
        ----------
        username : str
            Name of the process owner
        command : str
            Command keyword (a contiguous substring of the process start command)
        res_dir : str
            The path to the result file
        encoding : str, default = "UTF-8"
            Encoding type
        interval : float, default = 0.1
        """
        self.username: str = username
        self.command: str = command
        self.res_dir: str = res_dir
        self.encoding: str = encoding
        self.interval: float = interval
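
The class above only stores its parameters. As a hedged sketch of how such parameters might be used, the hypothetical helper below (sample_process is not part of the original class or of psutil) writes a few CPU and memory samples of one process to a CSV file. The column names, the sample count and the temporary output directory are all assumptions made for illustration.

```python
import csv
import os
import tempfile
import time

import psutil


def sample_process(proc: psutil.Process, res_path: str, samples: int = 3,
                   interval: float = 0.1, encoding: str = "UTF-8") -> None:
    """Write a few CPU/memory samples of one process to a CSV file (hypothetical helper)."""
    with open(res_path, "w", newline="", encoding=encoding) as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "cpu_percent", "rss_bytes", "num_threads"])
        for _ in range(samples):
            # Blocks for `interval` seconds and measures CPU usage over that window.
            cpu = proc.cpu_percent(interval=interval)
            with proc.oneshot():
                writer.writerow(
                    [time.time(), cpu, proc.memory_info().rss, proc.num_threads()]
                )


# Usage: sample the current Python process into a temporary directory.
res_dir = tempfile.mkdtemp()
res_path = os.path.join(res_dir, "self.csv")
sample_process(psutil.Process(), res_path)

with open(res_path, encoding="UTF-8") as f:
    rows = list(csv.reader(f))
print(rows)
```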

psutil high-performance usage

Quickly obtaining eligible processes when they are a small proportion of all processes

Looking at the source code of psutil.process_iter: when it is called with a list of attributes, as in the utility function above, every call invokes the as_dict method of each process, which amounts to running each process through proc.oneshot() once.

When we only need to filter for the processes that meet a condition, calling as_dict on every process each time is wasteful.
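
For reference, the sketch below shows what oneshot() and as_dict look like when used on a single process (here the current Python process). The choice of fields is arbitrary.

```python
import psutil

proc = psutil.Process()  # the current Python process

# Inside oneshot(), psutil caches the underlying system reads, so fields that
# come from the same source are fetched once instead of once per method call.
with proc.oneshot():
    snapshot = {
        "pid": proc.pid,
        "name": proc.name(),
        "username": proc.username(),
        "status": proc.status(),
    }

# as_dict(attrs=[...]) collects the same kind of snapshot in a single call.
same_fields = proc.as_dict(attrs=["pid", "name", "username", "status"])

print(snapshot)
print(same_fields)
```

This is cheap for one process. The cost discussed above comes from repeating it for every process on the system at every polling step.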

from abc import ABC, abstractmethod
from typing import Dict, Generator

import psutil


class GetProcessesByCondition(ABC):
    """Get process instances that meet the conditions in the system"""

    def __init__(self):
        self._visited: Dict[int, psutil.Process] = {}  # every process seen so far
        self._cache: Dict[int, psutil.Process] = {}  # processes that meet the condition

    @abstractmethod
    def condition(self, proc: psutil.Process) -> bool:
        """Judge whether the proc process satisfies the condition"""

    def process_iter(self) -> Generator[psutil.Process, None, None]:
        """Iterate over all processes that meet the condition"""
        now_pid_set = set(psutil.pids())  # set of all current PIDs
        visited_pid_set = set(self._visited)  # set of all PIDs that have been visited

        def add(pid_):
            """Record the process as visited and cache it if it meets the condition"""
            proc_ = psutil.Process(pid_)
            if self.condition(proc_):
                self._cache[pid_] = proc_
            else:
                self._cache.pop(pid_, None)  # a reused PID may no longer qualify
            self._visited[pid_] = proc_

        def remove(pid_):
            """Remove process from cache"""
            self._visited.pop(pid_, None)
            self._cache.pop(pid_, None)

        # Check all new processes and select the ones that meet the condition
        for pid in now_pid_set - visited_pid_set:
            try:
                add(pid)
            except psutil.NoSuchProcess:
                pass  # the process exited between psutil.pids() and this check

        # Re-check the processes visited earlier: if one has exited, its PID may since
        # have been reused by a new process, which must be evaluated again
        for pid in visited_pid_set:
            try:
                proc = self._visited[pid]
                if not proc.is_running():
                    add(pid)  # the old process is gone; look up whatever owns this PID now
            except psutil.NoSuchProcess:
                remove(pid)  # no process owns this PID any more

        # Yield the processes that meet the condition, in ascending PID order
        for pid, proc in sorted(self._cache.items()):
            yield proc


class MyGetProcessesByCondition(GetProcessesByCondition):

    def __init__(self, username: str, command: str):
        """
        Parameters
        ----------
        username : str
            Name of the process owner
        command : str
            Command keyword (a contiguous substring of the process start command)
        """
        super().__init__()
        self.username = username
        self.command = command

    def condition(self, proc: psutil.Process) -> bool:
        try:
            return proc.username() == self.username and self.command in " ".join(proc.cmdline())
        except psutil.AccessDenied:
            return False  # the process cannot be inspected, so treat it as not matching
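
The cache above relies on Process.is_running() to notice that a previously seen process has gone away; psutil documents this check as reliable even when the PID has been reused by another process. The sketch below illustrates the call on a throwaway child process. The sleeping Python child is only a stand-in for a real workload.

```python
import subprocess
import sys

import psutil

# Start a short-lived child process (a Python interpreter that just sleeps).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
handle = psutil.Process(child.pid)

running_before = handle.is_running()

child.terminate()
child.wait()  # reap the child so it does not linger as a zombie

running_after = handle.is_running()

print(f"before terminate: {running_before}")
print(f"after terminate:  {running_after}")
```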
Fields returned by memory_full_info:

| Indicator | Description |
|---|---|
| rss (Resident Set Size) | The non-swapped physical memory used by the process (same as the RES column of the top command) |
| vms (Virtual Memory Size) | The virtual memory used by the process (same as the VIRT column of the top command) |
| shared | Memory that may be shared with other processes (same as the SHR column of the top command) |
| text | The memory occupied by executable code (same as the CODE column of the top command) |
| data | The physical memory used for everything other than executable code (same as the DATA column of the top command) |
| lib | Memory used by shared library files |
| dirty | Number of dirty pages, that is, pages that have not yet been synchronized to disk |
| uss (Unique Set Size) | Memory exclusive to the process, that is, the memory that would be freed if the process were terminated now |
| pss (Proportional Set Size) | Memory shared with other processes, divided equally among all the processes that share it. For example: if a process has 10MB exclusively and shares 10MB with another process, its PSS is 15MB. |
| swap | The amount of memory that has been swapped out to disk |
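
The PSS example in the table can be checked with a few lines of arithmetic. The helper below is purely illustrative (proportional_set_size is a hypothetical name, not a psutil function) and assumes each shared segment is split evenly among the processes that share it.

```python
from typing import List, Tuple


def proportional_set_size(private_mb: float,
                          shared: List[Tuple[float, int]]) -> float:
    """Private memory plus each shared segment divided by its number of sharers.

    `shared` is a list of (segment size in MB, number of processes sharing it).
    """
    return private_mb + sum(size_mb / sharers for size_mb, sharers in shared)


# 10 MB exclusive, plus 10 MB shared between 2 processes: 10 + 10 / 2
pss_mb = proportional_set_size(10, [(10, 2)])
print(pss_mb)  # 15.0
```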