C language programming – pthread thread operation

Table of Contents

  • User threads
  • Multi-thread switching
  • pthread thread library
  • Creation and destruction of threads
    • pthread_create()
    • pthread_join()
    • pthread_exit()
    • pthread_detach()
  • Multi-thread safety and multi-thread synchronization
    • Mutex
      • pthread_mutex_init()
      • pthread_mutex_lock()
      • pthread_mutex_unlock()
    • Condition Variables
      • pthread_cond_init()
      • pthread_cond_wait()
      • pthread_cond_signal()
      • pthread_cond_broadcast()
    • Use mutexes and condition variables
  • Thread-unsafe standard library functions

User threads

A user thread is a thread that the user process itself creates, schedules, and destroys. It runs in user space; the kernel neither sees nor manages it.

A user process can have multiple user threads and always has at least one, the main thread (initial thread). The main thread is the first thread of the process (on Linux its TID equals the PID of the process). It is essentially a user thread that executes the program code of the process itself, starting from main().

When a developer creates multiple threads through the pthread library, these threads share the resources of the same user process. Each thread owns only what it needs to execute on a CPU: a program counter (PC), a set of registers, and a stack. This makes threads lightweight and well suited to highly concurrent workloads.

Multi-thread switching

Since a user thread is not visible to the kernel, it does not take part in CPU scheduling directly. User threads still switch among themselves, but only on the CPU time that the kernel has given to their user process.

Switching between user threads is decided entirely by the user process, that is, by code logic the developer implements. This is usually called "cooperative scheduling": the user threads agree among themselves when to give up the CPU. As a result, a switch between user threads involves no transition between user mode and kernel mode. Note that this describes pure user-level threading; common pthread implementations such as NPTL on Linux map each thread to a kernel-scheduled thread, so those threads are preempted by the kernel rather than scheduled cooperatively.
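The cooperative hand-over described above can be sketched with the ucontext API (getcontext, makecontext, swapcontext) that glibc provides. This is only an illustration of user-level switching in general, not of how the pthread library is implemented; the names task and run_cooperative_demo, the trace buffer, and the stack size are assumptions made for this example.

```c
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t sched_ctx, task_ctx[2];
static char stacks[2][STACK_SIZE];
static char trace[16];
static int trace_len;

/* Each task records a step, then voluntarily yields to the scheduler. */
static void task(int id)
{
    for (int step = 0; step < 2; step++) {
        trace[trace_len++] = (char)('A' + id);
        swapcontext(&task_ctx[id], &sched_ctx);   /* cooperative yield */
    }
}

/* Round-robin over two user-level tasks; returns the recorded order. */
const char *run_cooperative_demo(void)
{
    trace_len = 0;
    for (int i = 0; i < 2; i++) {
        getcontext(&task_ctx[i]);
        task_ctx[i].uc_stack.ss_sp = stacks[i];
        task_ctx[i].uc_stack.ss_size = STACK_SIZE;
        task_ctx[i].uc_link = &sched_ctx;   /* resume here when the task returns */
        makecontext(&task_ctx[i], (void (*)(void))task, 1, i);
    }
    /* Two rounds run the task steps; the third lets each task return. */
    for (int round = 0; round < 3; round++)
        for (int i = 0; i < 2; i++)
            swapcontext(&sched_ctx, &task_ctx[i]);
    trace[trace_len] = '\0';
    return trace;
}
```

Calling run_cooperative_demo() from main() returns the order in which the two tasks ran. Every switch happens at an explicit swapcontext() call, so no task is ever interrupted involuntarily.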

pthread thread library

The pthread library uses a TCB (Thread Control Block) to store all the information about a thread, and a TCB is much smaller than a PCB. The exact layout depends on the implementation; a simplified pthread structure looks as follows:

//pthread/pthread_impl.h

struct pthread {
    struct pthread *self; // pointer to itself
    struct __pthread_internal_list *thread_list; // thread list, pointer to thread list, used to implement thread pool;
    void *(*start_routine)(void*); // Thread entry function, passed in by pthread_create() function;
    void *arg; // The entry function parameter of the thread, which is passed in by the pthread_create() function;
    void *result; // The return value of the thread, returned by the entry function of the thread;
    pthread_attr_t *attr; // Thread attributes, including stack protection area size, scheduling strategy, etc., are passed in by the pthread_create() function;
    pid_t tid; // The unique identifier of the thread, allocated by Kernel;
    struct timespec *waiters; // wait time stamp
    size_t guardsize; // stack guard size
    int sched_policy; // scheduling policy
    struct sched_param sched_params; // scheduling parameters
    void *specific_1stblock; // the first block of thread-private data
    struct __pthread_internal_slist __cleanup_stack; // clean up the function stack
    struct __pthread_mutex_s *mutex_list; // list of mutexes held by threads
    struct __pthread_cond_s *cond_list; // list of condition variables for threads to wait for
    unsigned int detach_state:2; // Thread detach state, including detached and undetached;
    unsigned int sched_priority:30; // thread scheduling priority
    unsigned int errno_val; // thread error code
};

The pthread library provides a series of interfaces around struct pthread to complete User Thread management.

Creation and destruction of threads

pthread_create()

Function role: used to create a new thread, and specify the entry function and parameters of the thread.

Function prototype:

  • thread parameter: a pointer to a pthread_t in which the ID of the new thread is stored. pthread_t is an opaque type; the ID is assigned when the thread is created and is not chosen by the caller.
  • attr parameter: a pointer to a pthread_attr_t that specifies the attributes of the thread, such as the size of the stack guard area, the scheduling policy, and the priority. Pass NULL to use the defaults.
  • start_routine parameter: the thread entry function, passed as a function pointer (the function name can be used directly). It must have the signature void *fn(void *); its return value is what pthread_join() hands back to the joining thread.
  • arg parameter: the single void * argument that is passed to the thread entry function.
  • Function return value:
    • Success: returns 0;
    • Failure: returns an error number (for example EAGAIN). It does not return -1 and does not set errno.
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine) (void *), void *arg);

After the main/parent thread calls pthread_create(), the pthread library allocates a TCB and a stack for the child thread and sets up its initial program counter and registers. Once the TCB is initialized, the thread becomes runnable. When it is dispatched to a CPU, the thread entry function starts executing.
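A minimal sketch of the call described above, including the error check. The names worker and start_and_wait are hypothetical and chosen only for this example.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Thread entry function: doubles the int that arg points to. */
static void *worker(void *arg)
{
    int *value = arg;
    *value *= 2;
    return NULL;
}

/* Starts one thread, waits for it, and returns the doubled input. */
int start_and_wait(int input)
{
    pthread_t tid;
    int value = input;

    int err = pthread_create(&tid, NULL, worker, &value);
    if (err != 0) {
        /* pthread functions return the error number instead of setting errno. */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return -1;
    }
    pthread_join(tid, NULL);   /* value stays alive until the thread is done */
    return value;
}
```

Passing the address of a local variable as arg is safe here only because start_and_wait() joins the thread before returning. On most systems the program has to be built with the -pthread compiler option.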

pthread_join()

Function role: pthread_join() waits for the specified child thread to exit and retrieves the return value of its thread entry function. The parent thread uses it to obtain the result of the child thread and is blocked until the child has terminated.

Function prototype:

  • thread parameter: the ID of the thread to wait for.
  • retval: a pointer to a void * that receives the value the child thread returned from its entry function or passed to pthread_exit(). Pass NULL if the value is not needed.
int pthread_join(pthread_t thread, void **retval);

After the parent thread calls pthread_join(), it sleeps until the child thread terminates. A thread can be joined only once, and only if it has not been detached.
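As a sketch of how a result travels from the child to the parent, the following example returns a heap-allocated value from the entry function and picks it up through retval. The function names square and square_in_thread are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Entry function: returns a heap-allocated result, because the
 * thread's own stack is gone once the thread has exited. */
static void *square(void *arg)
{
    int n = *(int *)arg;
    int *result = malloc(sizeof *result);
    if (result != NULL)
        *result = n * n;
    return result;
}

/* Runs square() in a child thread and collects the result with pthread_join(). */
int square_in_thread(int n)
{
    pthread_t tid;
    void *retval = NULL;

    if (pthread_create(&tid, NULL, square, &n) != 0)
        return -1;
    if (pthread_join(tid, &retval) != 0 || retval == NULL)
        return -1;

    int answer = *(int *)retval;
    free(retval);   /* the joiner owns the result */
    return answer;
}
```

The joining thread takes ownership of the returned pointer, so it is also the one that frees it.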

pthread_exit()

Function role: terminates the calling thread immediately and makes an exit value available to a thread that joins it.

Function prototype:

  • retval: a pointer that is handed to the joining thread as the exit value. It must not point to a variable on the exiting thread's stack. Set it to NULL if no return value is required.
void pthread_exit(void *retval);

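When a thread calls pthread_exit(), its cleanup handlers and the destructors of its thread-specific data are run. For a joinable thread, the remaining resources are released only once another thread joins it; for a detached thread they are released automatically. Note that if the main thread calls pthread_exit(), only the main thread terminates and the other threads keep running; the process ends when its last thread exits. Returning from main() or calling exit(), by contrast, terminates the whole process together with all of its threads, so extra care is required.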
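The following sketch shows pthread_exit() ending a thread from inside a nested function call, with the exit value collected by pthread_join(). The helper names finish_early and exit_code_of_worker are hypothetical, and passing a small integer through the pointer via intptr_t is a convenience chosen for this example.

```c
#include <pthread.h>
#include <stdint.h>

/* Ends the calling thread from a nested call; this function never returns. */
static void finish_early(intptr_t code)
{
    pthread_exit((void *)code);
}

static void *worker(void *arg)
{
    (void)arg;
    finish_early(7);
    return (void *)(intptr_t)-1;   /* not reached */
}

/* Returns the exit value that the worker passed to pthread_exit(). */
int exit_code_of_worker(void)
{
    pthread_t tid;
    void *retval = NULL;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(tid, &retval) != 0)
        return -1;
    return (int)(intptr_t)retval;
}
```

Only the worker thread ends when finish_early() runs; the thread that called exit_code_of_worker() continues normally after the join.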

pthread_detach()

Function: marks the thread as detached. When a detached thread exits, the system reclaims its resources automatically, without pthread_join() having to be called. A detached thread can no longer be joined.

Function prototype:

  • thread parameter: Specifies the TID.
int pthread_detach(pthread_t thread);
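A small sketch of a detached thread follows. Since a detached thread cannot be joined, the example uses a POSIX semaphore (sem_t from semaphore.h, which this article does not otherwise cover) to learn that the thread has run; the names background and run_detached are hypothetical.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t done;

/* Runs detached: nobody joins this thread, its resources are
 * reclaimed automatically when it returns. */
static void *background(void *arg)
{
    (void)arg;
    sem_post(&done);   /* tell the creator that the work is finished */
    return NULL;
}

/* Returns 0 if the thread was created, detached, and has done its work. */
int run_detached(void)
{
    pthread_t tid;

    if (sem_init(&done, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, background, NULL) != 0) {
        sem_destroy(&done);
        return -1;
    }
    int err = pthread_detach(tid);   /* no pthread_join() after this point */
    sem_wait(&done);                 /* wait for the signal, not for the thread */
    sem_destroy(&done);
    return err;
}
```

A thread can also detach itself with pthread_detach(pthread_self()), or be created detached through the attr parameter of pthread_create().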

Multi-thread safety and multi-thread synchronization

Thread safety means that, in a multi-threaded environment, no data inconsistency arises when multiple threads access the same shared data (shared resources, e.g. memory, global variables, static variables) at the same time and at least one of them writes. Purely concurrent reads do not raise thread safety issues.

To guarantee thread safety, data consistency must be ensured: threads have to synchronize with each other whenever they access shared data.

The pthread library provides the following mechanisms for this:

  1. Mutex: a thread safety mechanism that attaches a lock to the shared data. Only the thread that holds the lock may access the data, which protects it from simultaneous access by multiple threads.
  2. Condition variable: a thread synchronization mechanism that lets a thread sleep until another thread signals that a particular condition may have become true. It is always used together with a mutex, so a thread does not hold the lock while it waits for the condition.

Note that this synchronization adds some run-time overhead.

Mutex

pthread_mutex_init()

Function: used to initialize a mutex entity.

Function prototype:

  • mutex parameter: pthread_mutex_t type pointer, used to specify the mutex to be initialized.
  • attr parameter: pthread_mutexattr_t type pointer, used to specify the attributes of the mutex, such as: recursive lock, non-recursive lock, etc., usually NULL.
int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);

pthread_mutex_lock()

Function: acquires a mutex. If the mutex is already held by another thread, the calling thread blocks until the mutex is released.

Function prototype:

  • mutex parameter: pthread_mutex_t type pointer, used to specify the mutex to be acquired.
int pthread_mutex_lock(pthread_mutex_t *mutex);

pthread_mutex_unlock()

Function: releases the mutex so that it becomes available again. If the calling thread does not hold the mutex, this function may produce undefined behavior.

Function prototype:

  • mutex parameter: pthread_mutex_t type pointer, used to specify the mutex to be released.
int pthread_mutex_unlock(pthread_mutex_t *mutex);
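The three mutex functions above can be combined as in the following sketch, where several threads increment one shared counter. The names add_many and count_with_mutex and the constants THREADS and INCREMENTS are assumptions made for this example.

```c
#include <pthread.h>

#define THREADS    4
#define INCREMENTS 100000

static pthread_mutex_t counter_lock;
static long counter;   /* shared data, protected by counter_lock */

static void *add_many(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);     /* enter the critical section */
        counter++;
        pthread_mutex_unlock(&counter_lock);   /* leave the critical section */
    }
    return NULL;
}

/* Returns the final counter value, or -1 if the mutex cannot be initialized. */
long count_with_mutex(void)
{
    pthread_t tids[THREADS];
    int started = 0;

    counter = 0;
    if (pthread_mutex_init(&counter_lock, NULL) != 0)
        return -1;
    while (started < THREADS &&
           pthread_create(&tids[started], NULL, add_many, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&counter_lock);
    return counter;
}
```

Without the lock and unlock calls, the increments of different threads could interleave and some updates would be lost, so the final value would usually be smaller than expected.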

Condition Variable

Both mutexes and condition variables are tools for multi-thread synchronization, but they serve different purposes:

  • Mutual exclusion: A mutex protects access to a shared resource and prevents multiple threads from modifying it at the same time. It cannot, however, tell other threads when the resource is in a state they can use, which may lead to deadlock.

For example, a global variable n (shared data) is accessed by multiple threads. Thread TA acquires the lock, accesses n in the critical section, and releases the lock only once n > 0. If n == 0, no other thread can take the lock to change n, so TA never releases the lock and the program deadlocks.

The way out of this deadlock is to attach a condition: TA may proceed only when n > 0. This condition is the information that has to be synchronized between the threads. That is, in a multi-threaded environment, when a thread must wait for a condition to become true before it can continue, it should wait on a condition variable, which releases the lock during the wait.

  • Synchronization: A pthread condition variable provides a thread synchronization mechanism. When a specific event occurs, it can wake up one or more threads that are waiting for the event, which lets threads coordinate with each other. Condition variables are always used together with a mutex to avoid race conditions and deadlocks.

pthread_cond_init()

Function role: used to initialize a condition variable entity.

Function prototype:

  • cond parameter: pthread_cond_t type pointer, used to specify the condition variable to be initialized.
  • attr parameter: pthread_condattr_t type pointer, used to specify the attribute of the condition variable, usually NULL.
int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr);

pthread_cond_wait()

Function: blocks the calling thread on a condition variable until it is signaled. The caller must hold the mutex when it makes the call.

  1. When thread T1 calls pthread_cond_wait(), the function atomically releases the mutex and blocks T1.
  2. T1 stays blocked until another thread T2 calls pthread_cond_signal() or pthread_cond_broadcast() on the condition variable.
  3. pthread_cond_wait() then reacquires the mutex in T1 and returns. Because spurious wakeups are possible, T1 should re-check its condition in a loop.

Function prototype:

  • cond parameter: pthread_cond_t type pointer, used to specify the condition variable to wait for.
  • mutex parameter: pthread_mutex_t type pointer, used to specify the mutex to be associated. During the wait, the thread will release the mutex.
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
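The usual calling pattern is to wait inside a while loop that re-checks the condition. The sketch below shows it with one waiting thread and one signaling thread; the variable ready, the names setter and wait_for_payload, and the payload value are assumptions made for this example.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER;
static int ready;     /* the condition, protected by lock */
static int payload;   /* data published together with the condition */

static void *setter(void *arg)
{
    pthread_mutex_lock(&lock);
    payload = *(int *)arg;
    ready = 1;                          /* change the condition under the lock */
    pthread_cond_signal(&ready_cond);   /* then wake up the waiting thread */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Waits until the setter thread has published value, then returns it. */
int wait_for_payload(int value)
{
    pthread_t tid;
    int result;

    ready = 0;   /* no other thread exists yet */
    if (pthread_create(&tid, NULL, setter, &value) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    while (!ready)                               /* loop guards against spurious wakeups */
        pthread_cond_wait(&ready_cond, &lock);   /* releases lock while blocked */
    result = payload;
    pthread_mutex_unlock(&lock);

    pthread_join(tid, NULL);
    return result;
}
```

The loop also covers the case where the setter runs first: ready is then already 1, and the waiting thread never blocks.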

pthread_cond_signal()

Function: wakes up at least one of the threads that are waiting on the condition variable. If no thread is waiting, the call has no effect.

Function prototype:

  • cond parameter: pthread_cond_t type pointer, used to specify the condition variable to send the signal.
int pthread_cond_signal(pthread_cond_t *cond);

pthread_cond_broadcast()

Function: It is used to send a signal to all threads waiting for the condition variable, and wake up all waiting threads.

Function prototype:

  • cond parameter: pthread_cond_t type pointer, used to specify the condition variable to send the signal.
int pthread_cond_broadcast(pthread_cond_t *cond);
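As a sketch of the difference to pthread_cond_signal(), the following example releases several waiting threads with a single broadcast. The names waiter, release_all_waiters, started, and woken, as well as the limit MAX_WAITERS, are assumptions made for this example.

```c
#include <pthread.h>

#define MAX_WAITERS 8

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t start_cond = PTHREAD_COND_INITIALIZER;
static int started;   /* the condition all waiters are waiting for */
static int woken;     /* number of waiters that got past the wait */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!started)
        pthread_cond_wait(&start_cond, &lock);
    woken++;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts up to n waiters, releases them with one broadcast,
 * and returns how many of them were released. */
int release_all_waiters(int n)
{
    pthread_t tids[MAX_WAITERS];
    int created = 0;

    if (n > MAX_WAITERS)
        n = MAX_WAITERS;
    started = 0;
    woken = 0;
    while (created < n &&
           pthread_create(&tids[created], NULL, waiter, NULL) == 0)
        created++;

    pthread_mutex_lock(&lock);
    started = 1;
    pthread_cond_broadcast(&start_cond);   /* wake every waiting thread */
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);
    return woken;
}
```

With pthread_cond_signal() in place of the broadcast, only one blocked waiter would be guaranteed to wake up, and the joins could hang on the others.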

Use mutexes and condition variables

When a thread needs a condition to hold before it can access shared data, it first locks the mutex and then checks the condition. If the condition does not hold, the thread calls pthread_cond_wait() and sleeps until it is signaled.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define ITEMS 10 // number of items produced and consumed

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int data = 0; // shared data, protected by mutex

void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        pthread_mutex_lock(&mutex); // lock
        data++; // modify shared data
        pthread_cond_signal(&cond); // wake up the consumer
        pthread_mutex_unlock(&mutex); // unlock
        sleep(1);
    }
    pthread_exit(NULL);
}

void *consumer(void *arg)
{
    (void)arg;
    // consume exactly ITEMS items so that the thread terminates and can be joined
    for (int i = 0; i < ITEMS; i++) {
        pthread_mutex_lock(&mutex); // lock
        while (data == 0) { // re-check the condition after every wakeup
            pthread_cond_wait(&cond, &mutex); // releases mutex while waiting
        }
        printf("data = %d\n", data); // print shared data
        data--; // modify shared data
        pthread_mutex_unlock(&mutex); // unlock
        sleep(1);
    }
    pthread_exit(NULL);
}

int main(void)
{
    pthread_t tid1, tid2;

    pthread_create(&tid1, NULL, producer, NULL);
    pthread_create(&tid2, NULL, consumer, NULL);

    pthread_join(tid1, NULL);
    pthread_join(tid2, NULL);

    return 0;
}

Thread-unsafe standard library functions

Most of the standard library functions provided by the C language are thread-safe, but several commonly used ones are not. They are often called non-reentrant functions, because they keep their state in global or static variables.

Global and static variables live in the global and static storage areas of memory, which are shared by all threads of a process. If these data are read and written concurrently without locking, the result can be corrupted data or a crash such as a segmentation fault (core dump).

  • Summary of non-reentrant functions: