[Multi-threading] Thread control: thread creation, thread exceptions, program replacement in multi-threaded processes; thread waiting, parameters and return values of thread entry functions; thread termination, thread ID, thread attribute structure, independent thread stacks, thread-local variables; thread detachment; pthread library functions

1. Thread creation

1.1 pthread_create function

The pthread_create() function creates a new thread. It belongs to the pthread library (the POSIX thread library).

The function prototype is as follows:

#include <pthread.h>

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);

Parameter Description:

  • thread: Pointer to a pthread_t, used to store the identifier of the newly created thread.
  • attr: Pointer to a pthread_attr_t, used to specify thread attributes; usually NULL is passed to use the default attributes.
  • start_routine: Pointer to the thread function. This function is the entry point of the thread, and the thread starts executing from it.
  • arg: Argument passed to the thread function; it can be a pointer of any type.

Function return value:

  • When the thread is successfully created, 0 is returned.
  • When thread creation fails, a non-zero error code is returned.
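
The return-value convention above can be sketched as follows. This is a minimal illustration, not part of the original test programs: the entry function noop and the helper create_checked are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <pthread.h>

// Hypothetical entry function: does nothing and exits immediately.
static void *noop(void *) { return nullptr; }

// Creates one thread and reports a failure using the returned error code.
// Returns whatever pthread_create returned (0 on success).
int create_checked(pthread_t *tid)
{
    int rc = pthread_create(tid, nullptr, noop, nullptr);
    if (rc != 0)
    {
        // The error number is the return value itself, so pass it to strerror.
        fprintf(stderr, "pthread_create failed: %s\n", strerror(rc));
    }
    return rc;
}
```

A caller would typically join the thread afterwards only if create_checked returned 0.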

Note that to call pthread library functions, you must link against the pthread library in the g++/gcc command:

g++ mythread.cc -o mythread -lpthread

Test program:

#include <cstdio>
#include <pthread.h>
#include <unistd.h>

void *ThreadRoutine(void *name)
{
    int cnt = 3;
    while (cnt--)
    {
        printf("[%d]: %s\n", getpid(), (char *)name);
        sleep(1);
    }
    cnt /= 0; // deliberate divide-by-zero error to make this thread crash
    return nullptr;
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, ThreadRoutine, (void *)"child thread 1");
    while (1)
    {
        printf("[%d]: main thread\n", getpid());
        sleep(1);
    }
    return 0;
}

operation result:

  1. The execution order of threads is determined by the scheduler.

  2. An exception occurs in the thread and the entire process crashes.

1.2 Thread exception

  • If a single thread crashes because of a fatal exception such as division by zero or a wild (invalid) pointer access, the whole process crashes with it.
  • A thread is an execution branch of the process. A fault in the thread is a fault in the process: it triggers the signal mechanism, and the default action of the signal terminates the process. When the process terminates, all of its threads exit immediately.
  • The signals covered earlier use the process as their basic carrier. All threads share the signals sent to the process and the way they are handled (the process-wide pending signal set and the table of signal handlers).
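
The shared handler table can be observed directly. In the sketch below (a hedged illustration; the names on_sigusr1, raiser, and handler_is_shared are hypothetical), the handler is installed by one thread and triggered by a different one.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Set by the handler; sig_atomic_t is safe to write from a signal handler.
static volatile sig_atomic_t g_got_signal = 0;

static void on_sigusr1(int) { g_got_signal = 1; }

// Child thread: raise() delivers the signal to the calling thread.
static void *raiser(void *)
{
    raise(SIGUSR1);
    return nullptr;
}

// Installs a handler in the calling thread, then lets a child thread trigger it.
// Returns 1 if the handler ran, 0 otherwise.
int handler_is_shared()
{
    struct sigaction sa = {};
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, nullptr);

    pthread_t tid;
    pthread_create(&tid, nullptr, raiser, nullptr);
    pthread_join(tid, nullptr);
    return g_got_signal;
}
```

Because the child never installed a handler itself, the flag can only be set if the disposition registered by the first thread also applies to the child.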

1.3 Program replacement in multi-threads

In a multi-threaded process, if any thread calls an exec function to replace the process program, the operating system loads the new executable into the address space of the current process and reinitializes the process's thread resources. All other threads and their thread-related resources (thread stacks, thread-local variables, thread contexts, and so on) are destroyed, and the new program starts running with a single thread.

Test program:

// mythread.cc
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void *ThreadRoutine(void *name)
{
    cout << "child thread running..." << endl;
    sleep(2);
    execl("./test", "test", nullptr); // Process program replacement; returns only on failure
    return nullptr;
}

int main()
{
    cout << "main thread running..." << endl;
    sleep(2);
    pthread_t tid;
    pthread_create(&tid, nullptr, ThreadRoutine, nullptr);
    while (1) sleep(1);
}

// test.cc
#include <iostream>
#include <unistd.h>
using namespace std;

int main()
{
    while (1)
    {
        cout << "hello world!" << endl;
        sleep(1);
    }
}

2. Thread waiting

After creating a child thread, the main thread should wait for it to exit and collect its exit result, after which the resources still held for the terminated thread (such as its PCB) can be reclaimed.

If the main thread never waits for a joinable child thread, those leftover resources are not released. The problem is similar to zombie processes and results in a resource leak.

2.1 pthread_join function

The pthread_join function waits for a thread to terminate.

Its prototype is as follows:

#include <pthread.h>
int pthread_join(pthread_t thread, void **status);

Parameters:

thread: The identifier of the thread to wait for. pthread_join blocks the calling thread until the specified thread terminates.

status: Pointer to the location where the thread's exit result is stored (a pointer to a pointer). Once the thread terminates, its exit result is stored at the location that status points to. If you do not care about the exit result, pass NULL. Because the thread function returns a void*, the output parameter has to be a void**.

Return value:

  • On success, 0 is returned.
  • On failure, a non-zero error code is returned.

Note: Unlike process waiting, thread waiting does not need to check whether the child thread terminated abnormally. If a thread hits a fatal exception, the entire process crashes, so the thread's abnormal-exit signal is the process's abnormal-exit signal.

Test program:

#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void *ThreadRoutine(void *name)
{
    int cnt = 3;
    while (cnt--)
    {
        printf("[%d]: %s\n", getpid(), (char *)name);
        sleep(1);
    }
    cout << "child thread quit!" << endl;
    // Returning from the thread function ends the thread
    return (void *)10; // The integer 10 cast to a pointer value; it is not a real address
}

int main()
{
    printf("[%d]: main thread\n", getpid());
    pthread_t tid;
    pthread_create(&tid, nullptr, ThreadRoutine, (void *)"child thread 1");
    void *ret;
    pthread_join(tid, &ret); // Block until the child thread exits
    cout << "main thread quit!" << endl;
    cout << "child thread return value: " << (long long)ret << endl;
    return 0;
}

operation result:

By calling pthread_join, the main thread blocks until the child thread exits. This gives thread exit a well-defined order and lets the main thread do its finishing work afterwards.
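
The same pattern extends to several child threads: join each one in turn and combine the exit results. The following is a minimal sketch under stated assumptions; the worker square and the helper sum_of_squares are hypothetical names, and small integers are carried inside the pointer values in the same way as the (void *)10 above.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

// Hypothetical worker: squares the small integer carried in the pointer argument.
static void *square(void *arg)
{
    intptr_t n = reinterpret_cast<intptr_t>(arg);
    return reinterpret_cast<void *>(n * n);
}

// Starts `count` workers (count must be at most 8), joins each one,
// and sums their exit results.
long sum_of_squares(int count)
{
    pthread_t tids[8];
    for (int i = 0; i < count; ++i)
    {
        void *arg = reinterpret_cast<void *>(static_cast<intptr_t>(i + 1));
        pthread_create(&tids[i], nullptr, square, arg);
    }

    long total = 0;
    for (int i = 0; i < count; ++i)
    {
        void *ret = nullptr;
        pthread_join(tids[i], &ret); // blocks until worker i has exited
        total += static_cast<long>(reinterpret_cast<intptr_t>(ret));
    }
    return total;
}
```

The workers may finish in any order; joining them in creation order simply means the main thread waits for the slowest one it has reached so far.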

2.2 pthread_tryjoin_np function (extension)

pthread_tryjoin_np is a nonstandard GNU extension of the pthread library (the _np suffix stands for "non-portable") that performs a non-blocking wait for a thread.

#include <pthread.h>
int pthread_tryjoin_np(pthread_t thread, void **retval);

The pthread_tryjoin_np function tries to join the thread without blocking:

  • If the thread has already ended, pthread_tryjoin_np returns 0 and the join succeeds;
  • If the thread has not ended yet, pthread_tryjoin_np returns EBUSY;
  • If another error occurs, pthread_tryjoin_np returns the corresponding error code.

Test code:

#include <cerrno>
#include <cstring>
#include <iostream>
#include <pthread.h>
#include <unistd.h>

void *threadFunction(void *arg)
{
    // Logic of the thread function
    sleep(5);
    return nullptr;
}

int main()
{
    pthread_t thread;
    // Create a thread that runs the thread function
    pthread_create(&thread, nullptr, threadFunction, nullptr);
    int res;
    // Poll with a non-blocking join until the thread has ended
    while ((res = pthread_tryjoin_np(thread, nullptr)) != 0)
    {
        if (res == EBUSY)
        {
            std::cout << "The thread has not ended yet" << std::endl;
        }
        else
        {
            std::cout << "An error occurred: " << strerror(res) << std::endl;
            return 1; // Any other error will not go away by retrying
        }
        sleep(1);
    }

    std::cout << "Thread ended successfully" << std::endl;

    return 0;
}

2.3 Parameters and return values of thread entry point functions

The parameters of the thread entry point function are passed in through the pthread_create function; the return value is received through pthread_join.

Test program:

#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void *ThreadRoutine(void *data)
{
    printf("[%d]: %s\n", getpid(), "child thread running!");
    // The child thread processes the heap data
    for (int i = 0; i < 10; ++i)
    {
        ((int *)data)[i] = i;
    }
    printf("[%d]: %s\n", getpid(), "child thread quit!");
    // Returning from the thread function ends the thread
    return (void *)data; // Return the pointer to the heap space
}

int main()
{
    printf("[%d]: main thread running!\n", getpid());
    // Create a batch of heap data
    int *data = new int[10]{0};
    cout << "before: ";
    for (int i = 0; i < 10; ++i)
    {
        cout << data[i] << " ";
    }
    cout << endl;

    // Pass the heap data to the child thread for processing
    pthread_t tid;
    pthread_create(&tid, nullptr, ThreadRoutine, (void *)data);
    int *ret;
    pthread_join(tid, (void **)&ret); // Block until the child thread exits

    // Print the processed heap data
    cout << "after: ";
    for (int i = 0; i < 10; ++i)
    {
        cout << data[i] << " ";
    }
    cout << endl;
    for (int i = 0; i < 10; ++i)
    {
        cout << ret[i] << " ";
    }
    cout << endl;

    printf("[%d]: main thread quit!\n", getpid());
    // data and ret point to the same heap space, so the two addresses are equal
    printf("[%d]: child thread return value: %p:%p\n", getpid(), ret, data);
    delete[] data; // ret refers to the same block, so free it only once
    return 0;
}

operation result:

The argument and return value of the thread entry point function can be not only an integer value cast to a pointer, but also the address of heap memory, because all threads share the address space of the process, including the heap. The main thread can therefore hand a batch of data to the child thread through the heap, and the child thread can hand a batch of data back to the main thread in the same way.
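
A common variation is to pack the inputs and the output into one heap-allocated structure. The sketch below is only an illustration; the Task structure and the functions add and add_in_thread are hypothetical names that do not appear in the original program.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical request/response record shared between the two threads.
struct Task
{
    int a;
    int b;
    int sum; // filled in by the child thread
};

static void *add(void *arg)
{
    Task *task = static_cast<Task *>(arg);
    task->sum = task->a + task->b;
    return task; // hand the same heap object back as the exit result
}

// Allocates a Task on the heap, lets a child thread fill it in, and returns the sum.
int add_in_thread(int a, int b)
{
    Task *task = new Task{a, b, 0};
    pthread_t tid;
    pthread_create(&tid, nullptr, add, task);

    void *ret = nullptr;
    pthread_join(tid, &ret);
    Task *done = static_cast<Task *>(ret); // same address as `task`
    int sum = done->sum;
    delete done;
    return sum;
}
```

Reading the result only after pthread_join has returned is what makes the access safe without any additional locking.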

3. Thread termination

The methods to terminate a thread are:

  • Method 1: Execute a return statement in the child thread's entry point function.
  • Method 2: Call pthread_exit anywhere in the child thread, in any function, to terminate the child thread directly.
  • Method 3: Have the main thread call pthread_cancel to terminate the child thread with the specified tid.

Note: Do not call the exit function in a child thread. Calling exit in any thread terminates the entire process.

3.1 pthread_exit

The pthread_exit function is a thread termination function that terminates the execution of the current thread and returns a specified exit result.

The function prototype is as follows:

void pthread_exit(void *retval);

Parameter Description:

  • retval: Specifies the exit result of the thread, which can be a pointer of any type.

Precautions:

  • When a thread calls pthread_exit, it terminates immediately, and retval becomes the exit result received by a thread that waits for it with pthread_join.

  • If the main thread calls pthread_exit, only the main thread terminates; the process keeps running until its remaining threads have exited. To end the whole process, return from main or call exit.
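
Because pthread_exit ends the calling thread from any depth of the call stack, the exit result can be set far away from the entry point function. A minimal sketch, assuming the hypothetical helpers finish_here, worker, and exit_code_of_worker:

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

// Hypothetical helper called from inside the thread: ends the calling thread with result 12.
static void finish_here()
{
    pthread_exit(reinterpret_cast<void *>(static_cast<intptr_t>(12)));
}

static void *worker(void *)
{
    finish_here();
    // Never reached: pthread_exit does not return to its caller.
    return reinterpret_cast<void *>(static_cast<intptr_t>(-1));
}

// Runs the worker and returns the exit result collected by pthread_join.
long exit_code_of_worker()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, worker, nullptr);
    void *ret = nullptr;
    pthread_join(tid, &ret);
    return static_cast<long>(reinterpret_cast<intptr_t>(ret));
}
```

The joining thread sees 12 rather than -1, which shows that the return statement after finish_here was never executed.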

3.2 pthread_cancel

The pthread_cancel function sends a cancellation request to the specified thread, asking it to terminate.

The function prototype is as follows:

int pthread_cancel(pthread_t thread);

Parameter Description:

  • thread: The identifier of the thread to be canceled.

Precautions:

  • Calling pthread_cancel only sends a cancellation request to the specified thread; it does not wait for the thread to end.
  • With the default settings, the target thread acts on the request when it reaches a cancellation point (sleep is one example). It then terminates with PTHREAD_CANCELED, which is (void *)-1 on Linux, as its exit result.
  • A child thread should not call pthread_cancel on the main thread, because the main thread is responsible for waiting for all child threads to exit, and cancelling the main thread may affect the entire process.
  • Avoid sending a cancellation request to the tid of the current thread. To end the current thread, return from its entry point function or call pthread_exit instead.
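
The exit result of a cancelled thread can be compared against PTHREAD_CANCELED directly instead of being printed as a number. A minimal sketch, assuming the hypothetical names spin and cancel_and_check:

```cpp
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Loops forever; sleep() is a cancellation point, so a pending request is acted on there.
static void *spin(void *)
{
    while (true)
    {
        sleep(1);
    }
    return nullptr; // not reached
}

// Cancels a child thread and reports whether its exit result is PTHREAD_CANCELED.
bool cancel_and_check()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, spin, nullptr);
    pthread_cancel(tid); // only sends the request
    void *ret = nullptr;
    pthread_join(tid, &ret); // waits until the thread has actually ended
    return ret == PTHREAD_CANCELED;
}
```

The join is still required: cancellation ends the thread, but a joinable thread keeps its resources until some other thread waits for it.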

Test program:

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void *ThreadRoutine(void *arg)
{
    while (true)
    {
        cout << "child thread running..." << endl;
        sleep(1);
    }
    pthread_exit((void *)12); // The child thread never gets here
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, ThreadRoutine, nullptr);
    sleep(3);
    pthread_cancel(tid); // Send a cancellation request to the child thread
    void *ret;
    pthread_join(tid, &ret);
    cout << "cancel child thread, tid: " << tid << " retval: " << (long long)ret << endl;
    sleep(2);
    return 0;
}

operation result:

3.3 Thread ID

3.3.1 Thread ID and LWP

Observe in the run results that the thread identifier (tid) obtained from pthread_create is not the LWP value we might have expected.

  • The LWP belongs to the domain of process scheduling. Because threads are lightweight processes and are the basic unit of operating system scheduling, the kernel needs a numeric value that uniquely identifies each thread.
  • The thread ID (tid) obtained from pthread_create is an address that points into the process's virtual memory, and it belongs to the domain of the pthread library. The library's subsequent operations on a thread are all based on this thread ID.

The pthread_self function is a pthread library function that returns the thread ID of the calling thread.

The function prototype is as follows:

pthread_t pthread_self(void);

The pthread_self function has no parameters and directly returns the thread ID of the current thread, which is a value of type pthread_t.

Thread ID is a unique identifier used to distinguish different threads. In a multi-threaded program, each thread has its own thread ID. Through the pthread_self function, you can obtain the thread ID of the current thread for thread identification and management.

The pthread_t type is an opaque data type; in the Linux pthread implementation its value is the address of a structure that the library keeps for the thread. It serves as the thread's identifier in the calls that create, operate on, and wait for threads.
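
The difference between the two identifiers can be made visible by asking the kernel for the LWP of the calling thread. The sketch below is Linux-specific and uses the raw gettid system call; the helper names current_lwp, report_lwp, and child_lwp are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Kernel-level ID (LWP) of the calling thread, obtained through the raw system call.
static pid_t current_lwp()
{
    return static_cast<pid_t>(syscall(SYS_gettid));
}

// Child thread: stores its own LWP into the pid_t that `out` points to.
static void *report_lwp(void *out)
{
    *static_cast<pid_t *>(out) = current_lwp();
    return nullptr;
}

// Returns the LWP of a freshly created child thread.
pid_t child_lwp()
{
    pid_t lwp = 0;
    pthread_t tid;
    pthread_create(&tid, nullptr, report_lwp, &lwp);
    pthread_join(tid, nullptr);
    return lwp;
}
```

Printing pthread_self() next to current_lwp() in the same thread shows two unrelated values: a large address-like number from the library and a small integer from the kernel.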

3.3.2 Thread attribute structure

When creating and running threads, we call pthread library functions, which are not system calls provided directly by the Linux system.

The Linux kernel does not provide dedicated thread interfaces or a specialized thread structure. Instead, it uniformly provides lightweight process interfaces and kernel data structures, and is only responsible for scheduling and executing lightweight processes. In other words, from the perspective of the kernel, there is no structural difference between processes and threads.

However, each thread still needs its own attributes and resources, such as the thread ID, the thread exit result, and the stack. These cannot be represented in the kernel data structures. Therefore, in addition to providing thread operations, the pthread library defines a thread attribute structure for each thread as supplementary data for the kernel's lightweight process. The thread ID (of type pthread_t) is actually a pointer to this structure.

When pthread_create is called to create a thread, the system creates the kernel data structures of the lightweight process, such as task_struct. At the same time, the pthread library creates a thread attribute structure in the virtual memory of the dynamic library (the shared area) to hold the thread's attributes and resources, including the thread's independent stack and its thread-local storage.

3.3.3 Independent stack structure of threads

How does the pthread library give each thread its own stack? Under the hood, pthread_create is built on the clone system call, which creates a new process or thread. Its function prototype is as follows:

int clone(int (*fn)(void *), void *child_stack, int flags, void *arg, ...);

The parameter fn points to the function that the new process or thread will execute; the parameter child_stack points to the stack space of the new process or thread. pthread_create passes the address of the stack it prepared in the shared area to clone as the child_stack argument, and the operating system uses it as the stack of the new thread. When the new thread is scheduled, it runs on that stack.

Note: The main thread uses the stack area of the process address space, which the kernel sets up when the program starts, while each child thread uses an independent stack that is located in the shared area of the address space and managed by the pthread library. Each thread therefore makes its function calls on its own stack without affecting the others.
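
The stack that the library recorded for a thread can be queried at run time. The following sketch relies on pthread_getattr_np, a GNU extension, and on a requested stack size of 1 MiB chosen only for illustration; the function names local_on_own_stack and child_uses_library_stack are hypothetical.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // needed for pthread_getattr_np; g++ usually defines it already
#endif
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <pthread.h>

// Checks whether a local variable of the calling thread lies inside the stack
// range that the pthread library recorded for this thread.
static void *local_on_own_stack(void *)
{
    int local = 0;

    pthread_attr_t attr;
    void *base = nullptr; // lowest address of the stack
    size_t size = 0;
    pthread_getattr_np(pthread_self(), &attr);
    pthread_attr_getstack(&attr, &base, &size);
    pthread_attr_destroy(&attr);

    uintptr_t lo = reinterpret_cast<uintptr_t>(base);
    uintptr_t p = reinterpret_cast<uintptr_t>(&local);
    bool inside = (p >= lo) && (p < lo + size);
    return reinterpret_cast<void *>(static_cast<uintptr_t>(inside));
}

// Creates a thread whose stack size is requested through the attribute object
// and returns true if the thread found its local variable on that stack.
bool child_uses_library_stack()
{
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 1024 * 1024);

    pthread_t tid;
    pthread_create(&tid, &attr, local_on_own_stack, nullptr);
    pthread_attr_destroy(&attr);

    void *ret = nullptr;
    pthread_join(tid, &ret);
    return ret != nullptr;
}
```

Passing an attribute object instead of NULL is the documented way to influence the stack that pthread_create sets up for the new thread.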

3.3.4 Local storage of threads

  • Global variables of a process are shared by all of its threads.
  • Declaring a global variable with __thread gives every thread its own independent copy of that variable. This is thread-local storage.
  • Thread-local storage:
    • Scope: global within each thread;
    • Storage location: shared area of the address space -> pthread dynamic library -> thread attribute structure;
    • Access: each thread owns its own copy.
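
The per-thread copies can also be checked programmatically. This sketch uses the standard C++11 keyword thread_local in place of the compiler-specific __thread (a substitution made for this example); the names t_counter, bump, and child_counter_after_bumps are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>

// One independent copy of this counter exists per thread.
static thread_local int t_counter = 0;

static void *bump(void *out)
{
    for (int i = 0; i < 5; ++i)
    {
        ++t_counter; // touches only the child's copy
    }
    *static_cast<int *>(out) = t_counter; // report the child's final value
    return nullptr;
}

// Returns the child's counter; the caller's own copy of t_counter is not modified.
int child_counter_after_bumps()
{
    int child_value = -1;
    pthread_t tid;
    pthread_create(&tid, nullptr, bump, &child_value);
    pthread_join(tid, nullptr);
    return child_value;
}
```

With an ordinary global variable in place of the thread_local one, the calling thread would observe the child's increments as well.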

Test program:

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

//int g_val = 0;
__thread int g_val = 0;

void *ThreadRoutine(void *name)
{
    while (true)
    {
        sleep(1);
        cout << (char *)name << pthread_self() << " g_val: " << g_val << " &g_val: " << &g_val << endl;
        ++g_val;
    }
    return nullptr; // not reached
}

int main()
{
    pthread_t tid;
    pthread_create(&tid, nullptr, ThreadRoutine, (void *)"child thread 1");
    while (true)
    {
        cout << "main thread: " << pthread_self() << " g_val: " << g_val << " &g_val: " << &g_val << endl;
        sleep(1);
    }
}

4. Thread detachment

  • By default, newly created threads are joinable. After a joinable thread exits, another thread must call pthread_join on it; otherwise its resources are not released, which causes a resource leak.
  • If you do not care about the return value of the thread, joining is a burden. In that case we can tell the system to release the thread's resources automatically when the thread exits.

The pthread_detach function marks a thread as detached. A detached thread releases its resources automatically when it finishes, without any other thread having to call pthread_join to wait for it.

The function prototype is as follows:

int pthread_detach(pthread_t thread);

parameter:

thread: It is a thread identifier of type pthread_t, used to specify the thread to be separated.

Return value:

  • On success (the thread has been marked as detached), 0 is returned.
  • On failure, a non-zero error code is returned.

Note that detaching and joining are mutually exclusive for a given thread. pthread_detach should be called only on a thread that no other thread has joined or is currently waiting for in pthread_join, and once a thread has been detached it can no longer be joined.

An exception in the detached thread will still affect the entire process and cause the entire process to crash.
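
A thread can also be detached from the moment it is created by setting the detach state in the attribute object passed to pthread_create, which avoids any window in which it is still joinable. A minimal sketch, assuming the hypothetical names background, start_detached, and wait_until_done, with an atomic flag standing in for the exit result that can no longer be collected:

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Completion flag: a detached thread cannot be joined, so it reports through shared state.
static std::atomic<bool> g_done{false};

static void *background(void *)
{
    g_done.store(true);
    return nullptr; // nobody joins; the resources are released automatically
}

// Starts a thread that is detached from the moment it is created.
// Returns 0 on success, otherwise the pthread error code.
int start_detached()
{
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    pthread_t tid;
    int rc = pthread_create(&tid, &attr, background, nullptr);
    pthread_attr_destroy(&attr);
    return rc;
}

// Polls until the detached thread has reported completion.
void wait_until_done()
{
    while (!g_done.load())
    {
        usleep(1000);
    }
}
```

Polling is used here only to keep the sketch short; a condition variable or semaphore would be the usual choice when the creator really needs to know that the work has finished.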

Test program:

#include <cstring>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void *ThreadRoutine(void *arg)
{
    pthread_detach(pthread_self()); // The thread detaches itself
    cout << "child thread running..." << endl;
    sleep(2);
    pthread_exit((void *)12);
}

int main()
{
    cout << "main thread running..." << endl;
    pthread_t tid;
    pthread_create(&tid, nullptr, ThreadRoutine, nullptr);
    sleep(1); // Give the child thread time to detach itself first
    int ret = pthread_join(tid, nullptr);
    cout << "ret = " << ret << ", strerror: " << strerror(ret) << endl;
}

operation result:

The wait fails and reports the error "Invalid argument" for the tid: you cannot call pthread_join to wait for a thread that has been detached.

5. Language-level multi-threading interface

  • On the Linux platform, the language-level multi-threading interfaces of C++, Java, Python, and others are ultimately built on top of the pthread library. This is why a C++ program that uses threads may need the pthread library specified when it is compiled and linked with g++.
  • There are two reasons for providing a language-level multi-threading interface: 1. to simplify the operations and make threads easier to use; 2. to make the language portable across platforms.
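
For comparison, the create-then-join pattern used throughout this article looks as follows with the C++ standard library. This is a minimal sketch; the function name doubled_in_thread is hypothetical.

```cpp
#include <cassert>
#include <thread>

// Runs a lambda on a std::thread and waits for it: the portable counterpart
// of pthread_create followed by pthread_join.
int doubled_in_thread(int x)
{
    int result = 0;
    std::thread worker([&result, x] { result = x * 2; });
    worker.join(); // after join, the write to `result` is visible here
    return result;
}
```

No void* casts are needed: arguments and results travel through ordinary typed captures, and the same source compiles on platforms that do not provide pthread at all.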