Embedded Linux application development – driver collection – synchronization and mutual exclusion ④ implementation of spin lock/semaphore/mutex

Embedded Linux application development-driver collection-synchronization and mutual exclusion ④ implementation of spin lock/semaphore/mutex

  • Chapter 1 Synchronization and Mutual Exclusion④
    • 1.5 Implementation of spinlock spinlock
      • 1.5.1 Spin lock kernel structure
      • 1.5.2 Implementation of spinlock in UP system
      • 1.5.3 Implementation of spinlock in SMP system
    • 1.6 Implementation of semaphore
      • 1.6.1 Kernel structure of semaphore
      • 1.6.2 Implementation of down function
      • 1.6.3 Implementation of up function
    • 1.7 Implementation of mutex mutex
      • 1.7.1 Kernel structure of mutex
      • 1.7.2 Implementation of mutex_lock function
        • 1.7.2.1 fastpath
        • 1.7.2.2 slowpath
      • 1.7.3 Implementation of mutex_unlock function
        • 1.7.3.1 fastpath
        • 1.7.3.2 slowpath

Chapter 1 Synchronization and Mutual Exclusion④

1.5 Implementation of spinlock

Spin lock, as the name suggests: spins in place, waiting for resources to be available, and once available, locks and occupies it.
The problem is, assuming someone else has locked it, if you run around in place, it will occupy CPU resources. How can other programs run? How to unlock it without CPU?
This question has 2 answers:
① CPU x is spinning in place, and CPU y will be unlocked later: This involves multiple CPUs and is suitable for SMP systems;
② For a single CPU system, the “spin” function of the spin lock is removed: only disabling preemption and disabling interrupts are left
I first prohibit other threads from interrupting me (preempt_disable), I slowly enjoy the critical resources, and then enable system preemption (preempt_enable) after using up, so that others can seize the resources.
Note: SMP stands for Symmetric Multi-Processors, symmetric multi-processor; UP stands for Uni-Processor, and the system has only one single-core CPU.
To understand spinlock, we need to analyze it through 2 scenarios:
① How to compete for resources at the beginning? You can’t grab both programs. This is easy to solve and can be achieved using atomic variables.
② A certain program has obtained resources, how to prevent others from using this resource at the same time.
This is what you should pay attention to when using spinlock. There will be different derived functions (_bh/_irq/_irqsave/_restore).

1.5.1 Spin lock kernel structure

The structure corresponding to spinlock is defined as follows, and different architectures may have different implementations:

The above __raw_tickets structure has two members, owner and next, which are the key to implementing spinlock in the SMP system.

1.5.2 Implementation of spinlock in UP system

For “spin lock”, its original meaning is: if the lock has not been obtained, I will wait in circles. Waiting for who to release the lock? ① Other CPU
② Other processes/threads
For a single-CPU system, there is no “other CPU”; if the kernel does not support preempt, the thread currently executing in the kernel mode cannot be preempted by other threads, that is, there is “no other process/thread”. Therefore, for single CPU systems that do not support preempt, spin_lock is an empty function and does not need to do anything else.
If the kernel of a single-CPU system supports preempt, that is, when the current thread is executing a kernel mode function, it may be preempted by other threads. At this time, the implementation of spin_lock is to call “preempt_disable()”: If you want to rob me, I will simply prohibit you from running.
In the UP system, the spin_lock function is defined as follows:

From the above code, we can see that in the UP system, spin_lock() degenerates into preempt_disable(). If the kernel used does not support preempt, then spin_lock() does not need to do anything.
For spin_lock_irq(), it degenerates into local_irq_disable() and preempt_disable() in the UP system, as shown in the following figure:

Assume that program A wants to access critical resources. There may be interrupts that also access critical resources. There may be program B that also accesses critical resources. Then use spin_lock_irq() to protect critical resources: first disable interrupts to prevent interrupts from grabbing them, and then disable preempt. Prevent other processes from grabbing it.
For spin_lock_bh(), in the UP system, it degrades to disabling software interrupts and preempt_disable(), as shown in the following figure:

For spin_lock_irqsave, it is similar to spin_lock_irq, except that it first saves the interrupt status and then disables the interrupt, as follows:

The corresponding spin_unlock function will not be explained anymore.

1.5.3 Implementation of spinlock in SMP system

To allow only one of multiple CPUs to obtain critical resources, this can be achieved by using atomic variables. But we must ensure fairness, first come, first served. For example, CPU0, CPU1, and CPU2 all call spin_lock to obtain critical resources. Whoever applies first will get it first.
To understand the implementation of spinlock in SMP systems, we need to give an example. Thanks for this article:
Linux kernel synchronization mechanism (4): spin lock
http://www.wowotech.net/kernel_synchronization/spinlock.html
wowotech is really a magical website. The authors of Linux articles in it are all labeled as “linuxer”, awesome!
I will use the example from this article to explain that there is only one seat in the restaurant, and everyone who goes to eat has to take a number first and wait for their number to be called. Note that there are two actions: the customer takes the number from the number-taking machine, and the electronic number board calls the number.
① At the beginning, the waiting number of the number taking machine is 0
② Customer A gets the number 0 from the number-taking machine, the electronic calling board displays 0, and customer A takes the seat;
The number taking machine displays the next number to be taken as 1.
③ Customer B gets the number 1 from the number-taking machine, the electronic calling board still displays 0, and customer B waits;
The number taking machine shows that the next number to be taken is 2.
④ Customer C gets the number 2 from the number-taking machine, the electronic calling board still displays 0, and customer C waits;
The number taking machine shows that the next number to be taken is 3.
⑤ Customer A leaves his seat after eating, the electronic number board displays 1, customer B’s number is equal to 1, and he takes his seat;
⑥ Customer B leaves his seat after eating, the electronic number board displays 2, customer C’s number is equal to 2, and he takes his seat;
In this example, there are 2 numbers: the “next number” displayed by the number-taking machine, which will automatically increase by 1 after the customer takes the number; the electronic calling board displays
“Current number”, it will automatically increase by 1 after the customer leaves the seat. When the number in a customer’s hand is equal to the number on the electronic calling board, the customer takes the seat. In this process, even if customers B and C arrive at the store at the same time, as long as they get different numbers from the number-taking machine, they will not fight.
Therefore, the key point is: the numbers issued by the number-taking machine must be mutually exclusive to ensure that the customer numbers are different from each other. The change of the number on the electronic call board does not need to be protected. It will only change after the customer leaves, and no one will compete for it.
In ARM architectures of ARMv6 and above, SMP systems are supported. Its spinlock structure is defined as follows:

The owner is equivalent to the electronic calling card, who is eating now. next is equivalent to a number taking machine, what is the next number? The number obtained by each CPU from the number taking machine is stored in the local variable in the spin_lock function.
The spin_lock function calling relationship is as follows, the core is arch_spin_lock:

The arch_spin_lock code is as follows:

The annotations in the figure explain the principle very clearly. Even if different individuals take numbers at the same time, they can guarantee that they will get different numbers.

Assume that the first program gets the number, and after accessing the critical resource, it calls spin_unlock. The code is as follows:

If there is another program waiting in a loop in the spin_lock function, it will immediately determine whether the next in its hand is equal to lock->tickets.owner. If it is equal, it means that it has obtained the lock.
In-depth analysis of _linux_spinlock_ implementation mechanism
https://blog.csdn.net/electrombile/article/details/51289813
In-depth analysis of Linux spin locks
http://blog.chinaunix.net/uid-20543672-id-3252604.html
Linux kernel synchronization mechanism (4): spin lock
http://www.wowotech.net/kernel_synchronization/spinlock.html

1.6 Implementation of semaphore

1.6.1 Kernel structure of semaphore

Note: This is a semaphore, not a signal. When we learned about asynchronous notifications earlier, the driver sends a signal to the application. The semaphore we are talking about now is a synchronization and mutual exclusion mechanism.
The definition and operation functions of the semaphore are defined in the Linux kernel file include\linux\semaphore.h, as follows:

After initializing the semaphore, you can use the down function or other derivatives to obtain the semaphore, and the up function to release the semaphore. We only analyze the implementation of the down and up functions.

1.6.2 Implementation of down function

If the count in semaphore is greater than 0, then the down function can obtain the semaphore; otherwise, it sleeps. When reading and modifying count, use spinlock to achieve mutual exclusion.
When sleeping, the current process should be placed in the wait_list list of the semaphore. When other processes release the semaphore, go to the wait_list to take out the process and wake it up.
code show as below:

1.6.3 Implementation of up function

If there are other processes waiting for the semaphore, the count value does not need to be adjusted. Just take out the first process waiting for the semaphore, give it the semaphore, and wake it up altogether.
If no other processes are waiting for the semaphore, count is adjusted.
The entire process needs to be protected using spinlock. The code is as follows:

1.7 Implementation of mutex mutex

1.7.1 mutex kernel structure

The definition and operation functions of mutex are defined in the Linux kernel file include\linux\mutex.h, as follows:

After initializing the mutex, you can use the mutex_lock function or other derivatives to obtain the semaphore, and use the mutex_unlock function to release the semaphore. We only analyze the implementation of mutex_lock and mutex_unlock functions.
Let me make a mistake here: in the previous video, we said that the owner in mutex is used to record the process of obtaining the mutex, and it must release the mutex in the future. This is wrong!
As can be seen from the above code, the owner does not necessarily exist!
owner has 2 uses: debug(CONFIG_DEBUG_MUTEXES) or spin_on_owner(CONFIG_MUTEX_SPIN_ON_OWNER). What is spin on owner?
The purpose of using mutex is generally to protect a small piece of code, which runs very quickly. This means that a process that acquires a mutex may soon release the mutex.
This can be optimized, especially if the process that currently obtains the mutex is running on another CPU, and “I” is the only process waiting for this mutex. In this case, “I” just spin and wait: I am too lazy to go to sleep, and waking up from sleep will be too slow.
Therefore, mutex has been specially optimized and is more efficient than semaphore. However, in terms of code, there is no requirement that “whoever obtains the mutex must release the mutex”, but the usage convention is “whoever obtains the mutex must release the mutex”.

1.7.2 Implementation of mutex_lock function

1.7.2.1 fastpath

The design of mutex is very sophisticated, more complex than semaphore, but more efficient.
First of all, you must know that there are two paths (fast and slow) in the mutex operation function: fastpath and slowpath: if fastpath is successful, there is no need to use slowpath.
How to understand?
This requires expanding the count value in metex. It was said before that it only has two values: 1 and 0. 1 means unlocked, 0 means locked. There is also a type of value “negative number” that means “locked, and there may be other programs waiting. “.
code show as below:

Let’s take a look at the fastpath function: __mutex_fastpath_lock. This function is defined in the following 2 files:

include/asm-generic/mutex-xchg.h
include/asm-generic/mutex-dec.h

Which file to use? Take a look at arch/arm/include/asm/mutex.h, the contents are as follows:

#if __LINUX_ARM_ARCH__ < 6
#include <asm-generic/mutex-xchg.h>
#else
#include <asm-generic/mutex-dec.h>
#endif

Therefore, for architectures below ARMv6, use the __mutex_fastpath_lock function in include/asm-generic/mutex-xchg.h; for architectures ARMv6 and above, use the __mutex_fastpath_lock function in include/asm-generic/mutex-dec.h . The __mutex_fastpath_lock function in these two files is similar. The code in mutex-dec.h is as follows:

In most cases, the current value of mutex is 1, so the mutex can be obtained very quickly through the fastpath function.

1.7.2.2 slowpath

If the current value of mutex is 0 or a negative number, you need to call __mutex_lock_slowpath to process it slowly: it may sleep and wait.

The __mutex_lock_common function is also implemented in the kernel file kernel/locking/mutex.c, which is explained in sections below.
① Analyze the first piece of code:

② Analyze the second piece of code:

③ Analyze the third piece of code:

This wait_list is FIFO (Firt In Firs Out), whoever queues up first can get the mutex first.

④ Analyze the fourth piece of code: for loop, this is the key point

⑤ Analyze the fifth piece of code: finishing work

1.7.3 Implementation of mutex_unlock function

There are also two paths (fast and slow) in the mutex_unlock function: if fastpath is successful, there is no need to use slowpath.
code show as below:

1.7.3.1 fastpath

Let’s take a look at the fastpath function: __mutex_fastpath_lock. This function is defined in the following 2 files:

include/asm-generic/mutex-xchg.h
include/asm-generic/mutex-dec.h

Which file to use? Take a look at arch/arm/include/asm/mutex.h, the contents are as follows:

#if __LINUX_ARM_ARCH__ < 6
#include <asm-generic/mutex-xchg.h>
#else
#include <asm-generic/mutex-dec.h>
#endif

Therefore, for architectures below ARMv6, use the __mutex_fastpath_lock function in include/asm-generic/mutex-xchg.h; for architectures ARMv6 and above, use the __mutex_fastpath_lock function in include/asm-generic/mutex-dec.h . The __mutex_fastpath_lock function in these two files is similar. The code in mutex-dec.h is as follows:

In most cases, the value of the mutex after adding 1 is 1, which means that no one is waiting for the mutex, so it is enough to directly increase the count value of the mutex to 1 through the fastpath function.
If the value of mutex is increased by 1,
If it is less than or equal to 0, it means that someone is waiting for the mutex and needs to go to wait_list to take it out and wake it up. This requires the use of the slowpath function: __mutex_unlock_slowpath.

1.7.3.2 slowpath

If the current value of mutex is 0 or a negative number, you need to call __mutex_unlock_slowpath to process slowly: other processes need to be awakened.

The __mutex_unlock_common_slowpath function code is as follows. The main job is to take out the first process from wait_list and wake up: