[Multithreading (lock strategy and CAS)]

Article directory

foreword
1. Common locking strategies
- 1.1 Optimistic lock and pessimistic lock
- 1.2 Read-write lock ((readers-writer lock)
- 1.3 Heavyweight locks vs lightweight locks
- 1.4 Spin lock and pending lock
- 1.5 Fair lock vs unfair lock
- 1.6 Reentrant locks
2. CAS
- 2.1 What is CAS
- 2.2 Implementation of spin lock based on CAS
- 2.3 What is the ABA problem in CAS
- 2.4 ABA problem solution

Foreword

In multithreading, thread safety issues are usually solved by locking, so we must understand common locking strategies. To know what lock strategies are there and what are their differences, these lock strategies will be introduced next, and of course there are some lock application scenarios.

1. Common locking strategies

1.1 Optimistic lock and pessimistic lock

Pessimistic lock

Pessimistic locks are more pessimistic. It believes that the probability of multiple threads modifying shared resources at the same time is relatively high, so conflicts are prone to occur. Therefore, before accessing shared resources, lock them first.

Optimistic locking

Optimistic locking works more optimistically. It assumes that the probability of conflict is very low. Its working method is: first modify the shared resource, and then verify whether there is any conflict during this period. If no other thread is modifying the resource, the operation is completed. If If it is found that other threads have modified this resource, give up this operation

For example: Classmate A and Classmate B want to ask the teacher a question

Classmate A thinks that “teacher is busy, I come to ask questions, and the teacher may not be free to answer”. Therefore, classmate A will send a message to the teacher first: “Teacher, are you busy? I can come to you to ask a question at 2 o’clock in the afternoon What?” (equivalent to locking operation) Only after getting a positive answer will you really ask the question. If you get a negative answer, then wait for a while and come back to confirm the time with the teacher next time. This is a pessimistic lock.

Classmate B thinks that “the teacher is relatively free, I come to ask questions, and the teacher is likely to be free to answer.” Therefore, classmate B directly comes to the teacher. (No lock, direct access to resources) If the teacher is really free, then The direct problem is solved. If the teacher is really busy, then classmate B will not disturb the teacher, and will come back next time (although there is no lock, but the data access conflict can be identified). This is an optimistic lock.

Synchronized initially uses an optimistic locking strategy. When lock competition is found to be frequent, it will automatically switch to a pessimistic locking strategy.
It’s like classmate C who started to think that “the teacher is relatively free”, and would go to the teacher directly to ask questions. But after coming to the teacher directly twice, he found that the teacher is very busy, so next time he comes to ask questions, he will send a message to ask Ask the teacher if he is busy, and then decide whether to ask questions

1.2 Read-write lock ((readers-writer lock)

Between multiple threads, there will be no thread safety issues between data readers, but data writers need mutual exclusion between each other and readers. If the same lock is used in both scenarios, a huge performance loss will occur. Therefore, the read-write lock is generated.

We can also know the literal meaning of the read-write lock. It consists of two parts: “read lock” and “write lock”. If you only read shared resources, use “read lock” to lock them. If you want to modify shared resources, use “Write lock” is locked.

Therefore, read-write locks are suitable for scenarios where read operations and write operations can be clearly distinguished.

There are two main operations for a thread to access data: read data and write data.

Both threads are just reading one piece of data, so there is no thread safety issue at this time. Just read concurrently directly.
Two threads have to write a data, there are thread safety issues. (data loss)
One thread reads and another thread writes, and there are also thread safety issues. (Dirty reads and other issues)

According to the above, we can conclude

There is no mutual exclusion between read lock and read lock.
Between write lock and write lock, mutual exclusion.
Mutual exclusion between read lock and write lock.

Note that as long as “mutual exclusion” is involved, the thread will be suspended and waited. Once the thread is suspended, it is not known how long it has been since it is woken up again.
It has been a long time. Therefore, reducing the chance of “mutual exclusion” as much as possible is an important way to improve efficiency.

Synchronized is not a read-write lock

1.3 Heavyweight locks vs lightweight locks

The core feature of the lock is “atomicity”. Such a mechanism is traced back to the hardware device such as the CPU.

The CPU provides “atomic instructions”.
The operating system is based on the CPU’s atomic instructions, implemented the mutex mutual exclusion lock.
Based on the mutex lock provided by the operating system, the JVM implements keywords and classes such as synchronized and ReentrantLock.

Note that synchronized does not just encapsulate mutex, but also does a lot of other work inside synchronized

Heavyweight lock: The locking mechanism relies heavily on the mutex provided by the OS-a lot of things to do and a lot of overhead

A large number of kernel mode user mode switching

It is easy to cause thread scheduling

Lightweight lock: The locking mechanism does not use mutex as much as possible, but tries to complete it in user mode code. If it is really uncertain, use mutex again.

A small amount of kernel mode user mode switching.

It is not easy to cause thread scheduling.

Synchronized is a lightweight lock at the beginning. If the lock conflict is serious, it will become a heavyweight lock.

1.4 Spin lock and pending lock

Spin lock is a typical implementation of lightweight lock

Hanging and waiting lock is a typical implementation of heavyweight lock

According to the previous method, the thread enters the blocked state after the lock grab fails, giving up the CPU, and it will take a long time to be scheduled again, but in fact, in most cases, although the current lock grab fails, the lock will be blocked after a long time freed. There is no need to give up the CPU. At this time, spin locks can be used to deal with such problems.
If the lock acquisition fails, try to acquire the lock again immediately, and loop infinitely until the lock is acquired. The first attempt to acquire the lock fails, and the second attempt will come in a very short time. Once the lock is released by other threads , you can get the lock in the first time.

Understanding spin locks vs pending wait locks

For example: classmate A and classmate B want to ask the teacher a question

Classmate A came to the office to ask the teacher a question, but he found that someone was already asking, so he waited next to him, and waited until the person asked the question (this is the spin lock), and did not give up looking for a teacher and being a teacher After solving (probably less than ten minutes), it is classmate A’s turn

Classmate B came to the office to ask the teacher a question, but he found that someone was already asking the question, so he went back to the dormitory (hanging the waiting lock), asked again after a few hours, and found that no one asked the teacher question ok, just ask your own questions

Advantages and disadvantages of spin locks.

Advantages: No giving up the CPU, no thread blocking and scheduling involved, once the lock is released, the lock can be acquired at the first time.

Disadvantage: If the lock is held by other threads for a long time, it will continue to consume CPU resources. (And it does not consume CPU when it is suspended and waiting).

The lightweight lock strategy in synchronized is most likely implemented through spin locks.

1.5 Fair lock vs unfair lock

Suppose there are three threads A, B, and C. Thread A first tries to acquire the lock, and the acquisition succeeds. Then B tries to acquire the lock, fails, and blocks waiting; then C also tries to acquire the lock, and C also acquires it Failed, also blocked waiting. When thread A releases the lock, what happens?

Fair lock: Obey “first come, first served”. B comes first before C. When A releases the lock, B can acquire the lock before C.

Unfair lock: Does not obey “first come, first served”. Both B and C may acquire the lock.

The thread scheduling inside the operating system can be regarded as random. If no additional restrictions are made, the lock is an unfair lock. If you want
To achieve fair locks, you need to rely on additional data structures to record the order of threads.

synchronized is an unfair lock.

1.6 Reentrant lock

Reentrant lock means that the same thread can acquire the same lock multiple times without being blocked, that is, the thread can repeatedly acquire the lock it already holds without worrying about deadlock

For example, if there is a lock operation in a recursive function, will the lock block itself during the recursive process? If not, then the lock is a reentrant lock (reentrant locks are also called recursive locks for this reason).

All locks named starting with Reentrant in Java are reentrant locks, and all ready-made Lock implementation classes provided by JDK include
The synchronized keyword locks are all reentrant.

2. CAS

2.1 What is CAS

CAS is the abbreviation of compare and swap, which is what we call comparison and swap. Cas is a lock-based operation, and it is an optimistic lock.

A CAS operation has three operands – the memory location (V), the expected old value (A), and the new value (B). If the value in the memory address is the same as the value of A, then update the value in the memory to B. CAS acquires data through an infinite loop. If in the first round of loop, the value in the address obtained by thread a is modified by thread b, then thread a needs to spin, and it may have a chance to execute in the next loop.

Pseudocode: The code written below is not atomic, and the real CAS is completed by an atomic hardware instruction. This pseudocode is just to help understand the workflow of CAS.

boolean CAS(address, expectValue, swapValue) { if ( & amp;address == expectedValue) { //Compare V with A & amp;address = swapValue; //B writes into A return true; } return false; }

Disadvantage: CAS causes increased CPU utilization. As mentioned before, CAS is a cyclic judgment process. If the thread has not obtained the status, the CPU resources will always be occupied.

2.2 Implementation of spin lock based on CAS

Implement more flexible locks based on CAS and gain more control rights.

Spinlock Pseudocode

public class SpinLock { private Thread owner = null; public void lock(){ // Check whether the current lock is held by a thread through CAS. // If the lock is already held by another thread, then spin and wait. // If the lock is not held by another thread, set the owner to the thread currently trying to lock it. // keep trying to add locks in while while(!CAS(this.owner, null, Thread.currentThread())){ } } public void unlock (){ this.owner = null; } }

2.3 What is the ABA problem in CAS

CAS is prone to ABA problems. A thread a changes the value to b, and then changes it to a. At this time, CAS thinks that there is no change, but in fact it has changed

for example:
When you go shopping online, you only have a total of 100 yuan. When you are about to pay, because your phone freezes, you press twice to confirm the payment and deduct 50. Two threads are generated, and the money will be deducted. Originally, there was only one CAS The operation can be successful, but in the second deduction, if your friend transfers 50 yuan to you at this time, because of the ABA problem, the deduction will also be successful; as shown below.

Since threads 1 and 2 are generated to deduct 50 yuan from 100 yuan, because of ABA problems, the deduction is twice

2.4 ABA Problem Solutions

For the value to be modified, introduce the version number. When CAS compares the current value of the data with the old value, it also compares whether the version number is as expected.

While reading the old value, the CAS operation also reads the version

When actually modifying, if the current version number is the same as the read version number, modify the data and add the version number + 1.
If the current version number is higher than the read version number, the operation fails (the data is considered to have been modified).

As shown below: