Thread status and thread safety issues

Article directory

- Thread status
  - All states of the thread
  - Thread state transition
- Multi-threading vs. Single-thread Efficiency Improvement
- Risks of Multithreading – Thread Safety Issues
  - Case
  - Analysis
  - Reasons
  - Solution
- synchronized keyword – monitor lock
  - Use the synchronized keyword
  - Features of synchronized
    - Mutually exclusive
    - Reentrant
  - Thread-safe classes in the Java standard library
- Deadlock
  - What is deadlock
  - Three typical deadlock cases
    - ①
    - ②
    - ③
  - Necessary conditions for deadlock
  - How to break deadlock

Thread status

Java subdivides the status of a thread into several finer-grained states.

All states of threads

In the Thread class, the state of the thread is an enumeration type Thread.State. You can use the following code to print out all the states of the thread in Java.

public class ThreadDemo10 {
    public static void main(String[] args) {
        for (Thread.State state : Thread.State.values()) {
            System.out.println(state);
        }
    }
}

The program prints every constant of the enumeration.

It can be seen that a thread has the following states:

NEW: The Thread object has been created, but start() has not been called yet (no corresponding PCB exists in the kernel).
TERMINATED: The thread in the kernel has finished executing and its PCB has been released, but the Thread object still exists.
RUNNABLE: Runnable (either executing on the CPU, or in the ready queue waiting to be scheduled).
WAITING: Blocked while waiting indefinitely for another thread (for example after wait() or join() without a timeout).
TIMED_WAITING: Blocked for a bounded time (for example during Thread.sleep(), or a wait with a timeout).
BLOCKED: Blocked while waiting to acquire a lock held by another thread.

Thread state transition

You can get the state transition before and after the thread starts and ends through code:

public class ThreadDemo11 {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                // Do nothing
            }
        });

        // Before starting, get the state of t (NEW)
        System.out.println("before start: " + t.getState());

        t.start();
        System.out.println("t status during execution: " + t.getState());
        t.join();

        // After the thread completes execution (TERMINATED)
        System.out.println("After t ends: " + t.getState());
    }
}

The following results are obtained:

View the TIMED_WAITING state (the thread sleeps while it runs):

public class ThreadDemo11 {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            // 100 iterations of 10 ms keep the demo to about one second;
            // a much larger bound would make t.join() below wait for hours
            for (int i = 0; i < 100; i++) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });
        t.start();
        // Sample the state of t repeatedly while it is running
        for (int i = 0; i < 1000; i++) {
            System.out.println("t status during execution: " + t.getState());
        }
        t.join();
    }
}

The following result is obtained:

It can be seen that RUNNABLE (the thread is executing and not sleeping) and TIMED_WAITING (the thread is sleeping) are printed alternately.

Conclusions:

Before the thread is started (the Thread object has just been created and the PCB does not exist yet), the state is NEW.
While the thread is executing its task and is not sleeping, the state is RUNNABLE.
While the thread is sleeping during execution, the state is TIMED_WAITING.
After the thread has finished executing (the PCB is released, but the Thread object still exists because the main thread has not ended yet), the state is TERMINATED.

Currently, only the above four states are introduced, and the remaining two states (WAITING / BLOCKED) will be introduced later.
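For readers who want to see the two remaining states right away, the following is a minimal sketch rather than part of the original demos. The class name StateSketch, its helper methods, and the use of CountDownLatch for the untimed wait are all assumptions made for illustration. It polls getState() until the expected state shows up, so it does not rely on timing.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: observes the BLOCKED and WAITING states.
class StateSketch {
    // Polls until the thread reports the wanted state.
    static Thread.State awaitState(Thread t, Thread.State wanted) {
        while (t.getState() != wanted) {
            Thread.yield();
        }
        return wanted;
    }

    static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // t tries to enter a synchronized block whose lock the caller already holds.
    static Thread.State observeBlocked() {
        Object lock = new Object();
        Thread t = new Thread(() -> {
            synchronized (lock) {
                // nothing to do once the lock is acquired
            }
        });
        Thread.State seen;
        synchronized (lock) {
            t.start();
            seen = awaitState(t, Thread.State.BLOCKED);
        }
        joinQuietly(t);
        return seen;
    }

    // t waits on a latch with no timeout until the caller releases it.
    static Thread.State observeWaiting() {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        Thread.State seen = awaitState(t, Thread.State.WAITING);
        release.countDown();
        joinQuietly(t);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observeBlocked()); // BLOCKED
        System.out.println(observeWaiting()); // WAITING
    }
}
```

Polling instead of sleeping for a fixed time is a deliberate choice here: the state is read only once the thread has actually reached it.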

Extension:
What is the use of the TERMINATED state?

In practice, the TERMINATED state does nothing by itself; it only serves as a marker.
It exists out of necessity. Java objects follow their own life-cycle rules, and these rules are not tied to the life cycle of threads in the system kernel. When the thread in the kernel is released, there is no guarantee that the Thread object in the code is released at the same moment. So there is bound to be a situation where the PCB has been destroyed but the Java object still exists. A specific state is then needed to mark the Thread object as "invalid", and that is the purpose of the TERMINATED state.

When a thread is marked in the TERMINATED state, it cannot be started again. (A thread can only be started once.)
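The rule can be checked with a short sketch. The class and method names below are made up for illustration; the only library behaviour relied on is that Thread.start() throws IllegalThreadStateException when the thread has already been started.

```java
// Hypothetical demo class: a finished thread cannot be started again.
class RestartSketch {
    // Returns true if the second start() call is rejected.
    static boolean secondStartRejected() {
        Thread t = new Thread(() -> { });
        t.start();
        try {
            t.join(); // after join() returns, t is TERMINATED
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            t.start(); // second start on the same Thread object
            return false;
        } catch (IllegalThreadStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second start rejected: " + secondStartRejected());
    }
}
```

To run the same task again, create a new Thread object for it.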

Improvement of multi-threading efficiency compared to single-threading

The following cases are used to demonstrate the efficiency improvement of multi-threading compared to single-threading.

// Use this code to compare the efficiency of multi-threading and single-threading
public class ThreadDemo12 {
    // Serial execution (completed by one thread)
    public static long serial() {
        // Record the start time
        long begin = System.currentTimeMillis();

        long a = 0;
        for (long i = 0; i < 10_000_000_000L; i++) {
            a++;
        }

        long b = 0;
        for (long i = 0; i < 10_000_000_000L; i++) {
            b++;
        }

        long end = System.currentTimeMillis();
        return end - begin;
    }

    // Concurrent execution (completed by two threads)
    public static long concurrency() throws InterruptedException {
        // Use two threads to perform the two increments separately
        Thread t1 = new Thread(() -> {
            long a = 0;
            for (long i = 0; i < 10_000_000_000L; i++) {
                a++;
            }
        });

        Thread t2 = new Thread(() -> {
            long b = 0;
            for (long i = 0; i < 10_000_000_000L; i++) {
                b++;
            }
        });
        // Timing
        long begin = System.currentTimeMillis();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        long end = System.currentTimeMillis();
        return end - begin;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assume there are two variables, and each needs to be incremented 10 billion times (a typical CPU-intensive scenario)
        // One thread can do it: first increment a, then increment b
        long serialTime = 0;
        // Take the average of three runs
        for (int i = 0; i < 3; i++) {
            serialTime += serial();
        }
        System.out.println("[serial] Single-threaded version execution time: " + serialTime / 3 + " ms");

        // You can also use two threads to increment a and b respectively
        long concurrencyTime = 0;
        // Take the average of three runs
        for (int i = 0; i < 3; i++) {
            concurrencyTime += concurrency();
        }
        System.out.println("[concurrency] Multi-threaded version execution time: " + concurrencyTime / 3 + " ms");
    }
}

Result:

It can be seen that, for this CPU-intensive task, the multi-threaded version runs faster than the single-threaded version.

Risks of multi-threading – thread safety issues

Multi-threaded programs are executed preemptively and scheduled randomly.
In a program that is not multi-threaded, the code has only one order of execution. That order is fixed, so the result of the program is fixed. In a multi-threaded program, the consequence of preemptive execution is that the order of code execution is not fixed: many more interleavings become possible, and the final result changes from one possibility to countless possibilities.
As long as there is any situation in which the result of the code is incorrect, the program has a thread safety problem.

Case

Here is a classic case of a thread safety issue.
Create a Counter class with a member variable count and an increment method add. Create two threads in the main method, have each of them call add 50,000 times, and observe the final value of count.

class Counter {
    public int count = 0;

    public void add() {
        count++;
    }
}

public class ThreadDemo13 {
    public static void main(String[] args) {
        Counter counter = new Counter();

        // Create two threads, each calling add on counter 50,000 times
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 50000; i++) {
                counter.add();
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 50000; i++) {
                counter.add();
            }
        });
        // Start the threads
        t1.start();
        t2.start();

        // Wait for both threads to finish
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        // Print the final count value
        System.out.println("count = " + counter.count);
    }
}

Run it several times and look at the final results:

All of these runs execute the same code, yet the results differ from run to run. The correct final result should be 100,000, and the observed results are far from it.

Analysis

Why does the above program produce unexpected results, and the results are not fixed every time it is run?

Let's take a look at the count++ operation first.
As learned earlier, count++ is essentially divided into 3 steps:

Read the value from memory into a CPU register. (load)
Add 1 to the value in the CPU register. (add)
Write the result back to memory. (save)

These three operations are the three instructions executed on the CPU.

If two threads perform count++ operations concurrently, it is equivalent to two sets of load add save operations being performed at the same time. At this time, different thread scheduling orders will produce differences in results.

The picture above shows one of the possibilities, but because the threads execute concurrently, other interleavings are possible as well.

It can be seen that there are many possible orders in which the instructions of the two threads can interleave, and most of them lead to a thread safety issue.

Let’s take the situation in the figure below as an example to analyze the final result:

In the above figure, the two threads t1 and t2 each perform an auto-increment operation. Assume that the initial value of count is 0.

t1 first reads the count value in the memory into the register, that is, the count value in the register of the t1 thread is currently 0.
t2 also reads the count value in the memory into the register, that is, the count value in the register of the t2 thread is currently 0.
t2 performs an add operation and adds 1 to the count value in the register in t2. At this time, the count value in the t2 register is 1.
t2 performs a save operation and writes the count value in the t2 register back to the memory. At this time, the count value in the memory is 1.
t1 performs an add operation and adds 1 to the count value in the register in t1. At this time, the count value in the t1 register is 1.
t1 performs a save operation and writes the count value in the t1 register back to the memory. At this time, the final value in the memory is 1.

The two threads each performed count++ once, two increments in total, so the correct final result should be 2. According to the analysis above, however, the result is 1, which does not meet expectations. The conclusion is that this interleaving has a thread safety issue.
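The six steps analyzed above can be replayed deterministically in a single thread. This is only an illustrative sketch: the class name and the variables memory, reg1 and reg2 are stand-ins for the shared count and the two threads' registers, and no real threads are involved.

```java
// Hypothetical demo class: replays the interleaving from the analysis step by step.
class LostUpdateSketch {
    static int memory; // stands in for the shared field count

    static int simulate() {
        memory = 0;
        int reg1 = memory;  // t1: load (t1's register holds 0)
        int reg2 = memory;  // t2: load (t2's register holds 0)
        reg2 = reg2 + 1;    // t2: add  (t2's register holds 1)
        memory = reg2;      // t2: save (memory holds 1)
        reg1 = reg1 + 1;    // t1: add  (t1's register holds 1)
        memory = reg1;      // t1: save (memory holds 1, t2's update is lost)
        return memory;
    }

    public static void main(String[] args) {
        System.out.println("count after two increments: " + simulate()); // 1
    }
}
```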

Reasons

Under what circumstances do thread safety issues arise?
The main causes are as follows:

[Root cause] Threads are executed preemptively and scheduled randomly.
Code structure (multiple threads modify the same variable at the same time).
Atomicity (if the modification is not atomic, the probability of problems is very high).
Memory visibility issues.
Instruction reordering issues.

These are 5 typical causes, not an exhaustive list. Whether a piece of code is thread-safe has to be analyzed case by case, and it is difficult to generalize.

Solution

How can thread safety issues be solved by starting from atomicity?
You can use locking to make a series of non-atomic operations behave as one "atomic" operation, which solves the thread safety issue.

To fix the non-atomic add operation in the case above, modify the add method of Counter.

public synchronized void add() {
    count++;
}

You can see that the final calculation result is correct:

The synchronized keyword added to the add method is the locking operation in Java.
With synchronized, the thread acquires the lock when entering the method and releases it when exiting the method. If two threads try to lock at the same time, one acquires the lock successfully and the other enters the blocked state (BLOCKED). It stays blocked until the thread holding the lock releases it (unlocks); only then can the waiting thread acquire the lock.

Applying this to the case just analyzed makes it thread-safe.

Locking is said to guarantee atomicity, but it does not make the three operations complete in a single step, nor does it prevent the thread from being scheduled out during those three steps. Instead, it makes other threads that want to perform the same operation block and wait. The essence of locking is to turn concurrent execution into serial execution.
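Putting the pieces together, a complete runnable version of the locked counter might look like the sketch below. The class names SafeCounter and SafeCounterDemo and the get() accessor are assumptions added for illustration; the locking itself is the same synchronized add method described above.

```java
// Hypothetical variant of Counter whose add() is protected by a lock.
class SafeCounter {
    private int count = 0;

    public synchronized void add() {
        count++; // load, add, save now run while holding the lock on this
    }

    public synchronized int get() {
        return count;
    }
}

class SafeCounterDemo {
    // Two threads each call add() 50,000 times; returns the final count.
    static int run() {
        SafeCounter counter = new SafeCounter();
        Runnable task = () -> {
            for (int i = 0; i < 50_000; i++) {
                counter.add();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("count = " + run()); // count = 100000
    }
}
```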

synchronized keyword – monitor lock

Use synchronized keyword

There are several ways to use the synchronized keyword.

Applying it to methods

Applied to an ordinary (instance) method (lock when entering the method, unlock when leaving the method)

public synchronized void method() {

}

Applied to a static method

public static synchronized void method() {

}

(Although both forms are applied to a method, the lock is taken on different objects. An ordinary method locks the this object, while a static method locks the class object.)

Applied to a code block (lock when entering the code block, unlock when exiting the code block)

synchronized (this) {

}
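Which object each form locks can be checked with Thread.holdsLock, which reports whether the current thread holds the monitor of a given object. The class and method names in this sketch are invented for illustration.

```java
// Hypothetical demo class: shows which object each form of synchronized locks.
class LockTargetSketch {
    // Instance method: the lock is taken on this.
    synchronized boolean instanceHoldsThis() {
        return Thread.holdsLock(this);
    }

    // Instance method: the class object is NOT locked here.
    synchronized boolean instanceHoldsClass() {
        return Thread.holdsLock(LockTargetSketch.class);
    }

    // Static method: the lock is taken on the class object.
    static synchronized boolean staticHoldsClass() {
        return Thread.holdsLock(LockTargetSketch.class);
    }

    // Code block: the lock is taken on whatever object is named in parentheses.
    boolean blockHoldsThis() {
        synchronized (this) {
            return Thread.holdsLock(this);
        }
    }

    public static void main(String[] args) {
        LockTargetSketch s = new LockTargetSketch();
        System.out.println(s.instanceHoldsThis());  // true
        System.out.println(s.instanceHoldsClass()); // false
        System.out.println(staticHoldsClass());     // true
        System.out.println(s.blockHoldsThis());     // true
    }
}
```

One practical consequence: an instance synchronized method and a static synchronized method of the same class do not exclude each other, because they lock different objects.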

Features of synchronized

Mutually exclusive

synchronized is mutually exclusive. While one thread is executing a synchronized code block on an object, any other thread that tries to execute a synchronized code block on the same object blocks and waits. Only after the first thread unlocks can another thread acquire the lock.

Entering the synchronized code block is equivalent to locking
Exiting the synchronized code block is equivalent to unlocking

For each lock, the operating system maintains a "blocking queue" internally. While the lock is held by one thread, any other thread that tries to acquire it fails, and the operating system puts that thread into the blocking queue, where it stays blocked until the holder unlocks. The operating system then wakes up one of the threads in the blocking queue to acquire the lock.
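Mutual exclusion can be observed by counting how many threads are inside the synchronized block at the same moment. The following is a minimal sketch with made-up names (MutexSketch, inside, maxInside); with four threads competing for one lock, the maximum it records stays at 1.

```java
// Hypothetical demo class: at most one thread is ever inside the synchronized block.
class MutexSketch {
    private static final Object LOCK = new Object();
    private static int inside;    // threads currently inside the block
    private static int maxInside; // highest value of inside ever observed

    static int run() {
        inside = 0;
        maxInside = 0;
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < 1_000; n++) {
                    synchronized (LOCK) {
                        inside++;
                        maxInside = Math.max(maxInside, inside);
                        inside--;
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return maxInside;
    }

    public static void main(String[] args) {
        System.out.println("max threads inside the block: " + run()); // 1
    }
}
```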

Reentrant

If a thread can lock the same object twice in a row without any problem, the lock is reentrant. If this causes a problem, the lock is not reentrant.

synchronized public void add() {
    synchronized (this) {
        count++;
    }
}

A synchronized code block is reentrant for the same thread, so a thread never locks itself out.

With a non-reentrant lock, a thread that tries to lock again without first releasing the lock blocks and waits. The second lock cannot be acquired until the first one is released, so the thread enters the blocked state. Once blocked, however, it can never release the first lock, and the result is a deadlock.
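The difference can be seen in a short sketch. The names are hypothetical, and the one-permit Semaphore is only a stand-in chosen here to imitate a non-reentrant lock, since synchronized and ReentrantLock are both reentrant. tryAcquire is used for the second attempt so that the example reports the failure instead of actually hanging.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: reentrant locks versus a non-reentrant stand-in.
class ReentrancySketch {
    private final Object monitor = new Object();

    // synchronized: the nested block on the same object is entered without blocking.
    boolean nestedSynchronized() {
        synchronized (monitor) {
            synchronized (monitor) {
                return Thread.holdsLock(monitor);
            }
        }
    }

    // ReentrantLock counts how many times its owner has locked it.
    static int holdCountAfterTwoLocks() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        lock.lock();
        int count = lock.getHoldCount();
        lock.unlock();
        lock.unlock();
        return count;
    }

    // One-permit Semaphore (assumption: used here to imitate a non-reentrant lock).
    static boolean secondAcquireSucceeds() {
        Semaphore permit = new Semaphore(1);
        permit.acquireUninterruptibly();      // first "lock" succeeds
        boolean second = permit.tryAcquire(); // second attempt by the same thread fails
        permit.release();
        return second;
    }

    public static void main(String[] args) {
        System.out.println(new ReentrancySketch().nestedSynchronized()); // true
        System.out.println(holdCountAfterTwoLocks());                    // 2
        System.out.println(secondAcquireSucceeds());                     // false
    }
}
```

Had the second attempt used a blocking acquire instead of tryAcquire, the thread would wait forever for a permit that only it can release, which is exactly the self-deadlock described above.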

Thread-safe classes in Java standard library

Many classes in the Java standard library are not thread-safe. Their shared data can be modified by multiple threads, but they take no locking measures.

ArrayList
LinkedList
HashMap
TreeMap
HashSet
TreeSet
StringBuilder

Some classes are thread-safe because they lock internally.

Vector (not recommended)
Hashtable (not recommended)
ConcurrentHashMap
StringBuffer

There is also a class that takes no locks but is still thread-safe, because it does not allow modification.

String

The thread-safe classes above rely on built-in locking and are therefore safer, but the price of that safety is lost performance, so they should be used only in scenarios that actually need them.
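As a rough illustration, the sketch below lets two threads append to a shared list 50,000 times each. The class name and the fill helper are made up. Vector comes from the list above, while Collections.synchronizedList is an extra standard-library wrapper not mentioned in the text and is shown only for comparison. ArrayList is deliberately left out: without locking its result is unpredictable and the run may even throw.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class: two threads append to one shared, internally locked list.
class SafeListSketch {
    // Each of two threads adds 50,000 elements; returns the final size.
    static int fill(List<Integer> list) {
        Runnable task = () -> {
            for (int i = 0; i < 50_000; i++) {
                list.add(i);
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(new Vector<>()));                                   // 100000
        System.out.println(fill(Collections.synchronizedList(new ArrayList<>()))); // 100000
    }
}
```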

Deadlock

What is deadlock

Deadlock is a problem that can occur in multi-threaded programs. It shows up as follows: during execution, two or more blocked threads each wait for another to release a resource, so all of them stay blocked indefinitely and the program cannot make progress. Deadlock is a very serious problem; once it occurs, the affected part of the program stops working.

Three typical deadlock cases

①

One thread and one lock: the thread locks the same lock twice in a row. If the lock is not reentrant, a deadlock occurs.

Note, however, that synchronized and ReentrantLock in Java are both reentrant locks, so this problem does not arise when using them. The basic lock APIs in Python and C++, and the native lock APIs of the operating system, are non-reentrant by default, so this situation can occur there.

②

Two threads and two locks: t1 and t2 first lock lock A and lock B respectively, and then each tries to acquire the lock held by the other.

Code example:

public class ThreadDemo14 {
    public static void main(String[] args) {
        Object locker1 = new Object();
        Object locker2 = new Object();

        Thread t1 = new Thread(() -> {
            synchronized (locker1) {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
                synchronized (locker2) {
                    System.out.println("t1 acquired two locks");
                }
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (locker2) {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
                synchronized (locker1) {
                    System.out.println("t2 acquired two locks");
                }
            }
        });
        t1.start();
        t2.start();
    }
}

The console shows no output, which means neither thread obtained both locks. Threads t1 and t2 are both blocked, each waiting for the other to release its lock, so a deadlock has occurred.
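Rather than inferring the deadlock from missing output, the JVM can be asked directly. The sketch below is an illustrative variation, not the author's demo. The class name, the lockInOrder helper, and the use of CountDownLatch in place of sleep (so that both threads are guaranteed to hold their first lock before requesting the second) are assumptions. ThreadMXBean.findMonitorDeadlockedThreads() is a standard management API that returns the IDs of threads deadlocked on monitors, or null if there are none. The threads are daemons so the JVM can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: creates the two-lock deadlock and detects it.
class DeadlockDetectSketch {
    static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await(); // wait until the other thread holds its first lock
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                System.out.println("acquired both locks"); // never reached
            }
        }
    }

    // Returns the number of threads the JVM reports as deadlocked.
    static int countDeadlockedThreads() {
        Object locker1 = new Object();
        Object locker2 = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        Thread t1 = new Thread(() -> lockInOrder(locker1, locker2, bothHoldFirst));
        Thread t2 = new Thread(() -> lockInOrder(locker2, locker1, bothHoldFirst));
        t1.setDaemon(true); // daemon threads do not keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids;
        while ((ids = bean.findMonitorDeadlockedThreads()) == null) {
            Thread.yield(); // poll until the deadlock has formed
        }
        return ids.length;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads()); // 2
    }
}
```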

③

Multiple threads and multiple locks (this is the general case of ②)

A classic case, the dining philosophers problem, describes this situation well.

As shown in the picture: five philosophers sit around a table, with one chopstick between each pair of neighbours, so each philosopher has a chopstick on his left and on his right. To eat the spaghetti in the middle, a philosopher has to pick up both the left and the right chopstick; he can only eat with a pair of chopsticks.

Each philosopher is equivalent to a thread and has only two states:

Do nothing (the thread is blocked)
Eat spaghetti (the thread has acquired the locks and is executing)

Because the operating system schedules threads randomly, any philosopher may be eating or doing nothing at any given moment.

Suppose there is an extreme situation:

As shown in the picture above, every philosopher picks up the chopstick on his right at the same time. All the chopsticks on the table are now taken, but to eat, each philosopher has to wait for the neighbour on his left to put down the chopstick that neighbour is holding. This creates a waiting cycle: everyone holds one chopstick and waits for someone else to put theirs down, so everything freezes. This is a typical deadlock.

Necessary conditions for deadlock

Multiple threads competing for multiple locks is a complicated situation. We first need to understand the necessary conditions for a deadlock to occur.

Mutual exclusion: once one thread has acquired the lock, any other thread that wants the lock must block and wait.

Non-preemptible: once a thread has acquired a lock, other threads cannot take it away by force; only the holder can release it.

Request and hold: when a thread that already holds one lock tries to acquire another, it keeps holding the first lock. The first lock is not released just because the thread is requesting a second one.

Circular wait (loop waiting): thread 1 tries to acquire lock A and then lock B, while thread 2 tries to acquire lock B and then lock A. When acquiring lock B, thread 1 waits for thread 2 to release it; when acquiring lock A, thread 2 waits for thread 1 to release it.

How to break deadlock

In Java, some of the necessary conditions for deadlock are basic features of synchronized and cannot be broken. To break a deadlock, the practical place to start is the fourth condition, circular wait (loop waiting).

The simplest way is to number the locks and always acquire them in a fixed order. For example, always acquire the lock with the smaller number first and then the lock with the larger number (or always the larger number first and then the smaller one).

Applying this to the dining philosophers problem above, we number each chopstick and make every philosopher pick up the chopstick with the smaller number first, then the one with the larger number. Assume the five philosophers still pick up chopsticks at the same time. The situation is then as follows:

Philosopher No. 1 takes chopstick No. 1 first, philosopher No. 2 takes chopstick No. 2 first, and so on. Philosopher No. 5 needs to take chopstick No. 1 first, but it is already held by philosopher No. 1, so he blocks and waits. Philosopher No. 4 can now take chopstick No. 5, eat, and then put down chopsticks No. 5 and No. 4. Philosopher No. 3 can then take chopstick No. 4, and so on down the table. When philosopher No. 1 has taken chopstick No. 2, finished eating, and put down chopsticks No. 2 and No. 1, philosopher No. 5 can finally take chopstick No. 1, then chopstick No. 5, and start eating.
The entire circular waiting process is broken.
With the same method, the code in ② can be modified to break the circular wait.

public class ThreadDemo14 {
    public static void main(String[] args) {
        Object locker1 = new Object();
        Object locker2 = new Object();

        Thread t1 = new Thread(() -> {
            synchronized (locker1) {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
                synchronized (locker2) {
                    System.out.println("t1 acquired two locks");
                }
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (locker1) {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
                synchronized (locker2) {
                    System.out.println("t2 acquired two locks");
                }
            }
        });
        t1.start();
        t2.start();
    }
}

The changes in the above code compared to the previous deadlock situation are: both threads first obtain locker1, and then obtain locker2, so that the previous loop waiting process can be broken.
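The same ordering rule can be applied to the dining philosophers. The following is a minimal sketch under stated assumptions: the class name, the number of rounds, and the meals counters are invented for illustration, and the chopsticks are numbered 0 to 4 here instead of 1 to 5. Every philosopher locks the lower-numbered chopstick first, so no cycle of waiting can form and all threads finish.

```java
// Hypothetical demo class: dining philosophers with a fixed lock order.
class OrderedPhilosophers {
    // Runs 5 philosophers for 1,000 meals each; returns the total meals eaten.
    static int dine() {
        final int n = 5;
        Object[] chopsticks = new Object[n];
        for (int i = 0; i < n; i++) {
            chopsticks[i] = new Object();
        }
        int[] meals = new int[n]; // each philosopher writes only its own slot
        Thread[] philosophers = new Thread[n];

        for (int i = 0; i < n; i++) {
            final int id = i;
            int left = i;
            int right = (i + 1) % n;
            // Always take the smaller-numbered chopstick first
            int first = Math.min(left, right);
            int second = Math.max(left, right);
            philosophers[i] = new Thread(() -> {
                for (int round = 0; round < 1_000; round++) {
                    synchronized (chopsticks[first]) {
                        synchronized (chopsticks[second]) {
                            meals[id]++; // eat
                        }
                    }
                }
            });
        }
        for (Thread p : philosophers) {
            p.start();
        }
        int total = 0;
        for (int i = 0; i < n; i++) {
            try {
                philosophers[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            total += meals[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("total meals: " + dine()); // total meals: 5000
    }
}
```

If each philosopher instead took the chopstick at index `left` first and `right` second, the last philosopher would request chopsticks in the opposite numeric order from everyone else, and the waiting cycle described earlier could form again.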