JavaEE – Multithreading (Continued): Thread Safety Issues

Multithreading continued

  • 1. Common properties of Thread
  • 2. Interrupting a thread
    • Custom flag variable method
    • Built-in flag method
  • 3. Waiting for a thread
  • 4. Thread states
  • 5. Thread safety
    • 1. Preemptive execution
    • 2. Multiple threads modifying the same variable
    • 3. Modification operations are not atomic
    • 4. Memory visibility
    • 5. Instruction reordering
  • 6. Solving thread safety problems
    • 1. Lock
    • 2. The difference from the join operation
    • 3. Summary

In the previous blog post, I covered how computers work and the difference between multi-process and multi-thread, and finished with the various ways of creating a thread with the Thread class, of which the lambda expression form is the most important.

1. Common properties of Thread

start method: actually asks the system to create a new thread, and that new thread then executes the run method.
run method: describes the thread's entry point, i.e. what logic the thread should execute once it is started. (It is not meant to be called by the programmer; the system calls it automatically.)
The difference between these two methods is also a classic interview question.
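As a minimal sketch of the difference (the class name StartVsRunDemo is made up for illustration), compare calling run() directly with calling start():

public class StartVsRunDemo {
    public static void main(String[] args) {
        Thread t = new Thread(() -> {
            // print the name of the thread that actually executes this logic
            System.out.println("running in: " + Thread.currentThread().getName());
        });

        // t.run();  // plain method call: the logic runs in the main thread, no new thread is created
        t.start();   // asks the system to create a new thread, which then calls run()
    }
}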

2. Interrupting a thread

Interrupting a thread means stopping it. In essence, there is only one way to terminate a thread: let the thread's entry method (the run method) finish executing.

Custom flag variable method

A while(true) loop is an infinite loop: the entry method never finishes, so the thread never ends. Instead, we can control the loop condition with a flag variable. The code is as follows:

public class ThreadDemo9 {
    public static boolean isQuit = false;

    public static void main(String[] args) {
        Thread t = new Thread( () -> {
            // while (true) would be an infinite loop: the entry method never finishes, so the thread never ends
            while (!isQuit) {
                System.out.println("hello t");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
            System.out.println("t thread terminated");
        });

        t.start();

        // In the main thread, modify isQuit after 3 seconds
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        isQuit = true;
    }
}

Running this code, the thread prints "hello t" once per second; after three seconds isQuit becomes true and the thread ends.
Question: if isQuit is changed from a static member variable to a local variable inside main, will the code still work?
The answer is no. Lambda expressions can access local variables from the enclosing scope, but they must follow the rules of variable capture: the captured variable must be final or "effectively final" (not declared final, but never reassigned). If isQuit is declared inside main and we later assign isQuit = true, the variable is no longer effectively final, so the capture is rejected and the code does not compile.
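As a minimal sketch (the class name is made up for illustration), this is what the local-variable version looks like; uncommenting the assignment makes isQuit no longer effectively final, and the lambda's capture of it is rejected at compile time:

public class ThreadDemo9Local {
    public static void main(String[] args) {
        boolean isQuit = false; // local variable captured by the lambda; must be effectively final

        Thread t = new Thread(() -> {
            while (!isQuit) {
                System.out.println("hello t");
            }
        });

        t.start();

        // isQuit = true; // compile error once uncommented: the captured variable may not be reassigned
    }
}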

Built-in flag method

In fact, the Thread class has a built-in flag that allows us to achieve the above effect more conveniently. The code is as follows:

public class ThreadDemo10 {
    public static void main(String[] args) {
        Thread t = new Thread(() -> {
            // Thread.currentThread() returns the current thread instance
            // Inside this lambda the returned object is t, so t could be used directly as well
            // isInterrupted() is a flag built into the thread object
            while (!Thread.currentThread().isInterrupted()) {
                System.out.println("hello t");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });

        t.start();

        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace(); // print the call stack at the point where the exception occurred
        }

        // Set t's internal flag to true; by default isInterrupted() is false
        t.interrupt();
    }
}

Running this code, when the 3 seconds are up and t.interrupt() is called, the thread does not actually end: an exception message is printed and the loop keeps going. The exception output comes from the e.printStackTrace() call in the catch block.

(1) What the interrupt method does:
1. Sets the thread's interrupt flag to true.
2. If the thread is blocked (for example inside sleep), it is woken up: sleep ends early by throwing an InterruptedException.
Attention!
When sleep is woken up this way, it automatically clears the isInterrupted flag (true -> false).
(Initially isInterrupted is false, meaning not interrupted. Calling interrupt sets it to true. sleep is then woken up early because the thread has been interrupted, and on waking up it resets isInterrupted from true back to false. That is why the loop condition is still false on the next iteration and the loop keeps running.)
(2) Why does the next iteration not throw an exception?
The first sleep threw the exception and cleared the flag. By the time sleep runs the second time, the interrupt flag is no longer set (the main thread calls interrupt only once, not repeatedly in a loop).

To recap: the isInterrupted flag starts out as false.

If the flag is false when sleep executes, sleep sleeps normally. If the flag becomes true, then whether sleep has just started or is halfway through, two things happen: 1. an exception is thrown immediately; 2. the flag is cleared back to false. On the next iteration, when sleep is reached again, the flag is already false, so nothing special happens.
If interrupt happens to be called in the brief instant after sleep has just returned, the next check of the loop condition ends the loop directly; but the probability of this is very low, since sleeping takes up almost 100% of the loop body's time.
(3) The main thread calls interrupt only once
This is multi-threaded code, and multi-threaded code does not execute strictly top to bottom; each thread runs independently. Once start() is executed, the code splits into two paths: the main thread keeps going, and the new thread enters its run method. The two execute concurrently (concurrency + parallelism). After calling interrupt, the main thread simply continues and never calls interrupt again.

After calling interrupt the main thread has nothing else left to do in main, but whether t ends or not depends entirely on the code inside t.
(4) Why does sleep clear the flag?
The purpose is to give the thread itself full control over when it ends. The interrupt call does not force the thread to stop immediately; it only tells the thread that it is time to end. Whether the thread ends right away, finishes some work first, or ignores the request entirely is decided flexibly by the thread's own code. interrupt is a notification, not a command: when t ends is decided by t itself.
(5) Modifying the code so that the loop ends
If you want the loop to end once the interrupt arrives, add a break inside the catch block, as sketched below:
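A minimal sketch of the modified thread body, with a break added in the catch block:

Thread t = new Thread(() -> {
    while (!Thread.currentThread().isInterrupted()) {
        System.out.println("hello t");
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
            break; // end the loop explicitly once the interrupt notification arrives
        }
    }
    System.out.println("t thread terminated");
});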

3. Waiting for a thread

Threads execute concurrently, and the operating system schedules them in an unpredictable order. It is impossible to determine which of two threads finishes executing first and which finishes later. Consider the following example:

We cannot be sure whether hello main or hello t is printed first. In most runs hello main comes first (since creating a thread also has some overhead), but there are cases where the main thread's hello main is not printed immediately. To enforce an order, we can use thread waiting (the join method):

public class ThreadDemo11 {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            System.out.println("hello t");
        });

        t.start();

        Thread.sleep(1000);

        t.join(); // calling t.join() in the main thread makes the main thread wait for t to finish before continuing

        System.out.println("hello main");
    }
}

(1) When t.join() is executed and the t thread has not ended yet, the main thread blocks and waits: execution stops at this line, and the current thread temporarily does not participate in CPU scheduling.
(2) t.join() is called in the main thread, so it makes the main thread wait for t to finish before continuing; other threads are unaffected. (Likewise, if the t1 thread calls t2.join(), t1 waits for t2 to finish and enters the blocked state, while other threads are scheduled normally.)
(3) To sum up, there are two cases: 1. when the main thread calls t.join() and t is still running, the main thread blocks and is released only when t finishes (when t's run method returns), and only then continues; 2. when the main thread calls t.join() and t has already ended, join does not block and execution continues immediately.
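As a side note, join also has an overload that takes a timeout, turning the indefinite block into a bounded wait; for example, with the same t as above:

t.join(1000); // wait for t to finish, but for at most 1000 ms instead of indefinitely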

4. Thread states

A thread in the operating system already has its own state, but Java's Thread, as an encapsulation of the system thread, further refines these states.

NEW: the Thread object exists, but the corresponding system thread has not been created yet.
TERMINATED: the system thread has finished executing; the Thread object still exists, but its work is done.
RUNNABLE: the ready/running state: 1. currently running on the CPU, or 2. ready to be scheduled onto the CPU at any time.
TIMED_WAITING: waiting with a timeout, for example inside a sleep call.
BLOCKED: waiting to acquire a lock.
WAITING: waiting without a timeout, for example inside a wait call.


Example code:

public class ThreadDemo12 {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            while (true) {
                // To avoid the output flooding the console and hiding the state, leave this commented out
                // System.out.println("hello");
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });

        // Before start, the thread state is NEW
        System.out.println(t.getState());

        t.start();

        Thread.sleep(2000);
        System.out.println(t.getState());
    }
}

5. Thread safety

1. Preemptive execution

Certain code, when executed in a multi-threaded environment, can have bugs; that is, it is not thread safe, because the scheduling order between threads is essentially unpredictable. Let's look at an example first; consider the following code:

class Counter {
    private int count;

    public void add() {
        count++;
    }

    public int get() {
        return count;
    }
}

// thread-unsafe example
public class ThreadDemo13 {
    public static void main(String[] args) throws InterruptedException {
        Counter counter = new Counter();
        // Create two threads; each increments the counter 50,000 times
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 50000; i++) {
                counter.add();
            }
        });

        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 50000; i++) {
                counter.add();
            }
        });

        t1.start();
        t2.start();

        t1.join();
        t2.join();

        System.out.println(counter.get());
    }
}

In this code, two threads each increment the same variable 50,000 times. The expected result is 100,000, but the actual result is a seemingly random value, different on every run. (The actual result does not match the expected result, which is a bug; a bug caused by multi-threading means the code is not thread safe.) The results of running it look like this:



All three runs produced different results. Why does this happen? It is closely related to the randomness of thread scheduling.

The count++ operation is essentially composed of three CPU instructions:
1. load: read the value from memory into a CPU register.
2. add: add 1 to the value in the register.
3. save: write the value in the register back to memory.

Since the scheduling of multiple threads is unpredictable, during actual execution the instructions of the two threads' ++ operations can be interleaved in many different orders: the two threads may, for example, both load, then both add, then both save at the same time, among many other permutations and combinations.
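For example, here is one interleaving in which an increment is lost (an illustrative trace, assuming count starts at 0):

t1: load  -> register1 = 0
t2: load  -> register2 = 0
t1: add   -> register1 = 1
t1: save  -> count = 1
t2: add   -> register2 = 1
t2: save  -> count = 1   (two ++ operations, but count only grew by 1)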



From these interleavings we can see that the scheduling order of the two threads is unpredictable. While the program is running, we have no way of knowing what the two threads went through during their increments: how many iterations ran "one after the other" and how many were "interleaved". The results therefore vary from run to run.
The result of this code is always <= 100,000.

Summary: in the end, thread safety problems all come down to the out-of-order scheduling of threads, which makes the execution order uncertain and the result unstable. In other words, the system schedules threads out of order/randomly (preemptive execution).

2. Multiple threads modifying the same variable

A review from the last blog post:
One thread modifying one variable -> safe
Multiple threads reading the same variable -> safe
Multiple threads modifying different variables -> safe

3. Modification operations are not atomic

Atomic means the smallest unit that cannot be divided further. The ++ operation above is not atomic: it can be split into three operations, load, add, and save. An operation that corresponds to a single CPU instruction is atomic; if it corresponds to multiple CPU instructions, it is most likely not atomic.
(Because ++ is not atomic, there are many more ways in which the instructions of the two threads can interleave.)
A plain = assignment, by contrast, is an atomic operation.
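As a rough illustration of the difference:

count = 10; // a plain assignment corresponds to roughly one store instruction, so it is atomic
count++;    // expands into load count, add 1, save count -> three steps, not atomic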

4. Memory visibility

Let's first write a piece of buggy code (actual behavior != expected behavior):

import java.util.Scanner;

public class ThreadDemo14 {
    public static int flag = 0;

    public static void main(String[] args) {
        Thread t1 = new Thread(() -> {
            while (flag == 0) {
                // empty loop body
            }
            System.out.println("Loop end! t1 end!");
        });

        Thread t2 = new Thread(() -> {
            Scanner scanner = new Scanner(System.in);
            System.out.println("Please enter an integer:");
            flag = scanner.nextInt();
        });

        t1.start();
        t2.start();
    }
}

The expected behavior of this code is:
t1 loops on the condition flag == 0, so initially it enters the loop; t2 reads an integer from the console. Once the user enters a non-zero value, t1's loop should end immediately and the t1 thread should exit.
But the actual behavior is:
after entering a non-zero value, the t1 thread does not exit and the loop does not end. Through jconsole you can see that t1 is still executing and is still in the RUNNABLE state.
This is an unsafety problem caused by the lack of memory visibility. The analysis is as follows.
Look at the loop condition flag == 0:

Each check performs two operations:
1. load: read flag from memory into a CPU register.
2. cmp: compare the value in the register with 0.
Note that of these two operations, load is far more expensive than cmp: reading memory is much faster (thousands of times) than reading the hard disk, but reading a register is in turn much faster (thousands of times) than reading memory.
The compiler notices that the load is expensive and that its result appears to be the same every time (flag is never modified anywhere in this code), so it performs a bold optimization: only the first load is actually executed; subsequent iterations only do the cmp, reusing the value already in the register.
To make the compiler suspend this optimization in this scenario, mark the variable with volatile. The compiler then refrains from the optimization above, which guarantees that the value is re-read from memory every time. That is:

import java.util.Scanner;

public class ThreadDemo14 {
    volatile public static int flag = 0; // volatile suppresses the compiler optimization described above

    public static void main(String[] args) {
        Thread t1 = new Thread(() -> {
            while (flag == 0) {
                // empty loop body
            }
            System.out.println("Loop end! t1 end!");
        });

        Thread t2 = new Thread(() -> {
            Scanner scanner = new Scanner(System.in);
            System.out.println("Please enter an integer:");
            flag = scanner.nextInt();
        });

        t1.start();
        t2.start();
    }
}

After adding the volatile keyword, the compiler guarantees that the value of flag is re-read from memory on every iteration. Now, when t2 modifies flag, t1 notices immediately and exits correctly.
Note:

volatile does not guarantee atomicity. The typical use case for volatile is one thread writing while another thread reads.
For multiple threads writing, synchronized is what is needed.
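For example, a hypothetical variant of the earlier Counter shows why volatile alone is not enough:

class VolatileCounter { // hypothetical name, for illustration only
    volatile private int count; // volatile guarantees visibility, not atomicity

    public void add() {
        count++; // still load/add/save: two writer threads can still lose updates
    }

    public int get() {
        return count;
    }
}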

5. Instruction reordering

volatile also has the effect of prohibiting instruction reordering. Instruction reordering is another compiler optimization strategy: the order in which instructions execute is adjusted to make the program more efficient, under the premise that the overall logic stays the same.

Consider an object creation such as s = new Student(), which roughly breaks down into three steps: 1. allocate memory for the object, 2. run the constructor to initialize it, 3. assign the reference to s. In a single-threaded environment, reordering is allowed here: step 1 must come first, but it does not matter whether 2 or 3 runs first.
Under multithreading, however:
suppose t1 executes in the order 1 3 2, and has finished 1 and 3 but not yet 2.
If t2 now starts executing, then because t1 has already done step 3, the reference is no longer null, and t2 tries to call s.learn(). But t1 has not yet executed step 2 (the object is not initialized), so what learn() operates on is undefined, and a bug is likely to occur.
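A minimal sketch of this scenario (the Student class and the class name ReorderDemo are made up for illustration; in practice the problem rarely manifests, which is exactly what makes it dangerous):

class Student {
    public void learn() { }
}

public class ReorderDemo {
    private static Student s = null;

    public static void main(String[] args) {
        Thread t1 = new Thread(() -> {
            // "s = new Student()" roughly means: 1. allocate memory, 2. run the constructor, 3. assign the reference;
            // the compiler/CPU may reorder steps 2 and 3
            s = new Student();
        });

        Thread t2 = new Thread(() -> {
            if (s != null) {
                s.learn(); // if t1 ran 1 then 3 but not yet 2, s is non-null but not initialized
            }
        });

        t1.start();
        t2.start();
    }
}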

6. Solving thread safety problems

The original intent of multithreading is concurrent programming: making better use of multi-core CPUs. To solve thread unsafety we have to start from the causes, which here means finding a way to make count++ atomic. That is what a lock is for.
A lock has two core operations:
(1) lock (acquire)
(2) unlock (release)
Once a thread has acquired the lock, any other thread that also wants the lock cannot simply proceed; it must block and wait until the thread holding the lock releases it.

Thread scheduling is preemptive, so when the thread holding the lock releases it, which of the waiting threads manages to acquire it next is undetermined.

1. Lock

In Java, we use the keyword synchronized to achieve the locking effect.

class Counter {
    private int count;

    // locking method one
    /*public void add() {
        synchronized (this) { // this refers to the counter object
            count++;
        }
    }*/

    // locking method two
    synchronized public void add() {
        count++;
    }

    public int get() {
        return count;
    }
}

Here, two ways of implementing the locking operation are shown.
A lock has two core operations, locking and unlocking:
(1) entering a synchronized-modified code block triggers locking;
(2) leaving the synchronized code block triggers unlocking.
Now look at this code again:


1. If two threads lock the same object, "lock contention" occurs (one thread gets the lock first, the other blocks and waits).
2. If two threads lock different objects, there is no lock contention, and each acquires its own lock.
3. The lock object inside the parentheses can be any Object (primitive/built-in types are not allowed). The this in our code refers to the counter object created by Counter counter = new Counter().
The key question is therefore whether multiple threads are competing for the same lock object.

// Create two threads; each increments the counter 50,000 times
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 50000; i++) {
                counter.add();
            }
        });

        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 50000; i++) {
                counter.add();
            }
        });

In the code above, the two threads actually compete for the same lock object (counter), so lock contention occurs (while t1 holds the lock, t2 has to block). This guarantees that the ++ operation is effectively atomic and is not interfered with.

2. The difference from the join operation

Locking is not the same thing as the join operation. join makes the two threads completely serial, whereas locking serializes only a small portion of the two threads' work; most of it still runs concurrently.

Thread t1 = new Thread(() -> {
            for (int i = 0; i < 50000; i++) {
                counter.add();
            }
        });

        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 50000; i++) {
                counter.add();
            }
        });

Looking at the code above again, each thread's work is roughly:
1. Create i
2. Check whether i < 50000
3. Call add
4. count++
5. Return from add
6. i++
Of these steps, only step 4 is serialized between the two threads; steps 1, 2, 3, 5 and 6 still run concurrently. So locking lets the code run faster and make better use of multi-core CPUs while still guaranteeing thread safety.

In fact, locking may always introduce blocking, and blocking inevitably affects the program's efficiency. With the lock the code is slower than without it, but it is still faster than fully serial execution, and at the same time the result is correct, unlike the unlocked version.

synchronized public void add() {
        count++;
    }

This method is modified with synchronized, which is equivalent to using this as the lock object.

synchronized public static void test() {
}

When synchronized modifies a static method, the lock object is not this but the class object (for example Counter.class).
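Roughly speaking (assuming the method lives in the Counter class), the static synchronized method above behaves like:

public static void test() {
    synchronized (Counter.class) { // the lock object is the class object, not this
    }
}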

private Object locker = new Object();

    public void add() {
        synchronized (locker) {
            count++;
        }
    }


The lock object here can be anything, as long as it is an instance of Object (primitive/built-in types are not allowed).

3. Summary

Keep in mind!

1. If multiple threads try to lock the same lock object, lock contention occurs; locking different objects produces no lock contention.
2. The whole point of using a lock is to guarantee thread safety. With different lock objects there is no contention, and the atomicity discussed above can no longer be guaranteed.
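As a hypothetical counter-example for point 2, if every call locks a freshly created object, the threads never contend and the ++ is no longer protected:

class CounterWrongLock { // hypothetical name, for illustration only
    private int count;

    public void add() {
        Object locker = new Object(); // a new lock object on every call: each thread locks a different object
        synchronized (locker) {
            count++; // no real mutual exclusion, so the final count can still be wrong
        }
    }

    public int get() {
        return count;
    }
}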

There is still a lot to learn about multithreading, and this series will continue to be updated. Finally, our Flower Princess wraps up the post.