Principles of synchronized optimization

Article directory

    • 1. Underlying principles
    • 2. Optimization plan one: lightweight lock
      • (1) Lightweight lock workflow
      • (2) Lock inflation
    • 3. Optimization plan two: spin optimization
    • 4. Optimization plan three: biased locking
      • (1) Biased state
      • (2) Batch rebias
      • (3) Batch rebias revocation
    • 5. Optimization plan four: lock elimination

1. Underlying principles

First, we need to understand the underlying principle of synchronized as a heavyweight lock. synchronized is an object lock: what it locks is either an instance of a class or the class object itself. In the JVM, every object can be associated with a monitor through its object header, and acquiring the lock is equivalent to acquiring that object's monitor; each object has exactly one monitor. A thread that fails to acquire the monitor is blocked by the operating system, and blocked threads are placed in the monitor's EntryList. Because the monitor is implemented on top of operating-system primitives, using it is expensive (blocking and waking threads requires kernel-level context switches), which is why synchronized needs to be optimized.
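As a minimal illustration of the monitor's mutual exclusion described above (a sketch; the class and field names are made up for this example), a `synchronized` block compiles to `monitorenter`/`monitorexit` bytecodes on the lock object's monitor:

```java
// A minimal sketch of monitor-based mutual exclusion. Class and field names
// (MonitorDemo, count) are made up for this illustration.
public class MonitorDemo {
    static final Object lock = new Object(); // every object owns exactly one monitor
    static int count = 0;

    static void increment() {
        synchronized (lock) { // bytecode: monitorenter -- acquire the lock object's monitor
            count++;
        }                     // bytecode: monitorexit -- release it; EntryList waiters may wake
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) increment(); };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(count); // the monitor serialized all increments: prints 200000
    }
}
```

Without the `synchronized` block, the two threads' unsynchronized `count++` operations could interleave and lose updates.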

2. Optimization plan one: lightweight lock

(1) Lightweight lock workflow

Usage scenario: if an object is accessed by multiple threads but the accesses are staggered in time (i.e., there is no actual contention; if contention does occur, the lightweight lock is upgraded to a heavyweight lock), a lightweight lock can be used as an optimization. Lightweight locks are transparent to the user: the syntax is still synchronized, as in the following example:

static final Object obj = new Object();

public static void method1() {
    synchronized (obj) {
        // Synchronized block A
        method2();
    }
}

public static void method2() {
    synchronized (obj) {
        // Synchronized block B
    }
}
  1. Create a lock record (Lock Record) object. Each stack frame of a thread contains lock record structures, which can store a copy of the lock object's Mark Word.
  2. Let the Object Reference in the lock record point to the lock object, then attempt to swap (CAS) the lock record address into the Object's Mark Word. If the last two bits of the Mark Word are 01 (unlocked), the swap can succeed, and the original Mark Word value is stored in the lock record.

CAS (Compare-And-Swap) is an optimistic, lock-free mechanism: an atomic operation commonly used to implement synchronization and thread safety in multi-threaded environments. A CAS operation compares the value in memory with an expected value; only if they are equal is the new value written, otherwise nothing happens.


3. If the swap succeeds, the lock record address and status bits 00 are stored in the object header, indicating that this thread has locked the object.


4. If the swap fails (CAS fails), there are two cases:

  • If another thread already holds the lightweight lock on this Object, there is contention, and the lock inflation process begins.
  • If this thread itself is re-entering the synchronized lock, another Lock Record is added as a reentrancy count (its stored Mark Word value is null).

5. When exiting a synchronized block (unlocking), if the thread finds a lock record whose value is null, this was a reentry: the lock record is simply removed, which decrements the reentrancy count by 1.


6. When exiting the synchronized block (unlocking) and the lock record's value is not null, CAS is used to restore the saved Mark Word value to the object header.

  • Success: unlocked successfully.
  • Failure: the lightweight lock has been inflated into a heavyweight lock, so the heavyweight unlocking process is entered.
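The CAS swap used throughout this workflow can be illustrated with the JDK's own atomic classes (a simplified analogy; the real lock-record CAS happens inside the JVM, not in Java code):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified analogy of the CAS step using the JDK's atomic classes; the real
// lock-record CAS happens inside the JVM, not in Java code.
public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger markWord = new AtomicInteger(0b01); // pretend 01 means "unlocked"
        // Expected value matches the memory value -> the swap succeeds.
        System.out.println(markWord.compareAndSet(0b01, 0b00)); // true
        // Expected value no longer matches (it is now 00) -> the swap fails.
        System.out.println(markWord.compareAndSet(0b01, 0b11)); // false
    }
}
```

The second call fails for exactly the reason step 4 describes: the value in memory no longer matches what the caller expected.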

(2) Lock inflation

If the CAS operation fails while attempting a lightweight lock, one possibility is that another thread has already placed a lightweight lock on the object (there is contention). In that case lock inflation must be performed: the lightweight lock is upgraded to a heavyweight lock.

static Object obj = new Object();

public static void method1() {
    synchronized (obj) {
        // critical section
    }
}
  1. When Thread-1 attempts lightweight locking, Thread-0 has already placed a lightweight lock on the object.

  2. Thread-1's lightweight locking therefore fails, and it enters the lock inflation process:

  • It requests a Monitor for the Object and makes the Object's Mark Word point to the heavyweight lock (Monitor) address.
  • It then enters the Monitor's EntryList and blocks itself.

  3. When Thread-0 exits the synchronized block (still believing it holds a lightweight lock), its CAS attempt to restore the Mark Word to the object header fails. It then follows the heavyweight unlocking process instead: find the Monitor object via the Monitor address in the header, set the Owner to null, and wake the blocked threads in the EntryList.
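The "enter the EntryList and block" part of inflation is observable from Java: a thread waiting on a contended monitor reports the BLOCKED state. A small sketch (names and sleep durations are made up; the sleeps only make the interleaving deterministic enough to observe):

```java
// Sketch: when a monitor's Owner is taken, a competing thread parks in the
// EntryList and its state becomes BLOCKED. Names are made up for this demo;
// the sleeps are only there to make the interleaving deterministic enough.
public class BlockedDemo {
    static final Object obj = new Object();

    public static void main(String[] args) throws InterruptedException {
        Thread t0 = new Thread(() -> {
            synchronized (obj) { // Thread-0 holds the lock for a while
                try { Thread.sleep(500); } catch (InterruptedException ignored) { }
            }
        });
        t0.start();
        Thread.sleep(100); // give t0 time to acquire the lock

        Thread t1 = new Thread(() -> { synchronized (obj) { } });
        t1.start();
        Thread.sleep(100); // t1 is now waiting for the monitor

        System.out.println(t1.getState()); // BLOCKED: parked in the EntryList
        t0.join();
        t1.join();
    }
}
```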

3. Optimization plan two: spin optimization

When heavyweight lock contention occurs, spinning can also be used as an optimization. If the current thread's spin succeeds (that is, the thread holding the lock exits the synchronized block and releases it while we are spinning), the current thread avoids blocking. As we saw earlier, when a thread requests a Monitor and finds that the Owner is not empty, it enters the EntryList and blocks. The purpose of spin optimization is this: instead of immediately entering the EntryList, the thread busy-waits in place for a bounded number of iterations. If the Monitor is released during the spin, the spinning thread can acquire it directly, avoiding the block and therefore avoiding a context switch.
(Figure: a successful spin retry)

(Figure: a failed spin retry)

  • Since Java 6, spinning is adaptive: if a previous spin on this object succeeded, the JVM assumes the next spin is likely to succeed as well and spins more times; otherwise it spins fewer times or not at all.
  • Spinning consumes CPU time. On a single-core CPU spinning is pure waste; only on a multi-core CPU can it pay off.
  • Since Java 7, whether spinning is used can no longer be controlled by the user; the JVM decides on its own.
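The spin-then-block trade-off described above can be sketched at the user level. This is NOT how HotSpot implements adaptive spinning internally; it is a toy lock (names invented for this example) that spins a bounded number of times before giving up the CPU:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy spin-then-yield lock illustrating the idea of spinning a bounded number
// of times before giving up the CPU. This is NOT how HotSpot implements
// adaptive spinning; it is a user-level sketch of the same trade-off.
class SpinThenYieldLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        for (int i = 0; i < 1_000; i++) {                // bounded spin phase
            if (held.compareAndSet(false, true)) return; // acquired while spinning
            Thread.onSpinWait();                          // CPU hint (Java 9+)
        }
        while (!held.compareAndSet(false, true)) {        // spin budget exhausted:
            Thread.yield();                               // a real monitor would park here
        }
    }

    void unlock() {
        held.set(false);
    }
}

public class SpinDemo {
    static final SpinThenYieldLock LOCK = new SpinThenYieldLock();
    static int count = 0;

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                LOCK.lock();
                try { count++; } finally { LOCK.unlock(); }
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(count); // 200000: the lock protected every increment
    }
}
```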

4. Optimization plan three: biased locking

Even when a lightweight lock has no contention, every reentry still has to perform a CAS operation. Java 6 introduced biased locking as a further optimization: a CAS is used only once, to write the thread ID into the object's Mark Word. If the thread later finds that the thread ID is its own, there is no contention and no further CAS is needed. As long as no contention ever occurs, the object remains biased toward (owned by) that thread.

(1) Biased state

Recall the object header format:

When an object is created:

  • If biased locking is enabled (the default), then after the object is created its Mark Word is 0x05, i.e., the last three bits are 101, and its thread ID, epoch, and age all default to 0.
  • Biased locking is delayed by default and does not take effect immediately at program startup. To avoid the delay, use the VM parameter -XX:BiasedLockingStartupDelay=0.
  • If biased locking is disabled, then after the object is created the Mark Word is 0x01, i.e., the last three bits are 001. Its hashcode and age are both 0; the hashcode is not assigned until it is first used.
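The flags above can be combined as follows (valid on JDKs that still support biased locking; it was deprecated in JDK 15 by JEP 374 and later removed). `Main` is a placeholder class name:

```shell
# run with biased locking on and the startup delay disabled
java -XX:+UseBiasedLocking -XX:BiasedLockingStartupDelay=0 Main

# run with biased locking off (objects start at 0x01 / last three bits 001)
java -XX:-UseBiasedLocking Main
```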

Note:

  1. Calling hashCode() disables an object's biased lock. After hashCode() is called there is no room left in the object header to store the thread ID, so the bias becomes invalid.
  2. When another thread uses a biased-lock object, the biased lock is upgraded to a lightweight lock.
  3. Calling wait/notify invalidates the biased lock (this mechanism exists only for heavyweight locks, so the biased lock is upgraded to a heavyweight lock).

(2) Batch rebias

If an object is accessed by multiple threads without contention (i.e., the accesses are staggered in time), an object biased toward thread t1 still has a chance to be rebiased toward thread t2; rebiasing resets the thread ID stored in the object. When the number of bias revocations for a class exceeds the threshold of 20, the JVM concludes that it biased these objects wrongly, so when they are locked again it rebiases them toward the locking thread (this is an optimization applied when biasing fails).

Case scenario:

1. Create a user class
2. Initialize a collection
3. Thread t0 adds 30 user objects to the collection in a loop, then locks each object in turn. The header of each user object now records a bias toward thread t0.
4. Thread t2 then locks the same 30 objects again. For the first 19 objects the bias is revoked and lightweight locks are used; at the 20th revocation the JVM performs a batch rebias, so the remaining objects of the user class are rebiased toward t2.
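The scenario above can be sketched as follows. Run it on a biased-locking JDK (e.g., 8-14) with `-XX:BiasedLockingStartupDelay=0` to observe the header changes with an object-layout tool; the class name `User` and the thread names are taken from the walkthrough:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the batch-rebias scenario. "User" is the placeholder class from
// the walkthrough; on a biased-locking JDK with the startup delay disabled,
// the described revocations and batch rebias happen inside the JVM.
public class BatchRebiasDemo {
    static class User { }

    public static void main(String[] args) throws InterruptedException {
        List<User> users = new ArrayList<>();

        Thread t0 = new Thread(() -> {
            for (int i = 0; i < 30; i++) {
                User u = new User();
                users.add(u);
                synchronized (u) { } // each object becomes biased toward t0
            }
        });
        t0.start();
        t0.join();

        Thread t2 = new Thread(() -> {
            for (User u : users) {
                synchronized (u) { } // early revocations fall back to lightweight
            }                        // locks; at the 20th, User is batch-rebiased to t2
        });
        t2.start();
        t2.join();

        System.out.println(users.size()); // 30 objects were locked by both threads
    }
}
```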

(3) Batch rebias revocation

When bias revocations for a class exceed 40 (indicating that many threads access its objects), the JVM decides that the bias was genuinely wrong and that the class should not be biased at all. The entire class then becomes non-biasable, and newly created instances are non-biasable as well.

Case scenario:

1. Create a user class
2. Initialize a collection
3. Thread t0 adds 40 user objects to the collection in a loop, then locks each object in turn. The header of each user object now records a bias toward thread t0.
4. Thread t2 then locks the 40 objects again. For the first 19 objects the bias is revoked and lightweight locks are used; at the 20th revocation the JVM performs a batch rebias, so the remaining objects are rebiased toward t2.
5. Thread t3 then locks these 40 objects once more. The first 19 objects already had their bias revoked by t2, and the objects from the 20th onward are biased toward t2, so t3 must revoke the bias again for each of them. These revocations keep accumulating, and when the count reaches the threshold of 40, a batch revocation is triggered: from then on, all objects of the user class are created non-biasable.

5. Optimization plan four: lock elimination

First we use JMH to conduct a benchmark test on the following code:

  1. Create a Maven project, import the relevant JMH jar packages, and write the following code:

import org.openjdk.jmh.annotations.*;

import java.util.concurrent.TimeUnit;

// number of separate JVM forks for the whole run
@Fork(1)
// measure the average time per operation
@BenchmarkMode(Mode.AverageTime)
// number of warmup iterations
@Warmup(iterations = 3)
// number of measurement iterations
@Measurement(iterations = 5)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class jvmtestMain {
    static int x = 0;

    @Benchmark
    public void a() throws Exception {
        x++;
    }

    @Benchmark
    public void b() throws Exception {
        Object o = new Object();
        synchronized (o) { // lock on a method-local object
            x++;
        }
    }
}

  2. Package and run:

 java -jar benchmarks.jar

The final result shows that the performance (score) of the locked method b and the unlocked method a are almost identical, even though locking should, in theory, noticeably hurt performance. This is because of the JIT compiler, which further optimizes our bytecode: through escape analysis, JIT discovers that the local variable o never escapes the scope of method b, so it is thread-private and cannot cause concurrency-safety issues. JIT therefore removes the lock on o. This optimization behavior is called lock elimination. It can be turned off with the VM option -XX:-EliminateLocks.
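A classic real-world lock-elimination candidate is StringBuffer, whose methods are synchronized. When the buffer is a local variable that never escapes the method, escape analysis lets the JIT drop the locking entirely (a sketch; the class and method names are made up):

```java
// A classic lock-elimination candidate: StringBuffer's methods are
// synchronized, but sb never escapes concat(), so escape analysis lets the
// JIT elide the locking. Names are made up for this sketch.
public class LockElisionDemo {
    static String concat(String a, String b) {
        StringBuffer sb = new StringBuffer(); // local object, never escapes
        sb.append(a);                         // synchronized, but lock can be elided
        sb.append(b);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concat("foo", "bar")); // foobar
    }
}
```

The result is the same with or without the elision; only the synchronization overhead disappears, which is exactly what the benchmark above measured.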