Java synchronized keyword

Bytecode level

For a synchronized method, the ACC_SYNCHRONIZED flag (0x0020) is set in the method's access flags.
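The flag can be observed from Java code through reflection. The following is a minimal sketch, not part of the original article: the class SyncFlagDemo and its methods are hypothetical names chosen for illustration. It relies on java.lang.reflect.Modifier.SYNCHRONIZED, whose value is 0x0020.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical demo class: compares a synchronized method with a synchronized block.
class SyncFlagDemo {
    // The flag is set on this method.
    synchronized void syncMethod() { }

    // No method flag here; the block compiles to monitorenter/monitorexit instead.
    void syncBlock() {
        synchronized (this) { }
    }

    // Returns true if the named no-arg method carries the synchronized flag.
    static boolean hasSyncFlag(String methodName) {
        try {
            Method m = SyncFlagDemo.class.getDeclaredMethod(methodName);
            return (m.getModifiers() & Modifier.SYNCHRONIZED) != 0;
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag value: 0x" + Integer.toHexString(Modifier.SYNCHRONIZED));
        System.out.println("syncMethod: " + hasSyncFlag("syncMethod"));
        System.out.println("syncBlock:  " + hasSyncFlag("syncBlock"));
    }
}
```

Only the synchronized method reports the flag; the method containing a synchronized block does not, because the block is expressed in the bytecode itself.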

A synchronized block is compiled to the monitorenter and monitorexit instructions:

0 aload_0
1 dup
2 astore_1
3 monitorenter
4 aload_1
5 monitorexit
6 goto 14 (+ 8)
9 astore_2
10 aload_1
11 monitorexit
12 aload_2
13 athrow
14 return

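Why are there two monitorexit instructions? The first releases the monitor on normal exit; the second sits in an exception handler, so the monitor is also released when an exception is thrown inside the block.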

public class V9_Synchronized {
    void m(){
        synchronized (this){//monitorenter

        }//monitorexit
    }
}
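The effect of the second monitorexit can be checked from Java code. This is a small sketch under stated assumptions: MonitorExitDemo and its members are hypothetical names, and Thread.holdsLock is used only to observe whether the current thread still owns the monitor.

```java
// Hypothetical demo: the monitor is released even when the block exits by exception.
class MonitorExitDemo {
    private static final Object LOCK = new Object();

    // Records whether the monitor was held while inside the block.
    static boolean heldInside;

    static void failInsideLock() {
        synchronized (LOCK) {                 // monitorenter
            heldInside = Thread.holdsLock(LOCK);
            throw new IllegalStateException("boom");
        }                                     // monitorexit on the exception path
    }

    // Returns whether the current thread still holds the monitor after the exception.
    static boolean heldAfterException() {
        try {
            failInsideLock();
        } catch (IllegalStateException expected) {
            // The exception is expected; we only care about the lock state afterwards.
        }
        return Thread.holdsLock(LOCK);
    }

    public static void main(String[] args) {
        boolean after = heldAfterException();
        System.out.println("held inside block:    " + heldInside);
        System.out.println("held after exception: " + after);
    }
}
```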

Operating system implementation

X86: lock cmpxchg/xxx

synchronized is therefore also implemented with the lock prefix, which raises a question: since both synchronized and volatile rest on lock at the lowest level, why is synchronized said not to guarantee ordering?

In fact, synchronized does guarantee ordering, but only at the point where threads compete for the object lock. Once execution is inside the synchronized region, the instructions within it may still be reordered relative to one another.
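A common illustration of this point is double-checked locking, where the field must also be volatile because the writes inside the synchronized block could otherwise become visible out of order to a thread that reads the field without taking the lock. The sketch below is an illustrative example, not taken from the original article; LazySingleton is a hypothetical class.

```java
// Hypothetical double-checked locking example.
class LazySingleton {
    // volatile prevents a reader outside the lock from seeing a reference
    // to an object whose constructor writes are not yet visible.
    private static volatile LazySingleton instance;

    private int value;

    private LazySingleton() {
        value = 42;
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;          // first check, no lock taken
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;                // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;
                }
            }
        }
        return local;
    }

    int value() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance());
        System.out.println(getInstance().value());
    }
}
```

The lock alone orders threads that both enter the synchronized block; the volatile modifier covers the thread that takes the fast path and never acquires the lock.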

Lock upgrade process

Biased lock:

The ID of the thread that acquires the lock is recorded in the object header of the lock object. When the same thread acquires the lock again, it can take it directly.

Lightweight lock (spin lock, lock-free)

A lightweight lock is upgraded from a biased lock. When the first thread acquires the lock, it is a biased lock. If a second thread competes for it, the bias is revoked (a step that is itself expensive) and the lock is upgraded to a lightweight lock. It is called lightweight to distinguish it from the heavyweight lock. A lightweight lock is implemented by spinning and does not block the thread. If a thread spins too many times and still has not acquired the lock, the lock is upgraded to a heavyweight lock, which does block threads.

The process of spin lock competition:

Each thread has its own thread stack, and each thread creates an LR (Lock Record) in that stack. A thread contends for the lock by using CAS to install a pointer to its LR in the mark word of the lock object. Once one thread wins, the other threads keep retrying the CAS in a spin loop.

With a spin lock, a thread is not blocked while it tries to acquire the lock, so there is nothing to wake up either. Blocking and waking a thread both have to go through the operating system and are comparatively slow. A spinning thread instead uses CAS to set an expected mark: if the CAS fails it tries again in a loop, and if it succeeds the thread has acquired the lock. The thread stays running throughout, so relatively few operating system resources are used, which is why the lock is considered lightweight.
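The CAS-and-retry idea can be sketched in user code. The example below is an illustration only: it is not how HotSpot implements lightweight locks (there is no Lock Record or mark word here), and SpinLockDemo and SpinLock are hypothetical names. It uses AtomicReference.compareAndSet to claim ownership and retries until the CAS succeeds.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical user-level spin lock; not the JVM's internal implementation.
class SpinLockDemo {
    static final class SpinLock {
        // null means unlocked; otherwise holds the owning thread.
        private final AtomicReference<Thread> owner = new AtomicReference<>();

        void lock() {
            Thread me = Thread.currentThread();
            // Retry the CAS until this thread becomes the owner; never parks.
            while (!owner.compareAndSet(null, me)) {
                Thread.yield(); // scheduling hint only; the thread stays runnable
            }
        }

        void unlock() {
            // Only the owner can release.
            owner.compareAndSet(Thread.currentThread(), null);
        }
    }

    // Increments a shared counter from several threads under the spin lock.
    static int runCounter(int threads, int perThread) {
        SpinLock lock = new SpinLock();
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(runCounter(4, 10_000));
    }
}
```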

Heavyweight lock

The lock is upgraded when contention intensifies: a thread has spun more than 10 times (tunable with -XX:PreBlockSpin), or the number of spinning threads exceeds half the number of CPU cores. Adaptive spinning (Adaptive Self Spinning) was added in JDK 1.6, after which the JVM controls the spin count itself.

Upgrading to a heavyweight lock means requesting resources from the operating system (a mutex on Linux). The thread is suspended, enters a wait queue, waits to be scheduled by the operating system, and is then mapped back to user space.

identity hash code

After an object is created, if its identity hash code is requested, 31 bits of the mark word are used to record the hash code. If the lock is then upgraded to a lightweight lock, the mark word records a pointer to the LR in the thread stack instead, so where does the identity hash code go?

It is kept in the owning thread's stack. The LR in the thread stack points to a space called the Displaced Mark Word, which holds a backup copy of the mark word as it was before the lock upgrade.

What about biased locks?

Once an object has computed its identity hash code, it can no longer enter the biased state. If an object is currently biased and its identity hash code is requested, the bias is revoked and the lock is inflated into a heavyweight lock. So when does an object compute its identity hash code? When you call Object.hashCode() on a class that does not override it, or System.identityHashCode(Object o).

Implementation of heavyweight lock

The ObjectMonitor class has a field that records the mark word of the unlocked state, and the identity hash code can be stored there. Put simply, a heavyweight lock has room to store the identity hash code.
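Whatever the lock state, the value seen from Java must not change. The sketch below checks only that observable behavior; it does not inspect the mark word, and IdentityHashDemo is a hypothetical name.

```java
// Hypothetical check: the identity hash code survives locking and unlocking.
class IdentityHashDemo {
    static boolean hashStableAcrossLocking() {
        Object o = new Object();
        int before = System.identityHashCode(o);   // hash computed before locking
        int inside;
        synchronized (o) {
            inside = System.identityHashCode(o);   // read while the lock is held
        }
        int after = System.identityHashCode(o);    // read after release
        // Object does not override hashCode(), so both calls agree.
        return before == inside && inside == after && before == o.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hashStableAcrossLocking());
    }
}
```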

Under what circumstances will a spin lock be upgraded to a heavyweight lock?

Why do we need heavyweight locks when we have spin locks?

Spin consumes CPU resources. If the lock time is long or there are many spin threads, the CPU will be consumed heavily.

After the upgrade to a heavyweight lock, threads that fail to get the lock are placed in a lock pool to wait, where they consume no CPU. When the owning thread releases the lock, the threads in the lock pool compete for it. The thread that wins enters the ready queue and waits to be given CPU time. Under heavy contention, a heavyweight lock is therefore a better fit than a spin lock.
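The waiting described here is visible from Java as the BLOCKED thread state. The following sketch is illustrative and uses hypothetical names (BlockedStateDemo, observeContender); it holds a monitor, starts a second thread that contends for it, and polls that thread's state with a five-second timeout chosen arbitrarily.

```java
// Hypothetical demo: a thread waiting for a contended monitor reports BLOCKED.
class BlockedStateDemo {
    static Thread.State observeContender() {
        final Object lock = new Object();
        Thread contender = new Thread(() -> {
            synchronized (lock) {
                // Entered only after the main thread releases the monitor.
            }
        });

        Thread.State seen;
        synchronized (lock) {
            contender.start();
            long deadline = System.nanoTime() + 5_000_000_000L; // 5 s safety timeout
            do {
                Thread.yield();
                seen = contender.getState();
            } while (seen != Thread.State.BLOCKED && System.nanoTime() < deadline);
        }

        try {
            contender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observeContender());
    }
}
```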

Is biased locking necessarily more efficient than spin?

Not necessarily. When it is clear that there will be multi-thread competition, the biased lock will definitely involve lock revocation. At this time, it is not as efficient as using the spin lock directly.

For example, the JVM startup process is known to involve multi-thread contention, so biased locking is not enabled immediately at startup and is turned on after a delay. The default delay is 4 seconds: -XX:BiasedLockingStartupDelay=4000 (the value is in milliseconds).

If an object is created before biased locking has been activated, the low bits of its mark word are 001, which indicates the unlocked state.

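import org.openjdk.jol.info.ClassLayout;

public class V24_TestJol {
    public static void main(String[] args) {
        Object o = new Object();
        // Print the object layout, including the mark word, using JOL
        System.out.println(ClassLayout.parseInstance(o).toPrintable());
    }
}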
java.lang.Object object internals:
 OFFSET SIZE TYPE DESCRIPTION VALUE
      0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
      4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8 4 (object header) e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

Once biased locking has been activated, the low bits of a new object's mark word are 101, which is the biased state. This raises a question: why is a newly created object already biased?

A biased lock normally records a pointer to the owning thread, but here no thread pointer is recorded and those bits are all 0. This state is called an anonymous biased lock.

import org.openjdk.jol.info.ClassLayout;

public class V24_TestJol {
    public static void main(String[] args) throws InterruptedException {
        // Wait past the biased locking startup delay
        Thread.sleep(5000);
        Object o = new Object();
        System.out.println(ClassLayout.parseInstance(o).toPrintable());
    }
}
java.lang.Object object internals:
 OFFSET SIZE TYPE DESCRIPTION VALUE
      0 4 (object header) 05 00 00 00 (00000101 00000000 00000000 00000000) (5)
      4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
      8 4 (object header) e5 01 00 f8 (11100101 00000001 00000000 11111000) (-134217243)
     12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

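An anonymous biased lock becomes a biased lock once a thread locks the object:

import org.openjdk.jol.info.ClassLayout;

public class V24_TestJol {
    public static void main(String[] args) throws InterruptedException {
        // Wait past the biased locking startup delay
        Thread.sleep(5000);
        Object o = new Object();
        // Anonymous biased: no thread recorded yet
        System.out.println(ClassLayout.parseInstance(o).toPrintable());
        synchronized (o){
            // Biased toward the current thread
            System.out.println(ClassLayout.parseInstance(o).toPrintable());
        }
    }
}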

CAS (compareAndSwap/compareAndSet) optimistic lock

Underlying implementation

At the lowest level, the CAS behind methods such as incrementAndGet comes down to a single assembly instruction (cmpxchg on x86). CAS is supported directly by the CPU at the instruction level.
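At the Java level, an increment built on CAS can be written as a read, compute, compare-and-set retry loop. The sketch below shows that pattern using AtomicInteger.compareAndSet; it is an illustration of the idea rather than the JDK's actual source, and CasIncrementDemo and its methods are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of an increment implemented as a CAS retry loop.
class CasIncrementDemo {
    // Retries until the CAS from the value it read to value + 1 succeeds.
    static int casIncrement(AtomicInteger counter) {
        while (true) {
            int current = counter.get();
            int next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
            // Another thread changed the value in between; read again and retry.
        }
    }

    // Runs casIncrement from several threads and returns the final value.
    static int runConcurrently(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    casIncrement(counter);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(casIncrement(new AtomicInteger(5)));
        System.out.println(runConcurrently(4, 10_000));
    }
}
```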

LOCK_IF_MP

But this instruction by itself is not atomic across processors: there is no guarantee when several CPUs operate on the same variable. HotSpot therefore places the LOCK_IF_MP macro in front of it. If the machine is a multi-core processor, the lock prefix is added before the instruction. On a single-core processor the prefix is unnecessary because there is no cache inconsistency: all threads run on one CPU core and use the same cache, so local memory and main memory cannot disagree and no visibility problem arises. On a multi-core processor, however, shared data has to be flushed from the write buffer to main memory, and the cache coherence protocol is used to notify other processors to update their caches.

The role of lock

While cmpxchg executes, the memory address [edx] is locked and other processors cannot access it, which guarantees atomicity.
All pending writes on this processor are forced back to main memory. This acts as a write barrier and keeps each thread's local memory consistent with main memory.
cmpxchg may not be reordered with any instruction before or after it, which prevents instruction reordering.

Disadvantages of CAS

ABA problem

The ABA problem arises when, during a CAS operation, other threads change the variable from A to B and then back to A. When this thread compares the expected value A with the current value, it finds that the variable is still A and the CAS succeeds. In fact the value has been changed by other threads in the meantime, which is inconsistent with the idea behind optimistic locking.

The solution is to attach a version number to the variable and increment it on every update, so that A-B-A becomes A1-B2-A3. Whenever any thread modifies the variable, the version number changes too, which solves the ABA problem. The JDK's java.util.concurrent.atomic package provides AtomicStampedReference for this purpose. Its core method is compareAndSet, implemented as follows:

public boolean compareAndSet(V expectedReference,
                             V newReference,
                             int expectedStamp,
                             int newStamp) {
    Pair<V> current = pair;
    return
        expectedReference == current.reference &&
        expectedStamp == current.stamp &&
        ((newReference == current.reference &&
          newStamp == current.stamp) ||
         casPair(current, Pair.of(newReference, newStamp)));
}

This method checks whether the current reference and the current stamp both equal the expected ones. If they do, the reference and the stamp are atomically set to the new values, so the comparison in the CAS operation no longer depends on the value of the variable alone.
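The difference is easy to see in a small single-threaded simulation. The sketch below is illustrative: AbaDemo and its methods are hypothetical names, and the A to B to A change that another thread would make is simulated inline. String literals are used so that the reference comparison performed by compareAndSet behaves predictably.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical demo: a plain CAS misses an A -> B -> A change, a stamped CAS does not.
class AbaDemo {
    // Returns true: the plain CAS cannot tell that the value changed in between.
    static boolean plainCasAfterAba() {
        AtomicReference<String> ref = new AtomicReference<>("A");
        String seen = ref.get();              // this "thread" reads A
        ref.compareAndSet("A", "B");          // simulated other thread: A -> B
        ref.compareAndSet("B", "A");          // simulated other thread: B -> A
        return ref.compareAndSet(seen, "C");  // succeeds despite the intermediate change
    }

    // Returns false: the stamp moved from 1 to 3, so the stale stamp is rejected.
    static boolean stampedCasAfterAba() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 1);
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);   // reads value A and stamp 1
        int seenStamp = stampHolder[0];
        ref.compareAndSet("A", "B", 1, 2);    // simulated other thread: A1 -> B2
        ref.compareAndSet("B", "A", 2, 3);    // simulated other thread: B2 -> A3
        return ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);
    }

    public static void main(String[] args) {
        System.out.println("plain CAS succeeded:   " + plainCasAfterAba());
        System.out.println("stamped CAS succeeded: " + stampedCasAfterAba());
    }
}
```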

CAS causes spin consumption

When multiple threads compete for the same resource and the CAS keeps failing, the spinning threads occupy the CPU the whole time.

Solution: break out of the infinite for loop and return once a time limit or a retry limit is exceeded. JDK 8 also added LongAdder, which takes an approach similar to ConcurrentHashMap: under contention it reduces granularity by splitting the value across several cells, which lowers contention and cuts the time the CPU spends spinning idly.
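A retry limit can be sketched as follows. This is a minimal illustration under stated assumptions: BoundedCasDemo, tryIncrement, and the maxAttempts parameter are hypothetical names, and what the caller does when the attempt budget is exhausted (fall back to a lock, back off, report failure) is left open.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: a CAS loop that gives up after a fixed number of attempts.
class BoundedCasDemo {
    // Returns true if the increment succeeded within maxAttempts CAS attempts.
    static boolean tryIncrement(AtomicInteger counter, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int current = counter.get();
            if (counter.compareAndSet(current, current + 1)) {
                return true;
            }
        }
        return false; // budget exhausted; the caller decides what to do next
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        System.out.println(tryIncrement(counter, 3) + " -> " + counter.get());
        System.out.println(tryIncrement(counter, 0) + " -> " + counter.get());
    }
}
```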

What is the difference between LongAdder and AtomicLong?

With AtomicLong, all threads compete to update a single value. LongAdder splits the value across multiple cells in a Cell array, so threads compete over several update targets instead of one, and throughput under contention improves accordingly.
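Both classes are used in much the same way. The sketch below counts with each from several threads and compares the totals; it checks correctness only and makes no performance measurement. LongAdderDemo and countBoth are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical demo: AtomicLong and LongAdder reach the same total.
class LongAdderDemo {
    // Returns {AtomicLong total, LongAdder total} after concurrent increments.
    static long[] countBoth(int threads, int perThread) {
        AtomicLong atomic = new AtomicLong();
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    atomic.incrementAndGet(); // every thread updates one shared value
                    adder.increment();        // updates may land in different cells
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // sum() adds up the base value and all cells.
        return new long[] { atomic.get(), adder.sum() };
    }

    public static void main(String[] args) {
        long[] totals = countBoth(4, 10_000);
        System.out.println(totals[0] + " " + totals[1]);
    }
}
```

Note that LongAdder.sum() is computed by adding the cells together, so it suits counters and statistics better than cases that need an atomic read-modify-write of the exact current value.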
