Get rid of Random and get random numbers

Recently, while writing some business code, I needed to generate random numbers, and I naturally thought of the Random class in the JDK. In pursuit of better performance, I considered switching to ThreadLocalRandom instead. While reading the implementation of ThreadLocalRandom, I followed the code into Unsafe, learned a lot along the way, and cleared up a number of doubts by searching and asking questions. This article is a summary of what I found.

Random performance issues

When using the Random class, we usually avoid the overhead of repeated creation by storing one Random instance in a field or static field of the service object that uses it. This is fine when there is little contention between threads, but in a highly concurrent web service, sharing the same Random object can cause contention that hurts throughput.

Random works by applying fixed arithmetic and bit operations to a “random seed” to produce a result, which then becomes the next seed. To stay thread-safe, Random updates the seed with CAS (compare-and-set). If many threads use the same object at the same time, some of them will fail the CAS, possibly several times in a row, and have to retry in a loop. These threads are not blocked in the strict sense, but they burn CPU time spinning until their update succeeds.
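
The CAS retry loop described above can be sketched as follows. This is a minimal illustration rather than the JDK source: the class name CasSeedGenerator is made up for this article, while the multiplier, addend, and 48-bit mask are the constants documented for java.util.Random, so the sketch should produce the same sequence as Random for the same seed.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a CAS-based linear congruential generator, modeled on java.util.Random. */
final class CasSeedGenerator {
    private static final long MULTIPLIER = 0x5DEECE66DL;
    private static final long ADDEND = 0xBL;
    private static final long MASK = (1L << 48) - 1;

    private final AtomicLong seed;

    CasSeedGenerator(long initialSeed) {
        // Scramble the initial seed the same way java.util.Random documents it.
        this.seed = new AtomicLong((initialSeed ^ MULTIPLIER) & MASK);
    }

    int nextInt() {
        long oldSeed;
        long nextSeed;
        do {
            oldSeed = seed.get();
            nextSeed = (oldSeed * MULTIPLIER + ADDEND) & MASK;
            // If another thread changed the seed in the meantime, the CAS fails and we retry.
        } while (!seed.compareAndSet(oldSeed, nextSeed));
        // Keep the upper 32 of the 48 seed bits.
        return (int) (nextSeed >>> 16);
    }
}

final class CasSeedGeneratorDemo {
    public static void main(String[] args) {
        CasSeedGenerator sketch = new CasSeedGenerator(42L);
        Random reference = new Random(42L);
        for (int i = 0; i < 3; i++) {
            System.out.println(sketch.nextInt() + " / " + reference.nextInt());
        }
    }
}
```

Every thread that shares one generator competes in the same do/while loop, which is where the wasted work under high concurrency comes from.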

ThreadLocalRandom

The JDK developers naturally considered this problem and added the ThreadLocalRandom class to the java.util.concurrent package. The first time I saw the class name, I assumed it was implemented with ThreadLocal, and I immediately thought of the dreaded memory leak problem. But when I opened the source code, there was no trace of ThreadLocal; instead there was a lot of Unsafe-related code.

Specific use

The specific usage of ThreadLocalRandom is as follows:

  1. get instance

    ThreadLocalRandom random = ThreadLocalRandom.current(); // Get the ThreadLocalRandom instance of the current thread
    
  2. random integer

    // Get a random integer from 0 to 9
    int i1 = ThreadLocalRandom.current().nextInt(10);
    
    // Get a random integer from 10 to 19
    int i2 = ThreadLocalRandom.current().nextInt(10, 20);
    
  3. random long integer

    // Get a random long integer from 0 to 999999999
    long l1 = ThreadLocalRandom.current().nextLong(1000000000);
    
    // Get a random long integer from 1000000000 to 1999999999
    long l2 = ThreadLocalRandom.current().nextLong(1000000000, 2000000000);
    
  4. random boolean

    boolean b = ThreadLocalRandom.current().nextBoolean(); // return true or false randomly
    
  5. random floating point number

    // Get a random float in the range [0.0, 1.0)
    float f1 = ThreadLocalRandom.current().nextFloat();
    
    // Get a random float in the range [0.0, 10.0); the bounded nextFloat overloads require JDK 17 or later
    float f2 = ThreadLocalRandom.current().nextFloat(10.0f);
    
    // Get a random float in the range [10.0, 20.0); also requires JDK 17 or later
    float f3 = ThreadLocalRandom.current().nextFloat(10.0f, 20.0f);
    
    // Get a random double in the range [0.0, 1.0)
    double d1 = ThreadLocalRandom.current().nextDouble();
    
    // Get a random double in the range [0.0, 10.0)
    double d2 = ThreadLocalRandom.current().nextDouble(10.0);
    
    // Get a random double in the range [10.0, 20.0)
    double d3 = ThreadLocalRandom.current().nextDouble(10.0, 20.0);
    
  6. random byte array

    byte[] bytes = new byte[10];
    ThreadLocalRandom.current().nextBytes(bytes); // Fill the bytes array with a random sequence of bytes
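
All of the calls above go through ThreadLocalRandom.current(), and that call should happen in the thread that consumes the value rather than being cached in a shared field. The following is a small sketch of that pattern; the class and method names (PerThreadRandomDemo, countOutOfRange) are invented for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

final class PerThreadRandomDemo {
    /** Rolls a six-sided die on several threads and counts results outside 1..6. */
    static int countOutOfRange(int threads, int rollsPerThread) {
        AtomicInteger outOfRange = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < rollsPerThread; i++) {
                    // current() is called inside the worker, so each thread uses its own seed.
                    int roll = ThreadLocalRandom.current().nextInt(1, 7);
                    if (roll < 1 || roll > 6) {
                        outOfRange.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return outOfRange.get();
    }

    public static void main(String[] args) {
        System.out.println("out-of-range rolls: " + countOutOfRange(4, 1_000));
    }
}
```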
    

Core code

UNSAFE.putLong(t = Thread.currentThread(), SEED, r = UNSAFE.getLong(t, SEED) + GAMMA);

Translated into more readable Java, it looks like this:

Thread t = Thread.currentThread();
long r = UNSAFE.getLong(t, SEED) + GAMMA;
UNSAFE.putLong(t, SEED, r);

This looks very familiar, like the get/set we usually do on a Map: Thread.currentThread() supplies the key, and the random seed is the value.

However, using an object as a key can cause a memory leak. A large number of Thread objects may be created over time, and if a thread's entry is not removed from the Map once the thread has finished, the Map keeps growing until memory eventually runs out.
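
To make the concern concrete, here is a sketch of what a Map-based version might look like. It is purely hypothetical: MapSeedStore is not how the JDK does it, and the GAMMA value is an assumption (the golden-ratio constant commonly used for this kind of increment). The point is only that entries for finished threads stay in the map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical Map-based seed store; NOT the JDK implementation. */
final class MapSeedStore {
    // Assumed increment value, used here only for illustration.
    private static final long GAMMA = 0x9e3779b97f4a7c15L;

    private final Map<Thread, Long> seeds = new ConcurrentHashMap<>();

    long nextSeed() {
        // Stores GAMMA on the first call for this thread, otherwise old seed + GAMMA.
        return seeds.merge(Thread.currentThread(), GAMMA, Long::sum);
    }

    int trackedThreads() {
        return seeds.size();
    }

    /** Starts short-lived threads that each ask for one seed, and waits for them to finish. */
    static void callFromShortLivedThreads(MapSeedStore store, int count) {
        for (int i = 0; i < count; i++) {
            Thread worker = new Thread(() -> store.nextSeed());
            worker.start();
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for worker", e);
            }
        }
    }

    public static void main(String[] args) {
        MapSeedStore store = new MapSeedStore();
        callFromShortLivedThreads(store, 3);
        // The workers are dead, but the map still references their Thread objects.
        System.out.println("entries still held: " + store.trackedThreads());
    }
}
```

Unless something removes the entries explicitly, every thread that ever asked for a seed remains reachable through the map.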

Unsafe

Features

But a closer look at the core code of ThreadLocalRandom shows that it is not a simple Map operation. Its getLong() method takes two parameters, and putLong() takes three. The source code shows that both are native methods, so we cannot see the concrete implementation. The two method signatures are:

public native long getLong(Object var1, long var2);
public native void putLong(Object var1, long var2, long var4);

Although we can’t see the specific implementation, we can check their functions. The following is the function introduction of the two methods:

  • putLong(object, offset, value) writes value into the eight bytes that start at offset from the object's memory address.

  • getLong(object, offset) reads eight bytes starting at offset from the object's memory address and returns them as a long.
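
The two methods can be tried safely on a field of our own class, as long as the offset comes from objectFieldOffset and the field really is a long. The sketch below assumes a JDK where sun.misc.Unsafe memory access is still permitted (newer JDKs deprecate it and may print warnings or refuse); the class names UnsafeLongDemo and Holder are invented for illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeLongDemo {
    /** Simple holder with one long field, standing in for Thread and its seed. */
    static final class Holder {
        long seed = 1L;
    }

    static Unsafe loadUnsafe() throws ReflectiveOperationException {
        // The singleton is kept in a private static field named theUnsafe.
        Field field = Unsafe.class.getDeclaredField("theUnsafe");
        field.setAccessible(true);
        return (Unsafe) field.get(null);
    }

    /** Adds delta to holder.seed using only getLong/putLong, and returns the new value. */
    static long addViaUnsafe(Holder holder, long delta) {
        try {
            Unsafe unsafe = loadUnsafe();
            // Ask the JVM where the field lives instead of guessing the offset.
            long offset = unsafe.objectFieldOffset(Holder.class.getDeclaredField("seed"));
            long updated = unsafe.getLong(holder, offset) + delta;
            unsafe.putLong(holder, offset, updated);
            return updated;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not available on this JVM", e);
        }
    }

    public static void main(String[] args) {
        Holder holder = new Holder();
        addViaUnsafe(holder, 41L);
        System.out.println("seed read through the normal field: " + holder.seed);
    }
}
```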

Insecure

As methods of the Unsafe class, they carry a certain “unsafe” air. Concretely, they operate on memory directly without any safety checks. If something goes wrong, the JVM hits a fatal error at run time and the whole virtual machine exits.

Intuitively, a get method is the most likely place for exceptions such as null pointers or failed type conversions, yet Unsafe.getLong() itself rarely fails: it reads eight bytes from a memory location, and whatever those eight bytes contain, they can always be interpreted as a long. Whether the resulting value makes sense for the business logic is another matter. The set method is similar: it overwrites the eight bytes at a memory location with a long value and almost never reports an error itself.

So where are these two methods “unsafe”?

The danger is not that these two methods report an error while they run, but that changing memory without any protection causes other code to fail when it later uses that memory.

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException {
    // Unsafe's constructor is private and getUnsafe() rejects ordinary application callers,
    // so outside the JDK the instance has to be obtained through reflection
    Field field = Unsafe.class.getDeclaredField("theUnsafe");
    field.setAccessible(true);
    Unsafe unsafe = (Unsafe) field.get(null);
    // Test is a handwritten test class with a single String field named value
    Test test = new Test();
    test.value = "12345";
    // Overwrite the memory at offset 12, where the value field is stored on my JVM
    unsafe.putLong(test, 12L, 2333L);
    System.out.println(test.value);
}

Running the above code will get a fatal error, and the error message is “A fatal error has been detected by the Java Runtime Environment: … Process finished with exit code 134 (interrupted by signal 6: SIGABRT)”.

The error message shows that the virtual machine aborted because of this fatal error. The reason is simple. I used Unsafe to overwrite the memory where the value field of the Test object lives with the long value 2333. That field is supposed to hold a reference to a String, so when value is used afterwards, the virtual machine treats 2333 as a reference, tries to read a String object at that location, finds no valid object header there, and aborts. A more serious problem is that the error message contains no class name or line number, so tracking down this kind of problem in a complex project is like looking for a needle in a haystack.

However, other methods of Unsafe are not necessarily the same as this pair of methods. You may need to pay attention to other security issues when using them, and we will talk about them later.

Implementation of ThreadLocalRandom

So is ThreadLocalRandom safe? Let’s go back and look at its implementation.

ThreadLocalRandom is implemented in cooperation with the Thread class. Thread has a field, threadLocalRandomSeed, that stores the thread's own random seed, and ThreadLocalRandom looks up the offset of this field within a Thread object when the class is initialized. The lookup is SEED = UNSAFE.objectFieldOffset(Thread.class.getDeclaredField("threadLocalRandomSeed"));

We know that the memory layout of an object is fixed once its class has been loaded, so Unsafe.objectFieldOffset can return the offset of a given field within instances of that class. Because the offset comes from the JVM itself and the type of the field is known to be long, the way ThreadLocalRandom uses Unsafe is safe.
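
Because each thread only ever touches its own seed, advancing it needs no CAS at all. The sketch below shows the idea with an ordinary field in place of the Unsafe access. It is an approximation under stated assumptions: SplitMixSketch is an invented name, and the GAMMA and mixing constants are the ones I believe JDK 8 uses (they match the well-known MurmurHash3 finalizer), so treat them as illustrative rather than authoritative.

```java
/** Hypothetical single-threaded sketch of the "add GAMMA, then mix" scheme. */
final class SplitMixSketch {
    // Assumed seed increment (golden-ratio based constant).
    private static final long GAMMA = 0x9e3779b97f4a7c15L;

    // Stands in for Thread.threadLocalRandomSeed; only one thread may use an instance.
    private long seed;

    SplitMixSketch(long initialSeed) {
        this.seed = initialSeed;
    }

    /** Plain read-add-write, the same shape as getLong + GAMMA followed by putLong. */
    long nextSeed() {
        seed += GAMMA;
        return seed;
    }

    /** Scrambles the 64-bit seed into a 32-bit result (assumed constants). */
    static int mix32(long z) {
        z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
        return (int) (((z ^ (z >>> 33)) * 0xc4ceb9fe1a85ec53L) >>> 32);
    }

    int nextInt() {
        return mix32(nextSeed());
    }

    public static void main(String[] args) {
        SplitMixSketch generator = new SplitMixSketch(0L);
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextInt());
        }
    }
}
```

Compared with the CAS loop in Random, there is no retry here: the update always succeeds on the first attempt because no other thread writes to the same seed.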

Questions

While digging into these issues, I ran into two questions of my own.

Usage scenario

First of all, why does ThreadLocalRandom have to use Unsafe to modify the random seed in the Thread object? Isn’t it more convenient to add get/set methods in the Thread object?

Someone on Stack Overflow had the same question as me, asking why ThreadLocalRandom is implemented so bizarrely. The accepted answer explained that, for the JDK developers, Unsafe and get/set methods are both ordinary tools, and there is no specific guideline for choosing between them. That answer did not convince me, so I opened another question. I agree with one of the comments there: ThreadLocalRandom and Thread are not in the same package, so any get/set methods added to Thread would have to be public, which would break the encapsulation of the class.

Memory layout

My other question came up after I saw that Unsafe.objectFieldOffset returns the offset of a field within an object's memory. I tried it in a main method in IDEA on the Test class mentioned above and found that its only field, value, sits at offset 12 from the start of the object. That left me wondering what those 12 bytes are made of.

We know that the object header sits at the start of a Java object's memory, and the mark word comes first in the header. It occupies 4 bytes on a 32-bit system and 8 bytes on a 64-bit system. I am using a 64-bit system, so the mark word accounts for the first 8 bytes.

After the mark word come the class pointer and, for arrays only, the array length. The array length takes 4 bytes, but Test is not an array, so it can be ruled out. That leaves the class pointer, which should also take 8 bytes on a 64-bit system. Why does it use only 4?

The only explanation is that the virtual machine has pointer compression enabled. Pointer compression is only available on 64-bit systems; when it is on, a pointer occupies only 4 bytes. I had not asked for it explicitly, but after checking I found that it is enabled by default in JDK 1.8. After I disabled it with the -XX:-UseCompressedOops parameter, the offset of value became 16.