List, Set, ConcurrentHashMap in Java concurrent programming

1. ArrayList

  • Features: fast random access, automatic expansion, elements kept in insertion order, duplicate elements allowed
  • Storage structure: array
  • Expansion principle and expansion size
    • ArrayList expands to 1.5 times its original capacity
    • When ArrayList's add() is called, it checks whether expansion is needed; if so, it creates a new array and copies the old array into it in one pass for efficiency. The way the new capacity is calculated changed between JDK 1.7 and 1.8, as sketched below:
      • 1.7: (oldLen * 3) / 2 + 1
      • 1.8: oldLen + (oldLen >> 1), computed with a right shift, which is faster than division
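
      A minimal sketch of the two growth formulas described above (the method names are illustrative, not the JDK's internal ones):

        // JDK 1.7 style: newLen = oldLen * 3 / 2 + 1
        static int grownCapacityJdk7(int oldLen) {
            return (oldLen * 3) / 2 + 1;
        }

        // JDK 1.8 style: newLen = oldLen + oldLen / 2, via a right shift
        static int grownCapacityJdk8(int oldLen) {
            return oldLen + (oldLen >> 1);
        }

        // e.g. grownCapacityJdk7(10) == 16, grownCapacityJdk8(10) == 15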

2. LinkedList

  • Features: fast insertion and deletion; random access is slower than ArrayList
  • Storage structure: linked list

3. HashMap

  • Features: automatic expansion; stores key/value pairs; the key may be null; putting an existing key overwrites its value

  • Storage structure

    • JDK1.7: array + linked list
    • JDK1.8: array + linked list + red-black tree
  • Related implementation principles:

    • We know that simple program = data structure + algorithm

    • The algorithm HashMap uses is the hash table: a hash function transforms a key of any length into a fixed-length value (an address).

      The data structure is then accessed through that address: the key is mapped to a position in the table where the record is stored, which speeds up lookup.

    • The hash HashMap actually uses: it starts from the key's hashCode() (for a String, computed from its characters), spreads the high bits into the low bits, and reduces the result modulo the table length (implemented as a bitwise AND) to obtain the bucket index in the hash table, as shown below.
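
      This is the hash-spreading method from the JDK 8 HashMap source; the bucket index is then (n - 1) & hash, which equals hash % n whenever n is a power of two:

        // XOR the high 16 bits into the low 16 bits so that small tables
        // (which only use the low bits of the hash) still feel the high bits.
        static final int hash(Object key) {
            int h;
            return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
        }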

    • But when several keys are hashed and reduced modulo the table length, their results can be identical. This is a hash collision; how is it resolved?

      • This is exactly why HashMap stores data with both an array and linked lists. The array holds each entry at the index computed from its hash, but several entries can reduce to the same index and collide, so linked lists are introduced to resolve this: when storing, HashMap checks whether the slot already holds data; if it does, the new entry is chained onto that slot's linked list, and if not, it is stored in the slot directly.
    • When many entries land in the same slot after hashing, the linked list resolves the collision, but as the amount of data grows, lookups in the list become very slow. How is that solved? JDK 1.8 introduced the red-black tree into HashMap for this: when a list's length reaches the treeify threshold of 8 and the array length has reached 64, the linked list is converted into a red-black tree to speed up lookups. When removals shrink a tree bin down to the untreeify threshold (6), the red-black tree is converted back into a linked list.

      • Why is there a gap (7) between upgrading to a tree (8) and degrading to a list (6)?
        • The gap between the two thresholds exists to avoid frequent conversion between tree and list. Imagine frequent insertion and removal around a single shared threshold: with no gap, one insert would convert the bin to a tree and the next removal would convert it back to a list, over and over.
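
        These are the corresponding constants in the JDK 8 HashMap source:

          static final int TREEIFY_THRESHOLD = 8;     // list -> tree when a bin reaches this size
          static final int UNTREEIFY_THRESHOLD = 6;   // tree -> list when a bin shrinks to this size
          static final int MIN_TREEIFY_CAPACITY = 64; // below this table size, resize instead of treeifying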
  • Thinking about related issues

    • 3.1. What will HashMap do when putting data?

      • When we put data, HashMap does roughly the following:
        • If the table is null or empty, it is initialized first.
        • The key is hashed to compute the index of the bucket the entry belongs in.
        • If that bucket is empty, a new node is created and placed at the computed index; then the expansion threshold is checked, and the table is resized if the threshold is exceeded.
        • If the bucket already has a head node, its key is compared with the inserted key: if they are the same, the original value is overwritten directly and the previous value is returned.
        • Otherwise, if the head node is a red-black tree node, the tree's root is found and traversed to look for an equal key: if found, the value is overwritten; if not, a new node is created, inserted into the tree at the proper position, and the tree is rebalanced; then the threshold is checked and the table resized if it is exceeded.
        • If the head is not a tree node, the linked list is traversed while counting its nodes, looking for a node with an equal key: if one exists, its value is overwritten; if not, the new node is appended to the tail of the list. If the count reaches the treeify threshold minus 1 (7, i.e. 8 nodes after the append) and the array length has reached 64, the list is converted into a red-black tree. Finally the expansion threshold is checked and the table is resized only if the threshold is exceeded.
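
      A quick demonstration of the overwrite-and-return-old-value behavior described above:

        HashMap<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1)); // null: there was no previous mapping
        System.out.println(map.put("a", 2)); // 1: same key, old value returned and overwritten
        System.out.println(map.get("a"));    // 2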
    • 3.2. What is the expansion threshold of HashMap? Can it be changed? Why that number?

      • The default expansion threshold of HashMap is 12, obtained as default capacity (16) * load factor (0.75). When creating a HashMap we can pass an initial capacity and a load factor to change the threshold. The main reason 0.75 was finally chosen as the default load factor is that analysis (the JDK comments cite Poisson-distribution calculations of bin occupancy) shows it neither triggers expansion too frequently nor delays it too long.

        HashMap<String, String> hashMap = new HashMap<String, String>(10, (float) 0.6);

        public HashMap(int initialCapacity, float loadFactor) {
            if (initialCapacity < 0)
                throw new IllegalArgumentException("Illegal initial capacity: " +
                                                   initialCapacity);
            if (initialCapacity > MAXIMUM_CAPACITY)
                initialCapacity = MAXIMUM_CAPACITY;
            if (loadFactor <= 0 || Float.isNaN(loadFactor))
                throw new IllegalArgumentException("Illegal load factor: " +
                                                   loadFactor);
            this.loadFactor = loadFactor;
            this.threshold = tableSizeFor(initialCapacity);
        }
        
    • 3.3. How does HashMap expand? How is it determined where existing data is stored after expansion?

      • When HashMap expands:
        • It first checks whether the old capacity has already reached the maximum (bounded by Integer.MAX_VALUE); if so, expansion simply ends. Otherwise a new array of twice the old capacity is created and table is pointed at it.
        • The old array is then traversed. If the slot at the current index holds no node, traversal continues.
        • If a slot holds exactly one node, that node's index in the new table is computed and the node is placed there directly; when every slot of the old array has been processed, expansion is done.
        • If a slot holds more than one node, the node type is checked. For a red-black tree, its nodes are traversed and split by the test e.hash & oldCap == 0 (on each node's hash): nodes for which it holds are appended to the tail of a loHead list, the rest to the tail of a hiHead list. The loHead list is placed at the original index and the hiHead list at original index + oldCap, with each resulting chain converted between tree and list form as its length requires.
        • For a linked list, the same e.hash & oldCap == 0 split is applied: matching nodes go to the tail of loHead, the rest to the tail of hiHead; loHead is placed at the original index and hiHead at original index + oldCap.
        • This repeats until the whole old array has been processed.

      • A bitwise AND decides whether a node stays at its original location or moves to the original location plus the old capacity: the key's hash is ANDed with the new capacity minus 1. Two cases arise for the newly significant high bit:

        • the high bit is 1, or
        • the high bit is 0.
        • Placement follows that bit: if the high bit is 0, the node keeps its original index; if it is 1, it moves to original index + old capacity.
        
        
        Recomputing the index for the same keys:
        index = (n - 1) & hash

        Before expansion, capacity n = 16:
                     0000 0000 0000 0000 0000 0000 0000 1111   15 (n - 1)
        hash(key1)   1111 1111 1111 1111 0000 1111 0000 0101
        ------------------------------------------------------ AND
                     0000 0000 0000 0000 0000 0000 0000 0101   index = 5

                     0000 0000 0000 0000 0000 0000 0000 1111   15 (n - 1)
        hash(key2)   1111 1111 1111 1111 0000 1111 0001 0101
        ------------------------------------------------------ AND
                     0000 0000 0000 0000 0000 0000 0000 0101   index = 5

        After expansion, capacity n = 16 * 2 = 32:
                     0000 0000 0000 0000 0000 0000 0001 1111   31 (n - 1)
        hash(key1)   1111 1111 1111 1111 0000 1111 0000 0101
        ------------------------------------------------------ AND
                     0000 0000 0000 0000 0000 0000 0000 0101   index = 5  (unchanged)

                     0000 0000 0000 0000 0000 0000 0001 1111   31 (n - 1)
        hash(key2)   1111 1111 1111 1111 0000 1111 0001 0101
        ------------------------------------------------------ AND
                     0000 0000 0000 0000 0000 0000 0001 0101   index = 21 (= 5 + 16)
        
        
    • 3.4. Why is the capacity of HashMap always a power of 2? What happens if you pass in 10?

      • If the array length is not a power of 2, some bits of (n - 1) are permanently 0, so (n - 1) & hash can never produce certain indices: different keys are far more likely to collide, and large parts of the array can never store any data.
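
      A small sketch showing the effect (the loop bound here is illustrative):

        // With n = 16, (n - 1) = 0b1111, so (n - 1) & hash can reach all 16 slots.
        // With n = 10, (n - 1) = 9 = 0b1001: bits 1 and 2 are always 0, so only
        // indices 0, 1, 8 and 9 are reachable, whatever the hash.
        java.util.Set<Integer> reachable = new java.util.TreeSet<>();
        for (int hash = 0; hash < 1000; hash++)
            reachable.add((10 - 1) & hash);
        System.out.println(reachable); // [0, 1, 8, 9]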

      • If 10 is passed in, it is rounded up to the next power of 2 (16) by a cascade of OR and unsigned right-shift operations, as shown below

        // The OR + unsigned-right-shift cascade fills every bit below the
        // highest set bit with 1s; starting from cap - 1 prevents an exact
        // power of two from being doubled.
        static final int tableSizeFor(int cap) { // e.g. cap = 10
            int n = cap - 1;
            n |= n >>> 1;
            n |= n >>> 2;
            n |= n >>> 4;
            n |= n >>> 8;
            n |= n >>> 16;
            return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
        }
        
        For example, with cap = 10:
        n = cap - 1 = 9

        n |= n >>> 1;
        0000 0000 0000 0000 0000 0000 0000 1001   // 9
        0000 0000 0000 0000 0000 0000 0000 0100   // 9 >>> 1 = 4
        ------------------------------------------ OR
        0000 0000 0000 0000 0000 0000 0000 1101   // 13 after bitwise OR

        n |= n >>> 2;
        0000 0000 0000 0000 0000 0000 0000 1101   // 13
        0000 0000 0000 0000 0000 0000 0000 0011   // 13 >>> 2 = 3
        ------------------------------------------ OR
        0000 0000 0000 0000 0000 0000 0000 1111   // 15 after bitwise OR

        n |= n >>> 4;
        0000 0000 0000 0000 0000 0000 0000 1111   // 15
        0000 0000 0000 0000 0000 0000 0000 0000   // 15 >>> 4 = 0
        ------------------------------------------ OR
        0000 0000 0000 0000 0000 0000 0000 1111   // still 15

        The remaining shifts change nothing, so n = 15

        return n + 1 (16)
        
    • Complete flowchart (original image unavailable)

4. HashSet

  • Features: it is implemented on top of HashMap; the constructor simply creates a new HashMap, each HashSet element is stored as a key of that HashMap, and the value is always the same shared dummy object, PRESENT. Most operations delegate directly to the underlying HashMap.
  • HashSet does not allow duplicate elements.
  • HashSet uses the element object to compute a hashCode. Two distinct objects can have the same hashCode, so equals() is then used to decide whether the objects are really equal; if equals() returns false, they are treated as different elements.
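
  Abridged from the JDK's HashSet source, showing the delegation described above (the class actually extends AbstractSet; constructors and most methods are omitted):

    public class HashSet<E> {
        private transient HashMap<E, Object> map = new HashMap<>();
        // the shared dummy value stored against every key
        private static final Object PRESENT = new Object();

        // put() returns the previous value, so a null result means the
        // element was not present before, i.e. add() succeeded
        public boolean add(E e) {
            return map.put(e, PRESENT) == null;
        }

        public boolean remove(Object o) {
            return map.remove(o) == PRESENT;
        }
    }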

5. Hashtable

  • Similarities and differences between Hashtable and HashMap
    • Both HashMap and Hashtable implement the Map interface
    • Hashtable is thread-safe, locking with synchronized on its methods, which makes it slow; HashMap is not thread-safe
    • Hashtable allows neither null keys nor null values; HashMap allows them, and puts the null key directly in the first position (bucket 0), as demonstrated below
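
    A quick demonstration of the difference:

      HashMap<String, String> hashMap = new HashMap<>();
      hashMap.put(null, "ok");                 // allowed: the null key goes to bucket 0
      System.out.println(hashMap.get(null));   // ok

      java.util.Hashtable<String, String> table = new java.util.Hashtable<>();
      table.put(null, "boom");                 // throws NullPointerException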

6. ConcurrentHashMap

  • Features: concurrent security, automatic expansion

  • Storage structure: the underlying structure is array + linked list + red-black tree, with heavy use of CAS operations internally; concurrency control is implemented with a combination of synchronized and CAS.

  • ConcurrentHashMap array initialization

    • Initialization uses CAS + spin to guarantee thread safety; see the abridged source below
    • 1. First check sizeCtl: if it is less than 0, another thread is already initializing the array, so the current thread yields and retries (spins)
    • 2. Otherwise the thread performs U.compareAndSwapInt(this, SIZECTL, sc, -1) to set sizeCtl to -1; if the CAS succeeds, it initializes the array. During initialization the expansion threshold is computed as n - (n >>> 2), which is exactly 0.75 * n, and that value is finally assigned back to sizeCtl. If the CAS fails, the thread spins and waits (loops and retries).
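
    The corresponding method, abridged from the JDK 8 ConcurrentHashMap source:

      private final Node<K,V>[] initTable() {
          Node<K,V>[] tab; int sc;
          while ((tab = table) == null || tab.length == 0) {
              if ((sc = sizeCtl) < 0)
                  Thread.yield();               // another thread is initializing: spin
              else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) { // claim initialization
                  try {
                      if ((tab = table) == null || tab.length == 0) {
                          int n = (sc > 0) ? sc : DEFAULT_CAPACITY;  // default 16
                          @SuppressWarnings("unchecked")
                          Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n];
                          table = tab = nt;
                          sc = n - (n >>> 2);   // threshold = n - n/4 = 0.75 * n
                      }
                  } finally {
                      sizeCtl = sc;             // publish the expansion threshold
                  }
                  break;
              }
          }
          return tab;
      }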
  • The meaning of the sizeCtl variable

    • sizeCtl = 0: the array is not initialized; its initial capacity will be the default, 16
    • sizeCtl > 0: if the array is not yet initialized, it records the initial capacity to use; if the array has been initialized, it records the array's expansion threshold
    • sizeCtl = -1: the array is being initialized
    • sizeCtl < -1: the array is being expanded; a value of -(1 + n) indicates that n threads are cooperating to complete the expansion
  • ConcurrentHashMap expansion principle

    • During expansion, it first checks whether the number of CPUs is greater than 1. If so, the migration work is divided into per-thread tasks via a ternary operator: the stride is the array length shifted right by 3 bits divided by the CPU count, otherwise simply n (with a minimum stride of 16). A thread doing expansion doubles the array, records the bucket range it starts migrating, and then migrates buckets from the back of the table toward the front. Each fully migrated bucket is replaced by a fwd (ForwardingNode) placeholder whose hash is MOVED (-1), marking it as migrated. A bucket that has not yet been migrated is locked on its head node with a synchronized (f) { } block while its data is moved (the per-bucket split is similar to HashMap's). Each thread is responsible only for its own migration tasks.

    • Multithreaded assisted expansion is triggered in two places:

      • 1. When adding an element, if the bucket the element maps to holds a fwd node, the thread first helps with the expansion and then adds the element
      • 2. After adding an element, when checking whether the expansion threshold has been reached: if sizeCtl is less than 0 and the new array is non-null, the thread assists with the in-progress expansion
  • The principle of ConcurrentHashMap put

    • In outline (a sketch follows below): spread the key's hash; if the table is empty, initialize it with initTable(); if the target bucket is empty, insert the new node with a lock-free CAS; if the bucket's head is a fwd node (hash == MOVED), help with the in-progress expansion and retry; otherwise lock the bucket's head node with synchronized and overwrite or insert into the linked list or red-black tree; finally update the count, which may trigger an expansion.
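
  An abridged sketch of JDK 8's ConcurrentHashMap.putVal() illustrating these steps (the list/tree handling inside the synchronized block and the bin-count bookkeeping are elided):

    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());         // (h ^ (h >>> 16)) & HASH_BITS
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();                 // lazy CAS + spin initialization
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                // empty bin: lock-free insert via CAS, retry on failure
                if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))
                    break;
            }
            else if (f.hash == MOVED)              // fwd node: a resize is in progress
                tab = helpTransfer(tab, f);        // help migrate, then retry
            else {
                synchronized (f) {                 // lock only this bucket's head node
                    // ... walk the list or red-black tree; overwrite the value on an
                    // equal key or append a new node; treeify when the list reaches 8
                }
                break;                             // (abridged: real code re-checks first)
            }
        }
        addCount(1L, 0);                           // update the count, maybe resize
        return null;
    }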

  • Why are null keys and null values not supported?

    • A null value is rejected because ConcurrentHashMap is used from multiple threads: when get(key) returns null, there is no way to tell whether the map contains no mapping for the key or the key is mapped to null. With a single-threaded HashMap you can call containsKey to distinguish the two cases. In ConcurrentHashMap that check is racy: suppose thread T1 calls get(key) and receives null; to verify, we call containsKey expecting false, but if another thread T2 executes put(key, null) between the get and the containsKey, then containsKey returns true, contradicting the assumption. The result is ambiguous.

      As for why the key cannot be null, the author's guess is that Doug Lea simply dislikes null; the source comments suggest that nulls are too hard to check for reliably in a concurrent setting, so null keys were disallowed from the start of the design. No official explanation seems to exist.

  • Does the get method of ConcurrentHashMap need to be locked?

    • The get method does not need to be locked, because both a node's value field (val) and its next pointer are declared volatile. Under multithreading, when thread T1 modifies a node's val or links in a new node, the change is immediately visible to a reading thread T2; see the snippet below.

    • Is the lock-free get related to the hash bucket array (table) being declared volatile?

      Not really: table is declared volatile mainly to guarantee visibility of the array reference when the table is replaced during expansion.
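
    The relevant declaration, from the JDK 8 ConcurrentHashMap source (constructor and Map.Entry methods omitted):

      static class Node<K,V> implements Map.Entry<K,V> {
          final int hash;
          final K key;
          volatile V val;           // read by get() without a lock, hence volatile
          volatile Node<K,V> next;  // likewise: a racing reader sees new links
      }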
