List, Set, ConcurrentHashMap in Java concurrent programming

1. ArrayList

  • Features: fast random access, automatic expansion, elements kept in insertion order, duplicate elements allowed
  • Storage structure: array
  • Expansion principle and expansion size
    • ArrayList expands to 1.5 times its original capacity
    • When ArrayList's add() is called, it checks whether expansion is needed; if so, it creates a new array and copies the old array into it in one pass for efficiency. The way the new capacity is calculated changed between JDK 1.7 and 1.8, as sketched below:
      • 1.7: (oldLen * 3) / 2 + 1
      • 1.8: oldLen + (oldLen >> 1), computed with a right shift, which is faster than division
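
      A minimal sketch of the two growth formulas described above (the method names are illustrative, not the JDK's internal ones):

        // JDK 1.7 style: newLen = oldLen * 3 / 2 + 1
        static int grownCapacityJdk7(int oldLen) {
            return (oldLen * 3) / 2 + 1;
        }

        // JDK 1.8 style: newLen = oldLen + oldLen / 2, via a right shift
        static int grownCapacityJdk8(int oldLen) {
            return oldLen + (oldLen >> 1);
        }

        // e.g. grownCapacityJdk7(10) == 16, grownCapacityJdk8(10) == 15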

2. LinkedList

  • Features: fast insertion and deletion; random access is slower than ArrayList
  • Storage structure: linked list

3. HashMap

  • Features: automatic expansion; stores key/value pairs; the key may be null; putting an existing key overwrites its value

  • Storage structure

    • JDK1.7: array + linked list
    • JDK1.8: array + linked list + red-black tree
  • Related implementation principles:

    • We know that simple program = data structure + algorithm

    • The algorithm HashMap uses is the hash table: a hash function transforms a key of any length into a fixed-length value (an address).

      The data structure is then accessed through that address: the key is mapped to a position in the table where the record is stored, which speeds up lookup.

    • The hash HashMap actually uses: it starts from the key's hashCode() (for a String, computed from its characters), spreads the high bits into the low bits, and reduces the result modulo the table length (implemented as a bitwise AND) to obtain the bucket index in the hash table, as shown below.
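
      This is the hash-spreading method from the JDK 8 HashMap source; the bucket index is then (n - 1) & hash, which equals hash % n whenever n is a power of two:

        // XOR the high 16 bits into the low 16 bits so that small tables
        // (which only use the low bits of the hash) still feel the high bits.
        static final int hash(Object key) {
            int h;
            return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
        }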

    • But when several keys are hashed and reduced modulo the table length, their results can be identical. This is a hash collision; how is it resolved?

      • This is exactly why HashMap stores data with both an array and linked lists. The array holds each entry at the index computed from its hash, but several entries can reduce to the same index and collide, so linked lists are introduced to resolve this: when storing, HashMap checks whether the slot already holds data; if it does, the new entry is chained onto that slot's linked list, and if not, it is stored in the slot directly.
    • When many entries land in the same slot after hashing, the linked list resolves the collision, but as the amount of data grows, lookups in the list become very slow. How is that solved? JDK 1.8 introduced the red-black tree into HashMap for this: when a list's length reaches the treeify threshold of 8 and the array length has reached 64, the linked list is converted into a red-black tree to speed up lookups. When removals shrink a tree bin down to the untreeify threshold (6), the red-black tree is converted back into a linked list.

      • Why is there a gap (7) between upgrading to a tree (8) and degrading to a list (6)?
        • The gap between the two thresholds exists to avoid frequent conversion between tree and list. Imagine frequent insertion and removal around a single shared threshold: with no gap, one insert would convert the bin to a tree and the next removal would convert it back to a list, over and over.
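
        These are the corresponding constants in the JDK 8 HashMap source:

          static final int TREEIFY_THRESHOLD = 8;     // list -> tree when a bin reaches this size
          static final int UNTREEIFY_THRESHOLD = 6;   // tree -> list when a bin shrinks to this size
          static final int MIN_TREEIFY_CAPACITY = 64; // below this table size, resize instead of treeifying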
  • Thinking about related issues

    • 3.1. What will HashMap do when putting data?

      • When we put data, HashMap does roughly the following:
        • If the table is null or empty, it is initialized first.
        • The key is hashed to compute the index of the bucket the entry belongs in.
        • If that bucket is empty, a new node is created and placed at the computed index; then the expansion threshold is checked, and the table is resized if the threshold is exceeded.
        • If the bucket already has a head node, its key is compared with the inserted key: if they are the same, the original value is overwritten directly and the previous value is returned.
        • Otherwise, if the head node is a red-black tree node, the tree's root is found and traversed to look for an equal key: if found, the value is overwritten; if not, a new node is created, inserted into the tree at the proper position, and the tree is rebalanced; then the threshold is checked and the table resized if it is exceeded.
        • If the head is not a tree node, the linked list is traversed while counting its nodes, looking for a node with an equal key: if one exists, its value is overwritten; if not, the new node is appended to the tail of the list. If the count reaches the treeify threshold minus 1 (7, i.e. 8 nodes after the append) and the array length has reached 64, the list is converted into a red-black tree. Finally the expansion threshold is checked and the table is resized only if the threshold is exceeded.
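
      A quick demonstration of the overwrite-and-return-old-value behavior described above:

        HashMap<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1)); // null: there was no previous mapping
        System.out.println(map.put("a", 2)); // 1: same key, old value returned and overwritten
        System.out.println(map.get("a"));    // 2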
    • 3.2. What is the expansion threshold of HashMap? Can it be changed? Why that number?

      • The default expansion threshold of HashMap is 12, obtained as default capacity (16) * load factor (0.75). When creating a HashMap we can pass an initial capacity and a load factor to change the threshold. The main reason 0.75 was finally chosen as the default load factor is that analysis (the JDK comments cite Poisson-distribution calculations of bin occupancy) shows it neither triggers expansion too frequently nor delays it too long.

        HashMap<String, String> hashMap = new HashMap<String, String>(10, (float) 0.6);

        public HashMap(int initialCapacity, float loadFactor) {
            if (initialCapacity < 0)
                throw new IllegalArgumentException("Illegal initial capacity: " +
                                                   initialCapacity);
            if (initialCapacity > MAXIMUM_CAPACITY)
                initialCapacity = MAXIMUM_CAPACITY;
            if (loadFactor <= 0 || Float.isNaN(loadFactor))
                throw new IllegalArgumentException("Illegal load factor: " +
                                                   loadFactor);
            this.loadFactor = loadFactor;
            this.threshold = tableSizeFor(initialCapacity);
        }
        
    • 3.3. How does HashMap expand? How is it determined where existing data is stored after expansion?

      • When HashMap expands:
        • It first checks whether the old capacity has already reached the maximum (bounded by Integer.MAX_VALUE); if so, expansion simply ends. Otherwise a new array of twice the old capacity is created and table is pointed at it.
        • The old array is then traversed. If the slot at the current index holds no node, traversal continues.
        • If a slot holds exactly one node, that node's index in the new table is computed and the node is placed there directly; when every slot of the old array has been processed, expansion is done.
        • If a slot holds more than one node, the node type is checked. For a red-black tree, its nodes are traversed and split by the test e.hash & oldCap == 0 (on each node's hash): nodes for which it holds are appended to the tail of a loHead list, the rest to the tail of a hiHead list. The loHead list is placed at the original index and the hiHead list at original index + oldCap, with each resulting chain converted between tree and list form as its length requires.
        • For a linked list, the same e.hash & oldCap == 0 split is applied: matching nodes go to the tail of loHead, the rest to the tail of hiHead; loHead is placed at the original index and hiHead at original index + oldCap.
        • This repeats until the whole old array has been processed.

      • A bitwise AND decides whether a node stays at its original location or moves to the original location plus the old capacity: the key's hash is ANDed with the new capacity minus 1. Two cases arise for the newly significant high bit:

        • the high bit is 1, or
        • the high bit is 0.
        • Placement follows that bit: if the high bit is 0, the node keeps its original index; if it is 1, it moves to original index + old capacity.
        
        
        Recomputing the index for the same keys:
        index = (n - 1) & hash

        Before expansion, capacity n = 16:
                     0000 0000 0000 0000 0000 0000 0000 1111   15 (n - 1)
        hash(key1)   1111 1111 1111 1111 0000 1111 0000 0101
        ------------------------------------------------------ AND
                     0000 0000 0000 0000 0000 0000 0000 0101   index = 5

                     0000 0000 0000 0000 0000 0000 0000 1111   15 (n - 1)
        hash(key2)   1111 1111 1111 1111 0000 1111 0001 0101
        ------------------------------------------------------ AND
                     0000 0000 0000 0000 0000 0000 0000 0101   index = 5

        After expansion, capacity n = 16 * 2 = 32:
                     0000 0000 0000 0000 0000 0000 0001 1111   31 (n - 1)
        hash(key1)   1111 1111 1111 1111 0000 1111 0000 0101
        ------------------------------------------------------ AND
                     0000 0000 0000 0000 0000 0000 0000 0101   index = 5  (unchanged)

                     0000 0000 0000 0000 0000 0000 0001 1111   31 (n - 1)
        hash(key2)   1111 1111 1111 1111 0000 1111 0001 0101
        ------------------------------------------------------ AND
                     0000 0000 0000 0000 0000 0000 0001 0101   index = 21 (= 5 + 16)
        
        
    • 3.4. Why is the capacity of HashMap always a power of 2? What happens if you pass in 10?

      • If the array length is not a power of 2, some bits of (n - 1) are permanently 0, so (n - 1) & hash can never produce certain indices: different keys are far more likely to collide, and large parts of the array can never store any data.
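
      A small sketch showing the effect (the loop bound here is illustrative):

        // With n = 16, (n - 1) = 0b1111, so (n - 1) & hash can reach all 16 slots.
        // With n = 10, (n - 1) = 9 = 0b1001: bits 1 and 2 are always 0, so only
        // indices 0, 1, 8 and 9 are reachable, whatever the hash.
        java.util.Set<Integer> reachable = new java.util.TreeSet<>();
        for (int hash = 0; hash < 1000; hash++)
            reachable.add((10 - 1) & hash);
        System.out.println(reachable); // [0, 1, 8, 9]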

      • If 10 is passed in, it is rounded up to the next power of 2 (16) by a cascade of OR and unsigned right-shift operations, as shown below

        // The OR + unsigned-right-shift cascade fills every bit below the
        // highest set bit with 1s; starting from cap - 1 prevents an exact
        // power of two from being doubled.
        static final int tableSizeFor(int cap) { // e.g. cap = 10
            int n = cap - 1;
            n |= n >>> 1;
            n |= n >>> 2;
            n |= n >>> 4;
            n |= n >>> 8;
            n |= n >>> 16;
            return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
        }
        
        For example, with cap = 10:
        n = cap - 1 = 9

        n |= n >>> 1;
        0000 0000 0000 0000 0000 0000 0000 1001   // 9
        0000 0000 0000 0000 0000 0000 0000 0100   // 9 >>> 1 = 4
        ------------------------------------------ OR
        0000 0000 0000 0000 0000 0000 0000 1101   // 13 after bitwise OR

        n |= n >>> 2;
        0000 0000 0000 0000 0000 0000 0000 1101   // 13
        0000 0000 0000 0000 0000 0000 0000 0011   // 13 >>> 2 = 3
        ------------------------------------------ OR
        0000 0000 0000 0000 0000 0000 0000 1111   // 15 after bitwise OR

        n |= n >>> 4;
        0000 0000 0000 0000 0000 0000 0000 1111   // 15
        0000 0000 0000 0000 0000 0000 0000 0000   // 15 >>> 4 = 0
        ------------------------------------------ OR
        0000 0000 0000 0000 0000 0000 0000 1111   // still 15

        The remaining shifts change nothing, so n = 15

        return n + 1 (16)
        
    • Complete flowchart (original image unavailable)

4. HashSet

  • Features: it is implemented on top of HashMap; the constructor simply creates a new HashMap, each HashSet element is stored as a key of that HashMap, and the value is always the same shared dummy object, PRESENT. Most operations delegate directly to the underlying HashMap.
  • HashSet does not allow duplicate elements.
  • HashSet uses the element object to compute a hashCode. Two distinct objects can have the same hashCode, so equals() is then used to decide whether the objects are really equal; if equals() returns false, they are treated as different elements.
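
  Abridged from the JDK's HashSet source, showing the delegation described above (the class actually extends AbstractSet; constructors and most methods are omitted):

    public class HashSet<E> {
        private transient HashMap<E, Object> map = new HashMap<>();
        // the shared dummy value stored against every key
        private static final Object PRESENT = new Object();

        // put() returns the previous value, so a null result means the
        // element was not present before, i.e. add() succeeded
        public boolean add(E e) {
            return map.put(e, PRESENT) == null;
        }

        public boolean remove(Object o) {
            return map.remove(o) == PRESENT;
        }
    }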

5. Hashtable

  • Similarities and differences between Hashtable and HashMap
    • Both HashMap and Hashtable implement the Map interface
    • Hashtable is thread-safe, locking with synchronized on its methods, which makes it slow; HashMap is not thread-safe
    • Hashtable allows neither null keys nor null values; HashMap allows them, and puts the null key directly in the first position (bucket 0), as demonstrated below
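
    A quick demonstration of the difference:

      HashMap<String, String> hashMap = new HashMap<>();
      hashMap.put(null, "ok");                 // allowed: the null key goes to bucket 0
      System.out.println(hashMap.get(null));   // ok

      java.util.Hashtable<String, String> table = new java.util.Hashtable<>();
      table.put(null, "boom");                 // throws NullPointerException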

6. ConcurrentHashMap

  • Features: concurrent security, automatic expansion

  • Storage structure: the underlying structure is array + linked list + red-black tree, with heavy use of CAS operations internally; concurrency control is implemented with a combination of synchronized and CAS.

  • ConcurrentHashMap array initialization

    • Initialization uses CAS + spin to guarantee thread safety; see the abridged source below
    • 1. First check sizeCtl: if it is less than 0, another thread is already initializing the array, so the current thread yields and retries (spins)
    • 2. Otherwise the thread performs U.compareAndSwapInt(this, SIZECTL, sc, -1) to set sizeCtl to -1; if the CAS succeeds, it initializes the array. During initialization the expansion threshold is computed as n - (n >>> 2), which is exactly 0.75 * n, and that value is finally assigned back to sizeCtl. If the CAS fails, the thread spins and waits (loops and retries).
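
    The corresponding method, abridged from the JDK 8 ConcurrentHashMap source:

      private final Node<K,V>[] initTable() {
          Node<K,V>[] tab; int sc;
          while ((tab = table) == null || tab.length == 0) {
              if ((sc = sizeCtl) < 0)
                  Thread.yield();               // another thread is initializing: spin
              else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) { // claim initialization
                  try {
                      if ((tab = table) == null || tab.length == 0) {
                          int n = (sc > 0) ? sc : DEFAULT_CAPACITY;  // default 16
                          @SuppressWarnings("unchecked")
                          Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n];
                          table = tab = nt;
                          sc = n - (n >>> 2);   // threshold = n - n/4 = 0.75 * n
                      }
                  } finally {
                      sizeCtl = sc;             // publish the expansion threshold
                  }
                  break;
              }
          }
          return tab;
      }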
  • The meaning of the sizeCtl variable

    • sizeCtl = 0: the array is not initialized; its initial capacity will be the default, 16
    • sizeCtl > 0: if the array is not yet initialized, it records the initial capacity to use; if the array has been initialized, it records the array's expansion threshold
    • sizeCtl = -1: the array is being initialized
    • sizeCtl < -1: the array is being expanded; a value of -(1 + n) indicates that n threads are cooperating to complete the expansion
  • ConcurrentHashMap expansion principle

    • During expansion, it first checks whether the number of CPUs is greater than 1. If so, the migration work is divided into per-thread tasks via a ternary operator: the stride is the array length shifted right by 3 bits divided by the CPU count, otherwise simply n (with a minimum stride of 16). A thread doing expansion doubles the array, records the bucket range it starts migrating, and then migrates buckets from the back of the table toward the front. Each fully migrated bucket is replaced by a fwd (ForwardingNode) placeholder whose hash is MOVED (-1), marking it as migrated. A bucket that has not yet been migrated is locked on its head node with a synchronized (f) { } block while its data is moved (the per-bucket split is similar to HashMap's). Each thread is responsible only for its own migration tasks.

    • Multithreaded assisted expansion is triggered in two places:

      • 1. When adding an element, if the bucket the element maps to holds a fwd node, the thread first helps with the expansion and then adds the element
      • 2. After adding an element, when checking whether the expansion threshold has been reached: if sizeCtl is less than 0 and the new array is non-null, the thread assists with the in-progress expansion
  • The principle of ConcurrentHashMap put

    • In outline (a sketch follows below): spread the key's hash; if the table is empty, initialize it with initTable(); if the target bucket is empty, insert the new node with a lock-free CAS; if the bucket's head is a fwd node (hash == MOVED), help with the in-progress expansion and retry; otherwise lock the bucket's head node with synchronized and overwrite or insert into the linked list or red-black tree; finally update the count, which may trigger an expansion.
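
  An abridged sketch of JDK 8's ConcurrentHashMap.putVal() illustrating these steps (the list/tree handling inside the synchronized block and the bin-count bookkeeping are elided):

    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());         // (h ^ (h >>> 16)) & HASH_BITS
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();                 // lazy CAS + spin initialization
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                // empty bin: lock-free insert via CAS, retry on failure
                if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))
                    break;
            }
            else if (f.hash == MOVED)              // fwd node: a resize is in progress
                tab = helpTransfer(tab, f);        // help migrate, then retry
            else {
                synchronized (f) {                 // lock only this bucket's head node
                    // ... walk the list or red-black tree; overwrite the value on an
                    // equal key or append a new node; treeify when the list reaches 8
                }
                break;                             // (abridged: real code re-checks first)
            }
        }
        addCount(1L, 0);                           // update the count, maybe resize
        return null;
    }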

  • Why are null keys and null values not supported?

    • A null value is rejected because ConcurrentHashMap is used from multiple threads: when get(key) returns null, there is no way to tell whether the map contains no mapping for the key or the key is mapped to null. With a single-threaded HashMap you can call containsKey to distinguish the two cases. In ConcurrentHashMap that check is racy: suppose thread T1 calls get(key) and receives null; to verify, we call containsKey expecting false, but if another thread T2 executes put(key, null) between the get and the containsKey, then containsKey returns true, contradicting the assumption. The result is ambiguous.

      As for why the key cannot be null, the author's guess is that Doug Lea simply dislikes null; the source comments suggest that nulls are too hard to check for reliably in a concurrent setting, so null keys were disallowed from the start of the design. No official explanation seems to exist.

  • Does the get method of ConcurrentHashMap need to be locked?

    • The get method does not need to be locked, because both a node's value field (val) and its next pointer are declared volatile. Under multithreading, when thread T1 modifies a node's val or links in a new node, the change is immediately visible to a reading thread T2; see the snippet below.

    • Is the lock-free get related to the hash bucket array (table) being declared volatile?

      Not really: table is declared volatile mainly to guarantee visibility of the array reference when the table is replaced during expansion.
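
    The relevant declaration, from the JDK 8 ConcurrentHashMap source (constructor and Map.Entry methods omitted):

      static class Node<K,V> implements Map.Entry<K,V> {
          final int hash;
          final K key;
          volatile V val;           // read by get() without a lock, hence volatile
          volatile Node<K,V> next;  // likewise: a racing reader sees new links
      }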
