Why does HashMap need to be inserted first and then expanded to JDK1.8?

We know that the bottom layer of HashMap is composed of arrays, linked lists, and red-black trees. When HashMap is expanded, in addition to expanding the array capacity to twice the original size, the hash value of all elements will also be recalculated, because after the length is expanded , the hash value also changes accordingly. (jdk1.7 is expanded first and then inserted)

If it is a simple Node object, you only need to recalculate the subscript and put it in. If it is a linked list and a red-black tree, the operation will be more complicated. Let’s take a look at the expansion of HashMap under JDK1.8. What optimizations have been made to linked lists and red-black trees?

How to deal with the linked list when rehash?

Assume that the original bucket size of a HashMap is 16. 19, 3, and 35 at the position of subscript 3 form a linked list due to index conflict.

When the HashMap is expanded from 16 to 32, 19, 3, and 35 are hashed again and split into two linked lists.

Looking at the source code of JDK1.8 HashMap, we can see that the optimization operations on the linked list are as follows:

// Split the original linked list into two linked lists
// Linked list 1 is stored in the low position (original index position)
Node<K,V> loHead = null, loTail = null;
// Linked list 2 is stored in the high position (original index + old array length)
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
    next = e.next;
    // linked list 1
    if ((e.hash & amp; oldCap) == 0) {
        if (loTail == null)
            loHead = e;
        else
            loTail.next = e;
        loTail = e;
    }
    // linked list 2
    else {
        if (hiTail == null)
            hiHead = e;
        else
            hiTail.next = e;
        hiTail = e;
    }
} while ((e = next) != null);
// Linked list 1 is stored at the original index position
if (loTail != null) {
    loTail.next = null;
    newTab[j] = loHead;
}
// Linked list 2 stores the original index plus the offset of the old array length
if (hiTail != null) {
    hiTail.next = null;
    newTab[j + oldCap] = hiHead;
}

Normally, we recalculate the subscript values of all elements and then decide which bucket to put them in. JDK1.8 is optimized to directly split the linked list into high and low bits, and use bit operations to decide whether to place it at the original index or add the original index. The offset from the length of the original array. Let’s analyze it through bit operations.

Let’s first review the remainder process of the original hash:

Let’s take a look at the bit operations performed during rehash, which is the sentence e.hash & amp; oldCap:

Let’s look at the actual remainder process after expansion:

Isn’t this wave of operations very 666? Why can the integer power of 2 – 1 be used as the & amp; operation can replace the remainder calculation? Because the binary system of the integer power of 2 – 1 is quite special, it is a string of 11111, and this string of numbers 1 can be used & amp; operation, the result is to retain the low bits of the original number and remove the high bits of the original number to achieve the remainder effect. The binary system of integer powers of 2 is also quite special, that is, a 1 followed by a string of 0s.

The expansion of HashMap is to double the original size. From a binary point of view, it means adding a 0 to this string of numbers, such as 16 -> 32 = 10000 -> 100000, then its n – 1 is 15 -> 32 = 1111 ->11111. That is, there is one more digit, so the expanded subscript can be calculated from the original subscript. The difference lies in the place marked in red in the picture above. If the place marked in red is 0, then the result of finding the remainder remains unchanged after expansion. If the place marked in red is 1, then finding the remainder after expansion The remainder is the original index + original offset. How to determine whether the red mark is 0 or 1 is to use e.hash & oldCap.

Notice:

From the above example, we can see that if the hashcode is the same, the position of the element must be the same. If the hashcode is different, the position of the element may also be the same. Therefore, after expansion, the elements of the linked list may be distributed in different positions of the new array, because they The hashcode may be different.

How to deal with red-black trees when rehash?

//Red-black tree to linked list threshold
static final int UNTREEIFY_THRESHOLD = 6;
 
// Expansion operation
final Node<K,V>[] resize() {
    // ....
    else if (e instanceof TreeNode)
       ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
    // ...
}
 
final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
    TreeNode<K,V> b = this;
    // Relink into lo and hi lists, preserving order
    //The same routine as the linked list, divided into high and low bits
    TreeNode<K,V> loHead = null, loTail = null;
    TreeNode<K,V> hiHead = null, hiTail = null;
    int lc = 0, hc = 0;
    /**
      * TreeNode is indirectly inherited from Node, retains next, and can be traversed like a linked list.
      * The operation here is exactly the same as that of a linked list
      */
    for (TreeNode<K,V> e = b, next; e != null; e = next) {
        next = (TreeNode<K,V>)e.next;
        e.next = null;
        // bit is oldCap
        if ((e.hash & amp; bit) == 0) {
            if ((e.prev = loTail) == null)
                loHead = e;
            else
            // tail insert
                loTail.next = e;
            loTail = e;
             + + lc;
        }
        else {
            if ((e.prev = hiTail) == null)
                hiHead = e;
            else
                hiTail.next = e;
            hiTail = e;
             + + hc;
        }
    }
 
    //Tree low-order linked list
    if (loHead != null) {
        // If loHead is not empty and the length of the linked list is less than or equal to 6, convert the red-black tree into a linked list
        if (lc <= UNTREEIFY_THRESHOLD)
            tab[index] = loHead.untreeify(map);
        else {
            /**
              * When hiHead == null, it indicates that after expansion,
              * All nodes are still in their original positions, the tree structure remains unchanged, and there is no need to re-tree
              */
            tab[index] = loHead;
            if (hiHead != null) // (else is already treeified)
                loHead.treeify(tab);
        }
    }
    //Tree high-order linked list, the logic is consistent with the above
    if (hiHead != null) {
        if (hc <= UNTREEIFY_THRESHOLD)
            tab[index + bit] = hiHead.untreeify(map);
        else {
            tab[index + bit] = hiHead;
            if (loHead != null)
                hiHead.treeify(tab);
        }
    }
}

It can be seen from the source code that the logic of splitting the red-black tree is basically the same as that of the linked list. The difference is that after remapping, the red-black tree will be split into two linked lists. Based on the length of the linked list, it is judged whether it is necessary to split the red-black tree into two linked lists. The linked list is re-treed.

Critical Point Situation 1

Assume that the currently inserted node is a red-black tree structure with 7 nodes. If one node is moved to another index after expansion, the current red-black tree node will become 6 and it will be downgraded to a linked list and then inserted into A linked list of 7 nodes.
And if it is inserted first, it will be an 8-node red-black tree. Then, even if 1 node is moved to another position after expansion, there will still be 7 nodes and it will not be converted into a linked list.
Moreover, even if a red-black tree with more than 7 nodes is expanded first, it may be reduced to a linked list.
When expanding first and then inserting, if two consecutive nodes with the same index position are inserted into the linked list, it will become a red-black tree again.

Critical point situation 2

Contrary to the logic of situation 1, if the currently inserted nodes are 8 and it is a linked list, if 1 node is moved to another index after expansion first and then inserted again, the 8-node linked list will not turn into a red-black tree. , and if it is inserted first, it will become a red-black tree and then expanded. When 1 node is moved, it will become a red-black tree with 8 nodes. If the node is removed at this time, it may be reduced to a linked list. Let’s first analyze the removal of the node. Source code for descending linked list conditions

//This is part of the source code when removed
if (root == null || root.right == null || (rl = root.left) == null || rl.left == null) {
       tab[index] = first.untreeify(map); // too small
       return;
}

It can be seen that when removing nodes, it is not necessarily 6 nodes that will cause the red-black tree to be downgraded to a linked list. According to the characteristics of the red-black tree, it is possible that the current linked list will be downgraded to a linked list only when 3 <= Nodes <= 6.

And only when the linked list has 8 nodes, it will turn into a red-black tree when inserted first.
When inserting first and then expanding, if at least 2 and up to 5 nodes are removed continuously, it will be reduced to a linked list.

Conclusion: By comparing the above two critical situations, it is obvious that if the capacity is expanded first and then inserted, the probability of frequent changes is higher than if the capacity is inserted first and then expanded, the probability of frequent changes

Here is also a test code that simulates a HashMap to create a red-black tree. When the red-black tree has 7 nodes, expanding from 64 to 128 causes the node with key 64 to be moved to the 64th index, and the red-black tree has 7 nodes. The tree becomes a linked list

public static void main(String[] args) {
        //The default size of hashMap is 64. As long as the array length is >= 64 and the linked list is >= 8, it will be converted to a red-black tree.
        HashMap<Integer, String> map = new HashMap<>(64);
        map.put(0, "Wang Er Mazi 0");
        map.put(64, "Wang Er Mazi 1");
        map.put(128, "Wang Er Mazi 2");
        map.put(256, "Wang Ermazi 3");
        map.put(512, "Wang Er Mazi 4");
        map.put(1024, "Wang Er Mazi 5");
        map.put(2048, "Wang Er Mazi 6");
        map.put(4096, "Wang Er Mazi 7");
        map.put(8192, "Wang Er Mazi 8");
        //The above steps will generate a 9-node red-black tree structure at the 0th index.
        map.remove(8192);
        map.remove(4096);
        //After removing two, there is now a red-black tree with 7 nodes.
        //This step adds 41 nodes at other positions, as long as they are not on node 0, so that the total number of nodes reaches 64*0.75 = 48
        for (int i = 1; i <= 41; i + + ) {
            map.put(i, "Zhang San" + i);
        }
        //The expansion is triggered when inserting the 49th one. After that, the red-black tree structure becomes a linked list structure.
        map.put(42, "李思");
    }

—————-
Copyright statement: This article is an original article by CSDN blogger “Programmer Koi” and follows the CC 4.0 BY-SA copyright agreement. Please attach the original source link and this statement when reprinting.
Original link: https://blog.csdn.net/m0_68102643/article/details/124771393