Talking about the underlying expansion mechanism of HashSet and HashMap

Article directory

  • Foreword
  • 1. A brief introduction to the underlying implementation of HashSet
  • 2. What is the expansion mechanism of HashTable?
  • 3. Experiment of array expansion
    • 1. Before the experiment, you must first set up IDEA
    • 2. Add the first piece of data
  • 3. Experiment of linked list expansion
    • Step 1
    • Step 2
  • 4. Supplementary explanation
  • Summary

Foreword

Most of us know the expansion (resize) mechanism of HashSet and HashMap only by hearsay. Ask for details and many people cannot explain it, or they simply point you to the source code. But if beginners could read the source code easily, they would not need anyone to explain it. I have long wanted to watch a HashSet expand as it happens, but doing that through reflection is tedious. Recently I stumbled on a small trick, so let me walk you through how a HashSet expands, step by step.

Reminder: The following experiments are limited to IDEA

1. A brief introduction to the underlying implementation of HashSet

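The bottom layer of HashSet is a hash table: internally it wraps a HashMap. (In this article "HashTable" means the hash table inside HashMap, not the legacy java.util.Hashtable class.) Before JDK 8 that hash table was an array + linked lists; since JDK 8 it is an array + linked lists + red-black trees.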

2. What is the expansion mechanism of HashTable?

I believe most readers have heard of the expansion mechanism of HashSet; if you have not, you will know it after reading this article~
When a HashSet is created with the no-argument constructor, the underlying table starts out empty: in JDK 8 and later the array is not even allocated yet. (The HashSet is really a HashMap underneath, but the values are hidden from you.)
When the first element is added, the HashTable is expanded for the first time and the array length becomes 16. Friends who have studied ArrayList know that an ArrayList grows only when its array is full~ HashTable is different. It has a load factor (loadFactor, the "expansion factor") of 0.75, which means an expansion is triggered as soon as the number of stored elements exceeds 75% of the array length, and each expansion doubles the array length. Next, I will use HashSet as an example to give you an intuitive understanding of the expansion mechanism of HashTable.

3. Experiment of array expansion

1. Before the experiment, you must first set up IDEA


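The reason for this setting is that it lets you see the internal fields of the HashSet, and therefore the change in its capacity, while stepping through in Debug~ In IDEA this is presumably the debugger option "Enable alternative view for Collections classes" (Settings > Build, Execution, Deployment > Debugger > Data Views > Java), which has to be unchecked.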
Once that is set up, you can start experimenting.
First create a HashSet and get ready to add a lot of elements to it, with a breakpoint set at the start of the loop.
The code is as follows (example):

import java.util.HashSet;

public class Test {
    public static void main(String[] args) {
        HashSet<Integer> set = new HashSet<>();
        int i = 0;
        // Set the breakpoint on the while line and step through in Debug
        while (i < 2000) {
            i += 1;
            set.add(i); // watch size and threshold after each add
        }
        System.out.println(set.size()); // prints 2000
    }
}

Set the breakpoint on the while loop, as shown in the following figure:

Next, run with Debug and observe the contents of the debugger.

Once it is running, the debugger panel looks like this; click the small arrow next to set to expand it.

We mainly watch how the size and threshold variables change. size is the number of elements currently stored (not the length of the array). threshold is the element count above which the next expansion is triggered: the array length multiplied by the loadFactor (expansion factor) field.

So far we have confirmed that when the HashSet is first created, size and threshold are both 0.

Next we start to experiment with the next step:

2. Add the first piece of data


Once Debug has stopped at the line that adds the element, click Next (Step Over) again and you will see something magical:

size is the number of elements; only one piece of data has been added, so it is naturally 1. But threshold, the threshold mentioned above, is now 12. This confirms what was said before: when the first piece of data is added, the array is expanded for the first time, to a length of 16, and the trigger point for the next expansion is the threshold, 16 * 0.75 = 12.
Next, friends, keep clicking Next until size is 12. When the 13th piece of data is added, the threshold changes again, to 24. A threshold of 24 means the array length is now 32 (32 * 0.75 = 24), which confirms the earlier claim that each expansion doubles the array.
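The numbers seen in the debugger can also be reproduced with plain arithmetic. The following is only a minimal sketch under simple assumptions (default capacity 16, load factor 0.75, an expansion whenever size exceeds the threshold); the class and method names are made up for illustration, and it does not touch the real HashSet internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: models only the capacity/threshold bookkeeping, not a real hash table.
class ResizeSimulation {
    static final int DEFAULT_CAPACITY = 16;
    static final float LOAD_FACTOR = 0.75f;

    // Returns one log line for every simulated expansion while adding `count` distinct elements.
    static List<String> simulate(int count) {
        List<String> log = new ArrayList<>();
        int capacity = 0;   // array not allocated yet
        int threshold = 0;
        int size = 0;
        for (int n = 1; n <= count; n++) {
            if (capacity == 0) {
                // The first add allocates the array
                capacity = DEFAULT_CAPACITY;
                threshold = (int) (capacity * LOAD_FACTOR);
                log.add("add #" + n + ": capacity " + capacity + ", threshold " + threshold);
            }
            size++;
            if (size > threshold) {
                // Element count passed the threshold: double the array
                capacity *= 2;
                threshold = (int) (capacity * LOAD_FACTOR);
                log.add("add #" + n + ": capacity " + capacity + ", threshold " + threshold);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        simulate(2000).forEach(System.out::println);
    }
}
```

Under these assumptions the first lines are "add #1: capacity 16, threshold 12", "add #13: capacity 32, threshold 24" and "add #25: capacity 64, threshold 48", matching what the debugger shows.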

3. Experiment of linked list expansion

Do you remember that the bottom layer of HashSet is array + linked list + red-black tree? My confusion at the beginning was whether elements sitting in a linked list count as occupying positions in the array. In other words, is the expansion triggered when 12 array slots are occupied, or when the total number of elements in the whole HashSet, array slots and linked lists together, reaches 12?

So let's do a dedicated experiment now: add data only to a single linked list, and then watch how the capacity changes.

The Integer type alone is not enough here, so let's change the code:

Step 1

Add an Animal class with a name and an age, then use the IDE shortcuts to generate an all-arguments constructor and the getters, and override the hashCode and equals methods.

Note: I overrode hashCode here so that it always returns 0. Because every element has the same hashCode value, the bottom layer places all of them in the same array position (bucket). Since JDK 8, each new element is linked behind the existing ones, like a chain.

The code is as follows (example):

public class Animal {
    private String name;
    private int age;

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    public Animal(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Two animals are equal only if both age and name match
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Animal)) return false;

        Animal animal = (Animal) o;

        if (getAge() != animal.getAge()) return false;
        return getName() != null ? getName().equals(animal.getName()) : animal.getName() == null;
    }

    // Deliberately constant: forces every Animal into the same bucket
    @Override
    public int hashCode() {
        return 0;
    }
}
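To see why a constant hashCode puts everything in one position, it helps to compute the array index by hand. The sketch below repeats the two steps HashMap uses in JDK 8 and later (XOR the high 16 bits of hashCode into the low 16 bits, then mask with array length - 1); the class and method names are made up for illustration.

```java
// Hypothetical helper that mirrors how HashMap (JDK 8+) picks an array index for a key.
class BucketIndexDemo {
    // Spread step: mix the high 16 bits of hashCode() into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Index in an array whose length is a power of two.
    static int indexFor(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // Integer.hashCode() is the value itself, so 1..12 land in 12 different positions.
        for (int i = 1; i <= 12; i++) {
            System.out.println(i + " -> index " + indexFor(i, 16));
        }
        // A key whose hashCode() is always 0 lands at index 0, whatever the array length.
        Object constantHash = new Object() {
            @Override
            public int hashCode() {
                return 0;
            }
        };
        System.out.println("constant hash -> index " + indexFor(constantHash, 16));
        System.out.println("constant hash -> index " + indexFor(constantHash, 64));
    }
}
```

This is why the Integer experiment never built a linked list, while every Animal ends up chained at index 0 even after the array has been doubled.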

Step 2

Rewrite the test code above

The code is as follows (example):

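import java.util.HashSet;

public class Test {
    public static void main(String[] args) {
        HashSet<Animal> animals = new HashSet<>();
        int i = 0;
        // Set the breakpoint on the while line and step through in Debug
        while (i < 2000) {
            i += 1;
            // Same hashCode (0) every time, different age, so each one is a new element
            animals.add(new Animal("cat", i));
        }
        System.out.println(animals.size());
    }
}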

As before, set the breakpoint on the while line and then Debug~

The beginning looks the same, but pay attention when size reaches 8, that is, when the linked list holds 8 elements!!!
Add the 9th piece of data and the array is expanded to twice its original length!!!

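Eh??? Wasn't the capacity supposed to expand only after the number of elements passes 12? Why does it expand at 8? I checked a lot of material without finding a clear explanation, and I struggled with the source code, so I am recording this as a knowledge point; corrections from the experts in the comment area are welcome. As far as I can tell, the cause is the treeify check in HashMap: when a linked list already holds 8 nodes (TREEIFY_THRESHOLD) and another node is appended, HashMap tries to convert that list into a red-black tree, but while the array is shorter than 64 (MIN_TREEIFY_CAPACITY) it expands the array instead.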

Friends who continue the experiment will find that 8 is not the only expansion point in the linked list; 9 is one as well. When the 10th element is added, the array is expanded again: the threshold becomes 48, which means the array length is now 64. After that, the next expansion does not happen until the number of elements passes 48.

4. Supplementary explanation

When the array length has reached 64 and a single linked list grows longer than 8, that linked list (not the whole HashSet) is converted into a red-black tree~
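The whole linked list experiment can be replayed with a small model. This is a simplified sketch, not the JDK source: it assumes every element lands in the same position, borrows the names TREEIFY_THRESHOLD (8) and MIN_TREEIFY_CAPACITY (64) from the constants in java.util.HashMap, and tracks only the numbers visible in the debugger. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a HashSet in which every element falls into the same position.
class SingleBucketSimulation {
    static final int TREEIFY_THRESHOLD = 8;      // named after the HashMap constant
    static final int MIN_TREEIFY_CAPACITY = 64;  // named after the HashMap constant
    static final float LOAD_FACTOR = 0.75f;

    static List<String> simulate(int count) {
        List<String> log = new ArrayList<>();
        int capacity = 0, threshold = 0, size = 0;
        boolean treeified = false;
        for (int n = 1; n <= count; n++) {
            if (capacity == 0) {
                // The first add allocates the array
                capacity = 16;
                threshold = (int) (capacity * LOAD_FACTOR);
                log.add("add #" + n + ": array allocated, capacity " + capacity
                        + ", threshold " + threshold);
            }
            // The chain already holds `size` nodes; appending to a chain of 8 or more
            // triggers the treeify check.
            if (!treeified && size >= TREEIFY_THRESHOLD) {
                if (capacity < MIN_TREEIFY_CAPACITY) {
                    // Array still too short: expand instead of building a tree
                    capacity *= 2;
                    threshold = (int) (capacity * LOAD_FACTOR);
                    log.add("add #" + n + ": long chain, expanded to capacity " + capacity
                            + ", threshold " + threshold);
                } else {
                    treeified = true;
                    log.add("add #" + n + ": chain converted to a red-black tree");
                }
            }
            size++;
            if (size > threshold) {
                // Ordinary expansion: element count passed the threshold
                capacity *= 2;
                threshold = (int) (capacity * LOAD_FACTOR);
                log.add("add #" + n + ": size > threshold, expanded to capacity " + capacity
                        + ", threshold " + threshold);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        simulate(60).forEach(System.out::println);
    }
}
```

For 60 elements the model logs five events: the allocation at add #1, expansions at add #9 (capacity 32) and add #10 (capacity 64), the conversion to a red-black tree at add #11, and an ordinary expansion at add #49 (capacity 128). That is the same sequence the debugger shows.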

Summary

Key points
Whether it is HashSet or HashMap, the expansion mechanism is the same:

When the total number of elements, counting those in array slots and those in linked lists, passes the threshold, the array is expanded. But that is not the only trigger. When a single linked list already holds 8 or 9 elements and another one is added while the array is still shorter than 64, the array is expanded directly, whether or not the total has reached the threshold! This appears to be the same rule as in the supplementary explanation: a linked list that long would be turned into a red-black tree, but that only happens once the array length is 64, so a shorter array is doubled instead. If you know more of the details, please share them in the comment area~

I have seen many articles and bloggers describe this point only vaguely.