Why use a hash algorithm? Hand-writing a HashSet: add, delete, exists, expand

#I have been practicing algorithms and data structures recently, and their importance is self-evident. Today I’m sharing a hand-written HashSet. Let’s start from the interviewer’s perspective.#

1. Interviewer: Please tell me about your understanding of HashSet and why it uses a hash algorithm.

Little Z: Let’s talk about its official definition first (list these to the interviewer first and then explain them one by one)

1. Understanding:

1. HashSet is a set that does not allow duplicate elements.

2. It implements the Set interface and extends the AbstractSet abstract class.

3. HashSet is implemented based on HashMap. The underlying implementation of the set is a hash table, which uses a hash algorithm to store and manage the elements in the set.

4. The elements in a HashSet are unordered: iteration order is not guaranteed and may change over time.

5. HashSet allows a null element (at most one, since duplicates are not allowed).

6. HashSet is not thread-safe.
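The properties above can be checked with a few lines of Java. This is only a small illustration; the class name `HashSetBasics` and the method `demo` are made up for this example and are not part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class illustrating the listed HashSet properties.
class HashSetBasics {
    // Builds a small set: duplicates are ignored, one null is allowed.
    static Set<String> demo() {
        Set<String> set = new HashSet<>();
        set.add("apple");
        set.add("apple"); // duplicate: add returns false, set is unchanged
        set.add(null);    // a null element is allowed
        set.add(null);    // a second null counts as a duplicate
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = demo();
        System.out.println(set.size());            // 2
        System.out.println(set.contains(null));    // true
        System.out.println(set.contains("apple")); // true
    }
}
```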

2. Why use a hash algorithm (benefits)

1. Performance: HashSet has fast lookups. Because of how hash tables are designed, finding an element takes O(1) time on average. In the worst case, when many elements collide into the same bucket, lookup time can degrade to O(n), where n is the number of elements. Big O notation will not be covered in detail here. Simply put, lookups are fast because elements are partitioned into buckets by hash code, so a search only compares against the elements in one bucket instead of the whole set. (Not clear yet? Picture it like this.)

Without a HashSet, checking for a duplicate means comparing the new item against every stored item, one by one.

With a HashSet the situation is different. Imagine you are traveling with an extra-long suitcase and sort your luggage into categories, each with its own hash code. Say the hash code for a change of clothes is 2. When you want to pack a new pair of pants, their hash code also comes out as 2. There is no need to search the entire suitcase (the HashSet); you only open the bucket for hash code 2 and check whether the same pants are already there. If they are, skip adding them; if not, add them.

Stated strictly: when an element is inserted, HashSet first computes the element’s hash code and uses it to decide which bucket the element goes into. Elements whose hash codes map to the same bucket are stored together in that bucket as a linked list.
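The bucket idea can be sketched in a few lines. This is a simplified illustration, not how java.util.HashSet is actually written: the class name `TinyHashSet`, the fixed bucket count of 10, and the helper `indexFor` are all assumptions made for this example, and there is no resizing.

```java
import java.util.LinkedList;

// A minimal sketch: fixed number of buckets, each bucket is a linked list.
class TinyHashSet<E> {
    private static final int BUCKETS = 10; // assumed bucket count, never grows
    private final LinkedList<E>[] buckets;

    @SuppressWarnings("unchecked")
    TinyHashSet() {
        buckets = new LinkedList[BUCKETS];
    }

    // Maps an element to a bucket index; null always goes to bucket 0.
    private int indexFor(Object o) {
        return o == null ? 0 : Math.floorMod(o.hashCode(), BUCKETS);
    }

    // Adds the element unless an equal one is already in its bucket.
    boolean add(E e) {
        int i = indexFor(e);
        if (buckets[i] == null) {
            buckets[i] = new LinkedList<>();
        }
        if (buckets[i].contains(e)) {
            return false; // only this one bucket was searched
        }
        buckets[i].add(e);
        return true;
    }

    // Looks only inside the bucket the element would live in.
    boolean contains(Object o) {
        LinkedList<E> bucket = buckets[indexFor(o)];
        return bucket != null && bucket.contains(o);
    }
}
```

Adding "pants" twice to a `TinyHashSet<String>` returns true the first time and false the second, and only the one bucket that "pants" maps to is ever scanned.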

Benefits: Better performance. An add does not have to compare against every element, only against the elements in one bucket.

Disadvantages: It takes up extra space; hashing trades space for time. For example, the suitcase has 10 positions that may never all be filled, and the categories are not packed next to each other, because the hash code decides which position each item lands in.

Tip: Equal content always has the same hash code; the converse does not hold, since the same hash code does not imply the same content. E.g. a toothbrush and toothpaste both belong to toiletries and have the same hash code, yet they are different items.
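A concrete Java case of the tip: the strings "Aa" and "BB" have the same hash code but are not equal, so a HashSet keeps both. The class name `HashCollision` and its helper methods are made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: same hash code, different content.
class HashCollision {
    // True when both strings produce the same hash code.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    // Number of elements kept after adding both strings to a HashSet.
    static int sizeAfterAdding(String a, String b) {
        Set<String> set = new HashSet<>();
        set.add(a);
        set.add(b);
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());             // 2112
        System.out.println("BB".hashCode());             // 2112
        System.out.println(sameHash("Aa", "BB"));        // true
        System.out.println("Aa".equals("BB"));           // false
        System.out.println(sizeAfterAdding("Aa", "BB")); // 2
    }
}
```

The two strings land in the same bucket, and equals is what tells them apart.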

The interviewer may continue to ask: Why is HashSet not thread-safe?

Little Z: Because HashSet’s add method calls HashMap’s put method under the hood, and HashMap is not thread-safe. As for why HashMap is not thread-safe, please follow the link to continue learning.
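For comparison, here is a sketch of a set that can be shared between threads. It swaps in `ConcurrentHashMap.newKeySet()` from the JDK as one possible thread-safe alternative; the class name `ConcurrentAdds`, the method `addFromThreads`, and the thread counts are assumptions for this example. Passing a plain HashSet to the same method is not safe and may lose elements.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo: several threads add disjoint ranges to one shared set.
class ConcurrentAdds {
    // Starts the worker threads, waits for them, and returns the final size.
    static int addFromThreads(Set<Integer> set, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each thread gets its own range
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(base + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return set.size();
    }

    public static void main(String[] args) {
        Set<Integer> safe = ConcurrentHashMap.newKeySet(); // thread-safe set
        System.out.println(addFromThreads(safe, 4, 1000)); // 4000
    }
}
```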

3. Implement HashSet by hand

The HashSet class is located in the java.util package and must be imported before use. The syntax is as follows:

import java.util.HashSet; // import the HashSet class