The deduplication principle of HashSet

The set collection has no index value and cannot be repeated. The bottom layer is map.

When adding an element, the hashCode() method will be called first to calculate the hash value of the object, and then use the hash value % of the array length to calculate the index value position of the new element; if there is no element at the index value position, the new element will be added directly. increase,

If there is an element at the index value position, the equals() method is called to determine whether the two elements are repeated; if they are repeated, they will not be added; if they are not repeated, they will be added.

Look at the code implementation for details:

//Case 1:

The generic type of the Set collection here is String, which is a class that comes with the JDK. Hold down the ‘ctrl’ key, click on String, and view the source code: it is found that it overrides the hashCode() and equals() methods of the Object class, so it will Automatically call methods to determine and remove duplicates.

Test class: add a few pieces of data and see the output results

import java.util.HashSet;
import java.util.Set;

public class Test01 {
public static void main(String[] args) {
Set<String> set = new HashSet<>();
set.add("Zhang San");
set.add("李思");
set.add("王五");
set.add("Zhang San");
\t\t
System.out.println(set.size());
System.out.println(set);
}

}

The output is:

3
[Li Si, Zhang San, Wang Wu]

//Case 2:

Next, we customize the generic type for the Set collection and see what happens:

Customize a Student class to provide basic get and set methods, no-parameter construction and parameterized construction, and toString() method

Then create a new set collection, add a few student objects, and print to see the results.

Package HashSet deduplication principle;

import java.util.HashSet;
import java.util.Set;

public class Test02 {
public static void main(String[] args) {
//Custom generic: Student
Set<Student> set = new HashSet<>();
set.add(new Student("张三", 22, 1000));
set.add(new Student("李思", 33, 2000));
set.add(new Student("王五", 44, 3000));
set.add(new Student("张三", 22, 1000));
\t\t
System.out.println(set.size());
for (Student stu : set) {
System.out.println(stu);
}
}

}

class Student{
private String name;
private int age;
private int money;
\t
\t
@Override
public String toString() {
return "Student [name=" + name + ", age=" + age + ", money=" + money + "]";
}
public Student() {
super();
// TODO Auto-generated constructor stub
}
public Student(String name, int age, int money) {
super();
this.name = name;
this.age = age;
this.money = money;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
public int getMoney() {
return money;
}
public void setMoney(int money) {
this.money = money;
}
}

The console output is:

4
Student [name=王五, age=44, money=3000]
Student [name=Zhang San, age=22, money=1000]
Student [name=李思, age=33, money=2000]
Student [name=Zhang San, age=22, money=1000]

The customized Student class here does not override the hashCode() and equals() methods of the Object class, so no duplication judgment will be performed

//Case 3:

This time, based on situation two, we add Add hashCode() and equals() methods to the student class, and add an output statement to it, see the result

(Shortcut key: alt + shift + s)

Package HashSet deduplication principle;

import java.util.HashSet;
import java.util.Set;

public class Test02 {
public static void main(String[] args) {
//Custom generic: Student
Set<Student> set = new HashSet<>();
set.add(new Student("张三", 22, 1000));
set.add(new Student("李思", 33, 2000));
set.add(new Student("王五", 44, 3000));
set.add(new Student("张三", 22, 1000));
\t\t
System.out.println(set.size());
for (Student stu : set) {
System.out.println(stu);
}
}
}

class Student{
private String name;
private int age;
private int money;
\t
@Override
public int hashCode() {
System.out.println("The hashCode method is executed to determine whether there is an element at the index value"); //
final int prime = 31;
int result = 1;
result = prime * result + age;
result = prime * result + money;
result = prime * result + ((name == null) ? 0 : name.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
System.out.println("The equals method is executed to determine whether the old and new elements are repeated"); //
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Student other = (Student) obj;
if (age != other.age)
return false;
if (money != other.money)
return false;
if (name == null) {
if (other.name != null)
return false;
} else if (!name.equals(other.name))
return false;
return true;
}
@Override
public String toString() {
return "Student [name=" + name + ", age=" + age + ", money=" + money + "]";
}
public Student() {
super();
// TODO Auto-generated constructor stub
}
public Student(String name, int age, int money) {
super();
this.name = name;
this.age = age;
this.money = money;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
public int getMoney() {
return money;
}
public void setMoney(int money) {
this.money = money;
}
}

The output is:

The hashCode method is executed to determine whether there is an element at the index value position.
The hashCode method is executed to determine whether there is an element at the index value position.
The hashCode method is executed to determine whether there is an element at the index value position.
The hashCode method is executed to determine whether there is an element at the index value position.
The equals method is executed to determine whether the old and new elements are repeated.
3
Student [name=李思, age=33, money=2000]
Student [name=Zhang San, age=22, money=1000]
Student [name=王五, age=44, money=3000]

We found that the hashCode() method was called 4 times here. Because four new elements were added, it was judged 4 times;

The equals() method is only called once. Because the first and fourth elements are repeated among the new elements, it is only called once.

Summary: When adding elements to the Set collection:

① If the generic is a class provided by JKD, you only need to provide the basic get and set methods, parameterless construction, parameterized construction, and toString() method to achieve deduplication.

② If the generic is a custom class, you must add the hashCode() and equals() methods to achieve deduplication.


The knowledge points of the article match the official knowledge files, and you can further learn related knowledge. Algorithm skill tree Home page Overview 57169 people are learning the system