Hash collision
- 1. When does a hash conflict occur?
- 2. Linear detection method
-
- 2.1 Code implementation
- 3.Chain address method
-
- 3.1 Code implementation
1, When does a hash conflict
First, when storing elements in a hash table, the hashcode() calculation result of the value is initially stored in the array as a subscript. The size of the array is at least equal to the hashcode() value. There is a problem: if there is only one element, but he When the hashcode value is very large, there will be a serious waste of heap memory. A hash conflict occurs when multiple elements have the same hashcode() return value.
1. When 8, 1, 2, and 7 are put into the array, a hash conflict will occur when 6 is placed again.
When the array size is 4
Putting process:
- 8%4=0 puts 8 into subscript 0
- 1%4=1 put it into subscript No. 1
- 2%4=2 put subscript number 2
- 7%4=3 put it at subscript No. 3
After expansion, it needs to be rehashed.
Default size: 16
Expansion factor: 0.75
2. Linear detection method
Idea: Mainly solve the hash conflict problem based on index = val.hashcode()%element.length. Starting from the current position of the array (hashcode()), find the next empty position and store it. When the array is filled, expand the array and continue to store values. If the keys in the stored key-value pair are equal, the value overwriting operation is performed. After the expansion operation, the elements must be rehashed.
After expansion, it needs to be rehashed.
Default size: 16
Expansion factor: 0.75
2.1 Code Implementation
map:
public interface Map<K,V>{<!-- --> void put(K key,V value);//Add boolean containKey(K key);//Whether the key is included void remove(K key);//Delete k V get(K key);//Find k and return the corresponding value }
//Linear detection method public class MyHashMap<K extends Comparable<K>,V> implements Map<K,V>{<!-- --> private Entry<K,V>[] element; private int size;//Number of valid nodes public MyHashMap(){<!-- --> element = new Entry[16]; } /* rehash */ private void rehash(Entry[] newArray){<!-- --> //The original data is traversed in sequence and re-hashed into the new array int index; for(int i = 0;i< element.length;i + + ){<!-- --> if(element[i] != null){<!-- --> index = element[i].key.hashCode()% newArray.length; if(newArray[index]==null){<!-- --> newArray[index] = element[i]; }else {<!-- --> for (int j = (index + 1) % newArray.length; j != index; j = ( + + j) % newArray.length) {<!-- --> if (newArray[j] == null) {<!-- --> newArray[j] = element[i]; break; } } } } } element = newArray; } @Override public void put(K key, V value) {<!-- --> //1. If the array is "full", expansion factor => 0.75 (cannot be completely full, high efficiency), expand and re-hash if((double)size/element.length >=0.75){<!-- --> Entry[] newElement = new Entry[element.length*2]; rehash(newElement); } //2.index? If there is a hash conflict, search a circle starting from the current position and find an empty position to insert. int index = key.hashCode()% element.length;//Get the subscript of the inserted data if(element[index] == null){<!-- --> element[index] = new Entry<>(key,value); }else{<!-- --> //Element exists for(int i = (index + 1)%element.length;i != index;i=( + + i)% element.length){<!-- --> if(element[i]==null){<!-- --> element[i] = new Entry<>(key,value); break; } } } size + + ; } @Override public boolean containKey(K key) {<!-- --> for(int i = 0;i < element.length;i + + ){<!-- --> if(element[i] != null){<!-- --> if(element[i].key.compareTo(key)==0){<!-- --> return true; } } } return false; } @Override public void remove(K key) {<!-- --> for(int i = 0;i < element.length;i + + ){<!-- --> if(element[i] != null){<!-- --> if(element[i].key.compareTo(key)==0){<!-- --> element[i]=null; } } } } @Override public V get(K key) {<!-- --> for(int i = 0;i < element.length;i + + ){<!-- --> if(element[i] != null){<!-- --> if(element[i].key.compareTo(key)==0){<!-- --> return element[i].value; } } } return null; } static class Entry<K,V>{<!-- --> private K key; private V value; public Entry(K key,V value){<!-- --> this.key = key; this.value = value; } } }
3. Chain address method
Linear detection method: The more key-value pairs that produce hash conflicts, the more times the array will be traversed. In order to solve this problem, we use the combination of array and linked list. The array element is a linked list structure, resulting in a hash conflict, and the insertion operation of the linked list is performed under the same index subscript.
shortcoming:
① The disadvantage of the chain address method is that when there are more nodes that cause hash conflicts, the array index subscript search will become a linked list traversal operation. The advantage of arrays is that they randomly select values, and the time complexity reaches O(1), while the search time efficiency of linked lists reaches O(n).
② When array expansion occurs, all element nodes must be rehashed.
3.1 code implementation
Rehash:
import src15.Map; public class MyLinkedHashMap<K extends Comparable<K>,V>implements Map<K,V> {<!-- --> private Node<K,V>[] element;//Node node type private int node_size;//Save the current number of nodes private int array_size;//valid number of arrays private static final double LOADFACTOR = 0.75; private static final int INITSIZE = 10; //structure public MyLinkedHashMap(){<!-- --> element = new Node[INITSIZE]; } private void rehash(Node<K,V>[] array){<!-- --> array_size=0; for(int i = 0;i<element.length;i + + ){<!-- -->//Traverse the original array while(element[i] != null){<!-- --> Node<K,V> p = element[i];//p is used to delete nodes and insert heads into array element[i] = element[i].next;//p is independent from the current linked list int arrayIndex = p.value.key.hashCode()% array.length;//The subscript inserted into the new array if(array[arrayIndex]==null){<!-- --> array_size + + ; } p.next = array[arrayIndex]; array[arrayIndex] = p;//Insert the p header into the array } } element = array; } @Override public void put(K key, V value) {<!-- --> //map features: same keys, value overwriting Node.Entry<K,V> entryKV = containKey0(key); if(entryKV !=null){<!-- -->//Found, value overwrite entryKV.value=value; return; } //The key does not exist, insert operation if (1.0 * array_size / element.length >= LOADFACTOR) {<!-- -->//Expansion Node<K, V>[] newArray = new Node[element.length * 2]; rehash(newArray); } //1. According to the key value, find the index -> insert the head of the singly linked list int index = key.hashCode()% element.length; if(element[index] == null){<!-- --> array_size + + ; } //2. Head insertion method Node.Entry<K,V> entry = new Node.Entry<>(key,value); Node<K,V> node = new Node<>(entry); node.next = element[index];//Head insertion, the next of the new node saves the original "head' element[index] = node; // Save the new starting position of the head node_size + + ; } private Node.Entry containKey0(K key){<!-- --> int index = key.hashCode()% element.length; for(Node<K,V> p = element[index];p!=null;p=p.next){<!-- --> if(p.value.key.compareTo(key) == 0){<!-- --> return p.value; } } return null; } @Override public boolean containKey(K key) {<!-- --> return containKey0(key) != null; } @Override public void remove(K key) {<!-- --> } @Override public V get(K key) {<!-- --> Node.Entry<K,V> entry = containKey0(key); if(entry == null){<!-- --> return null; }else{<!-- --> return entry.value; } } //Singly linked list node type static class Node<K,V>{<!-- --> private Entry<K,V> value; private Node<K,V> next; //structure public Node(Entry<K, V> value) {<!-- --> this.value = value; } //Key value pair static class Entry<K,V>{<!-- --> private K key; private V value; public Entry(K key, V value) {<!-- --> this.key = key; this.value = value; } } } }
You have to study hard today~