java—–Hash conflict

Hash collision

  • 1. When does a hash conflict occur?
  • 2. Linear detection method
    • 2.1 Code implementation
  • 3.Chain address method
    • 3.1 Code implementation

1, When does a hash conflict

First, when storing elements in a hash table, the hashcode() calculation result of the value is initially stored in the array as a subscript. The size of the array is at least equal to the hashcode() value. There is a problem: if there is only one element, but he When the hashcode value is very large, there will be a serious waste of heap memory. A hash conflict occurs when multiple elements have the same hashcode() return value.
1. When 8, 1, 2, and 7 are put into the array, a hash conflict will occur when 6 is placed again.

When the array size is 4
Putting process:

  • 8%4=0 puts 8 into subscript 0
  • 1%4=1 put it into subscript No. 1
  • 2%4=2 put subscript number 2
  • 7%4=3 put it at subscript No. 3

After expansion, it needs to be rehashed.
Default size: 16
Expansion factor: 0.75

2. Linear detection method

Idea: Mainly solve the hash conflict problem based on index = val.hashcode()%element.length. Starting from the current position of the array (hashcode()), find the next empty position and store it. When the array is filled, expand the array and continue to store values. If the keys in the stored key-value pair are equal, the value overwriting operation is performed. After the expansion operation, the elements must be rehashed.


After expansion, it needs to be rehashed.
Default size: 16
Expansion factor: 0.75

2.1 Code Implementation

map:

public interface Map<K,V>{<!-- -->
    void put(K key,V value);//Add
    boolean containKey(K key);//Whether the key is included
    void remove(K key);//Delete k
    V get(K key);//Find k and return the corresponding value
}

//Linear detection method
public class MyHashMap<K extends Comparable<K>,V> implements Map<K,V>{<!-- -->

    private Entry<K,V>[] element;
    private int size;//Number of valid nodes

    public MyHashMap(){<!-- -->
        element = new Entry[16];
    }

    /*
    rehash
     */
    private void rehash(Entry[] newArray){<!-- -->
        //The original data is traversed in sequence and re-hashed into the new array
        int index;
        for(int i = 0;i< element.length;i + + ){<!-- -->
            if(element[i] != null){<!-- -->
                index = element[i].key.hashCode()% newArray.length;
                if(newArray[index]==null){<!-- -->
                    newArray[index] = element[i];
                }else {<!-- -->
                    for (int j = (index + 1) % newArray.length; j != index; j = ( + + j) % newArray.length) {<!-- -->
                        if (newArray[j] == null) {<!-- -->
                            newArray[j] = element[i];
                            break;
                        }
                    }
                }

            }
        }
        element = newArray;
    }
    @Override
    public void put(K key, V value) {<!-- -->
        //1. If the array is "full", expansion factor => 0.75 (cannot be completely full, high efficiency), expand and re-hash
        if((double)size/element.length >=0.75){<!-- -->
            Entry[] newElement = new Entry[element.length*2];
            rehash(newElement);
        }
        //2.index? If there is a hash conflict, search a circle starting from the current position and find an empty position to insert.
        int index = key.hashCode()% element.length;//Get the subscript of the inserted data
        if(element[index] == null){<!-- -->
            element[index] = new Entry<>(key,value);
        }else{<!-- --> //Element exists
            for(int i = (index + 1)%element.length;i != index;i=( + + i)% element.length){<!-- -->
                if(element[i]==null){<!-- -->
                    element[i] = new Entry<>(key,value);
                    break;
                }
            }
        }

        size + + ;

    }

    @Override
    public boolean containKey(K key) {<!-- -->
        for(int i = 0;i < element.length;i + + ){<!-- -->
            if(element[i] != null){<!-- -->
                if(element[i].key.compareTo(key)==0){<!-- -->
                    return true;
                }
            }
        }
        return false;
    }

    @Override
    public void remove(K key) {<!-- -->
        for(int i = 0;i < element.length;i + + ){<!-- -->
            if(element[i] != null){<!-- -->
                if(element[i].key.compareTo(key)==0){<!-- -->
                    element[i]=null;
                }
            }
        }

    }

    @Override
    public V get(K key) {<!-- -->
        for(int i = 0;i < element.length;i + + ){<!-- -->
            if(element[i] != null){<!-- -->
                if(element[i].key.compareTo(key)==0){<!-- -->
                    return element[i].value;
                }
            }
        }
        return null;
    }

    static class Entry<K,V>{<!-- -->
        private K key;
        private V value;

        public Entry(K key,V value){<!-- -->
            this.key = key;
            this.value = value;
        }
    }
}

3. Chain address method

Linear detection method: The more key-value pairs that produce hash conflicts, the more times the array will be traversed. In order to solve this problem, we use the combination of array and linked list. The array element is a linked list structure, resulting in a hash conflict, and the insertion operation of the linked list is performed under the same index subscript.

shortcoming:
① The disadvantage of the chain address method is that when there are more nodes that cause hash conflicts, the array index subscript search will become a linked list traversal operation. The advantage of arrays is that they randomly select values, and the time complexity reaches O(1), while the search time efficiency of linked lists reaches O(n).
② When array expansion occurs, all element nodes must be rehashed.

3.1 code implementation

Rehash:

import src15.Map;

public class MyLinkedHashMap<K extends Comparable<K>,V>implements Map<K,V> {<!-- -->
    private Node<K,V>[] element;//Node node type
    private int node_size;//Save the current number of nodes
    private int array_size;//valid number of arrays
    private static final double LOADFACTOR = 0.75;
    private static final int INITSIZE = 10;

    //structure
    public MyLinkedHashMap(){<!-- -->
        element = new Node[INITSIZE];
    }

    private void rehash(Node<K,V>[] array){<!-- -->
        array_size=0;
        for(int i = 0;i<element.length;i + + ){<!-- -->//Traverse the original array
            while(element[i] != null){<!-- -->
                Node<K,V> p = element[i];//p is used to delete nodes and insert heads into array
                element[i] = element[i].next;//p is independent from the current linked list
                int arrayIndex = p.value.key.hashCode()% array.length;//The subscript inserted into the new array
                if(array[arrayIndex]==null){<!-- -->
                    array_size + + ;
                }
                p.next = array[arrayIndex];
                array[arrayIndex] = p;//Insert the p header into the array
            }
        }
        element = array;
    }
    @Override
    public void put(K key, V value) {<!-- -->
        //map features: same keys, value overwriting
        Node.Entry<K,V> entryKV = containKey0(key);
        if(entryKV !=null){<!-- -->//Found, value overwrite
            entryKV.value=value;
            return;
        }

        //The key does not exist, insert operation
        if (1.0 * array_size / element.length >= LOADFACTOR) {<!-- -->//Expansion
            Node<K, V>[] newArray = new Node[element.length * 2];
            rehash(newArray);
        }

        //1. According to the key value, find the index -> insert the head of the singly linked list
        int index = key.hashCode()% element.length;
        if(element[index] == null){<!-- -->
            array_size + + ;
        }
        //2. Head insertion method
        Node.Entry<K,V> entry = new Node.Entry<>(key,value);
        Node<K,V> node = new Node<>(entry);
        node.next = element[index];//Head insertion, the next of the new node saves the original "head'
        element[index] = node; // Save the new starting position of the head
        node_size + + ;
    }

    private Node.Entry containKey0(K key){<!-- -->
        int index = key.hashCode()% element.length;
        for(Node<K,V> p = element[index];p!=null;p=p.next){<!-- -->
            if(p.value.key.compareTo(key) == 0){<!-- -->
                return p.value;
            }
        }
        return null;
    }
    @Override
    public boolean containKey(K key) {<!-- -->
        return containKey0(key) != null;
    }

    @Override
    public void remove(K key) {<!-- -->


    }

    @Override
    public V get(K key) {<!-- -->
        Node.Entry<K,V> entry = containKey0(key);
        if(entry == null){<!-- -->
            return null;
        }else{<!-- -->
            return entry.value;
        }
    }


    //Singly linked list node type
    static class Node<K,V>{<!-- -->
        private Entry<K,V> value;
        private Node<K,V> next;

        //structure
        public Node(Entry<K, V> value) {<!-- -->
            this.value = value;
        }

        //Key value pair
        static class Entry<K,V>{<!-- -->
            private K key;
            private V value;

            public Entry(K key, V value) {<!-- -->
                this.key = key;
                this.value = value;
            }
        }
    }


}

You have to study hard today~