Redis key-value mapping storage: Dict

Basic overview

Redis is a key-value (Key-Value Pair) database that can quickly add, delete, modify, and query data by key. The mapping between keys and values is implemented through a Dict.

Dict consists of three parts: the hash table (dictht), the hash node (dictEntry), and the dictionary itself (dict).

Hash table:

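For reference, the hash table structure as defined in Redis 6.x dict.h (comments added here):

typedef struct dictht {
    dictEntry **table;      // array of buckets; each bucket heads a linked list of entries
    unsigned long size;     // size of the array, always a power of two (2^n)
    unsigned long sizemask; // always size - 1, used to map a hash value to a bucket index
    unsigned long used;     // number of entries currently stored
} dictht;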

Hash node:

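And the hash node, from the same header (comments added here):

typedef struct dictEntry {
    void *key;                 // the key
    union {                    // the value, stored inline when it is numeric
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;    // next entry in the same bucket, for chaining
} dictEntry;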

size can only be 2^n

The sizemask must then be 2^n - 1 to have the following effect:

AND-ing a hash value with sizemask gives the same result as taking the remainder of dividing by size, but faster (this is how a hash value is mapped to a bucket index).

When adding a key-value pair to a Dict, Redis first calculates the hash value h from the key, and then uses h & sizemask to compute which index of the array the element should be stored at.

For example: to store k1=v1, assume the hash value of k1 is h = 1; then 1 & 3 = 1, so k1=v1 is stored at index 1 of the array.


(the default size is 4, i.e. DICT_HT_INITIAL_SIZE)
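A minimal, runnable sketch of the index computation (the values of size and h are made up for illustration):

#include <stdio.h>

int main(void) {
    unsigned long size = 4;            // must be 2^n (the default initial size)
    unsigned long sizemask = size - 1; // 3, i.e. binary 011
    unsigned long h = 1;               // assumed hash value of k1

    // For power-of-two sizes, h & sizemask equals h % size, but is faster
    printf("index by mask: %lu, index by mod: %lu\n", h & sizemask, h % size);
    return 0;
}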

Assuming the hash value of k2 is also 1, nodes with the same bucket index are resolved by chaining: the new node is added to the beginning of the linked list.
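Head insertion is just a couple of pointer assignments; here is an illustrative sketch assuming the dictht and dictEntry definitions above (this is not the literal Redis function, which lives inside dictAddRaw):

// Link a new entry at the head of bucket `idx` (illustrative sketch)
static void bucketInsertHead(dictht *ht, dictEntry *entry, unsigned long idx)
{
    entry->next = ht->table[idx]; // old head (possibly NULL) becomes the successor
    ht->table[idx] = entry;       // the new entry becomes the head of the chain
    ht->used++;                   // one more entry stored
}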


Dictionary:

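For reference, the dictionary itself and its type callbacks, as defined in Redis 6.2 dict.h (comments added here):

typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);   // hash function used for keys
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
    int (*expandAllowed)(size_t moreMem, double usedRatio);
} dictType;

typedef struct dict {
    dictType *type;      // type-specific callbacks (hash, compare, destructors, ...)
    void *privdata;      // private data passed to the callbacks
    dictht ht[2];        // two hash tables: ht[0] in normal use, ht[1] only during rehash
    long rehashidx;      // rehash progress; -1 means no rehash in progress
    int16_t pauserehash; // > 0 means rehashing is paused
} dict;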

Dict expansion

The HashTable in Dict is implemented as an array combined with singly linked lists. When the collection contains many elements, hash conflicts inevitably become more frequent, and if the linked lists grow too long, query efficiency drops sharply.

Dict checks the load factor (LoadFactor = used/size) every time a new key-value pair is added, and expansion of the hash table is triggered when either of the following conditions is met (see _dictExpandIfNeeded below):

  • The LoadFactor of the hash table is >= 1, and the server is not executing background child processes such as BGSAVE or BGREWRITEAOF (these consume CPU, and a moderate load factor can be tolerated while they run);
  • The LoadFactor of the hash table is > 5 (the load factor is too large to be tolerated);

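These checks correspond to _dictExpandIfNeeded in Redis 6.2 dict.c (lightly abridged); dict_can_resize is cleared while a BGSAVE/BGREWRITEAOF child process is running, and dict_force_resize_ratio is 5:

/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
    /* Incremental rehashing already in progress. Return. */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* Expand when LoadFactor (used/size) reaches 1 and resizing is allowed,
     * or when LoadFactor exceeds dict_force_resize_ratio (5). */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio) &&
        dictTypeExpandAllowed(d))
    {
        return dictExpand(d, d->ht[0].used + 1);
    }
    return DICT_OK;
}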

Contraction of Dict

In addition to expansion, Dict also checks the load factor every time an element is deleted. When LoadFactor < 0.1, the hash table is shrunk (see htNeedsResize below):

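The shrink condition corresponds to htNeedsResize in Redis 6.2 server.c (HASHTABLE_MIN_FILL is 10, i.e. a fill below 10%, which is exactly LoadFactor < 0.1), which then triggers dictResize in dict.c:

int htNeedsResize(dict *dict) {
    long long size, used;

    size = dictSlots(dict);   // total number of buckets
    used = dictSize(dict);    // number of entries stored
    // Shrink only when the table is larger than the initial size (4)
    // and less than 10% full (LoadFactor < 0.1)
    return (size > DICT_HT_INITIAL_SIZE &&
            (used*100/size < HASHTABLE_MIN_FILL));
}

int dictResize(dict *d)
{
    unsigned long minimal;

    if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;
    minimal = d->ht[0].used;
    if (minimal < DICT_HT_INITIAL_SIZE)
        minimal = DICT_HT_INITIAL_SIZE;  // never shrink below 4
    return dictExpand(d, minimal);
}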

Dict rehash

Whether expanding or shrinking, a new hash table must be created, which changes the size and sizemask of the hash table, and the bucket index of a key depends on sizemask. Therefore the index must be recalculated for every key in the hash table and the key inserted into the new hash table. This process is called rehash.

The process is like this:

① Calculate the realsize of the new hash table; the value depends on whether the operation is an expansion or a contraction:

  • If it is expansion, the new size is the first 2^n greater than or equal to dict.ht[0].used + 1
  • If it is contraction, the new size is the first 2^n greater than or equal to dict.ht[0].used (but not less than 4); see _dictNextPower below
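The size computation is _dictNextPower (Redis 6.2 dict.c), which starts at DICT_HT_INITIAL_SIZE (4) and doubles until it reaches the requested size:

static unsigned long _dictNextPower(unsigned long size)
{
    unsigned long i = DICT_HT_INITIAL_SIZE; // 4

    if (size >= LONG_MAX) return LONG_MAX + 1LU;
    while(1) {
        if (i >= size)
            return i;  // the first 2^n that is >= size
        i *= 2;
    }
}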

② Allocate memory space according to the new realsize, create a dictht, and assign it to dict.ht[1]

③ Set dict.rehashidx = 0 to mark the start of rehash

④ rehash each dictEntry in dict.ht[0] to dict.ht[1]

⑤ Assign dict.ht[1] to dict.ht[0], initialize dict.ht[1] as an empty hash table, and release the original memory of dict.ht[0]

Whether it is expansion or contraction, dictExpand() is called, which finally calls _dictExpand():

/* Expand or create the hash table,
 * when malloc_failed is non-NULL, it'll avoid panic if malloc fails (in which case it'll be set to 1).
 * Returns DICT_OK if expand was performed, and DICT_ERR if skipped. */
int _dictExpand(dict *d, unsigned long size, int* malloc_failed)
{
    if (malloc_failed) *malloc_failed = 0;

    // If a rehash is already in progress, or the requested size is smaller than the number of entries in use, return an error directly
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    // create a new hash table
    dict n; /* the new hash table */
    unsigned long realsize = _dictNextPower(size); // Compute the first power of 2 that is >= size

    // Robustness checks: guard against overflow in the size computation
    if (realsize < size || realsize * sizeof(dictEntry*) < realsize)
        return DICT_ERR;

    if (realsize == d->ht[0].size) return DICT_ERR;

    // New hash table assignment
    n.size = realsize;
    n.sizemask = realsize-1;
    if (malloc_failed) {
        n.table = ztrycalloc(realsize*sizeof(dictEntry*));
        *malloc_failed = n.table == NULL;
        if (*malloc_failed)
            return DICT_ERR;
    } else
        n.table = zcalloc(realsize*sizeof(dictEntry*));

    n.used = 0;

    /* Is this the first initialization? If so it's not really a rehashing
     * we just set the first hash table so that it can accept keys. */
    // First initialization: this is not really a rehash, just install the first hash table and return
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing */
    // Not the first initialization: install the second hash table needed for incremental rehashing
    d->ht[1] = n;
    // Setting rehashidx to 0 marks the start of rehash
    d->rehashidx = 0;
    return DICT_OK;
}

Note that _dictExpand does not actually move the nodes into dict.ht[1]: if the number of nodes were in the tens of thousands, migrating them all at once would be time-consuming and inefficient.

Dict progressive rehash

The rehash of a Dict is not done in one go. Just imagine: if the Dict contained millions of entries and the rehash had to complete in a single pass, it would very likely block the main thread. Therefore the rehash of a Dict is completed gradually, in multiple steps, which is why it is called progressive rehash.

The actual complete process is as follows:

① Calculate the size of the new hash table; the value depends on whether the operation is an expansion or a contraction:

  • If it is expansion, the new size is the first 2^n greater than or equal to dict.ht[0].used + 1
  • If it is contraction, the new size is the first 2^n greater than or equal to dict.ht[0].used (but not less than 4)

② Allocate memory space according to the new size, create a dictht, and assign it to dict.ht[1]

③ Set dict.rehashidx = 0 to mark the start of rehash

④ Rehash each dictEntry in dict.ht[0] to dict.ht[1] gradually:

Every time an add, query, update, or delete is performed, check whether dict.rehashidx is greater than -1; if so, rehash the entry list at dict.ht[0].table[rehashidx] to dict.ht[1] and then increment rehashidx, until all the data in dict.ht[0] has been rehashed into dict.ht[1]

⑤ Assign dict.ht[1] to dict.ht[0], initialize dict.ht[1] as an empty hash table, and release the memory of the original dict.ht[0]. (Each rehash step only handles the entries at a single index of the array; this repeats until all elements are migrated and the two dictht are reset.)

⑥ Assign rehashidx to -1, which means the end of rehash

During the rehash process, a new addition is written directly into ht[1], while queries, updates, and deletions are performed on both dict.ht[0] and dict.ht[1] in turn. This ensures that the data in ht[0] only decreases and never grows, so it eventually becomes empty as the rehash proceeds.
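The per-step migration is dictRehash (Redis 6.2 dict.c, lightly abridged); each of those operations calls _dictRehashStep, which is simply dictRehash(d, 1):

/* Performs N steps of incremental rehashing. Returns 1 if there are still
 * keys to move from the old to the new hash table, otherwise 0. */
int dictRehash(dict *d, int n) {
    int empty_visits = n*10; /* Max number of empty buckets to visit. */
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        /* Skip empty buckets, giving up after empty_visits of them */
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx];
        /* Move all the keys in this bucket from the old to the new hash table */
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];  // head insertion into ht[1]
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Check if we already rehashed the whole table */
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];    // ht[1] becomes the main table
        _dictReset(&d->ht[1]);  // reset ht[1] to an empty hash table
        d->rehashidx = -1;      // mark the end of rehash
        return 0;
    }

    /* More to rehash... */
    return 1;
}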

Summary

Dict structure:

  • Similar to Java's HashTable: the bottom layer is an array plus linked lists to resolve hash conflicts
  • Dict contains two hash tables; ht[0] is the one normally used, ht[1] is used for rehash

Expansion and contraction of Dict:

  • When LoadFactor is greater than 5, or LoadFactor is greater than 1 and there is no child-process task (BGSAVE/BGREWRITEAOF), the Dict expands
  • The Dict shrinks when LoadFactor is less than 0.1 and the hash table size is greater than the initial value of 4
  • The expansion size is the first 2^n greater than or equal to used + 1
  • The contraction size is the first 2^n greater than or equal to used
  • Dict adopts progressive rehash, performing one rehash step on every access to the Dict, until all elements have been rehashed
  • During rehash, ht[0] only shrinks and never grows; new additions go only into ht[1], while other operations are performed on both hash tables