Redis RDB persistence

Foreword

We know that Redis is fast largely because it keeps its data directly in memory. Memory is volatile, however: data survives only while the power is on and is lost when the power goes off.
How much this matters depends on your application scenario. If you only use Redis as a cache in front of a relational database to speed up data access, losing the Redis data does no lasting harm, because it can be reloaded from the relational database. But if you use Redis directly as a database and store business data in it, you need to pay attention to the persistence mechanism of Redis.
RDB is short for Redis DataBase. It is the simplest persistence mechanism Redis provides, also called a memory snapshot: it writes all the data as of a certain moment to disk, and after a restart Redis loads the RDB file to restore the data.

Trigger

First, configure the name of the persistence file. The default is dump.rdb:

dbfilename dump.rdb

Then you can trigger RDB persistence manually with the save and bgsave commands. The difference is that save does the work in the main thread and blocks it, while bgsave forks a child process that writes the snapshot asynchronously.

127.0.0.1:6379[1]> save
OK
127.0.0.1:6379[1]> bgsave
Background saving started
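
To make the difference concrete, here is a minimal POSIX sketch of the fork-based approach. It is not Redis code: dataset, snapshot_to_file and background_snapshot are hypothetical names, and the "database" is a single string. The child writes the data as it looked at the moment of the fork, while the parent keeps modifying its own copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical in-memory "dataset": a single string. */
char dataset[64] = "name=jackson";

/* Write the dataset as this process currently sees it. Returns 0 on success. */
int snapshot_to_file(const char *path) {
    FILE *fp = fopen(path, "w");
    if (!fp) return -1;
    size_t len = strlen(dataset);
    int ok = (fwrite(dataset, 1, len, fp) == len);
    if (fclose(fp) != 0) ok = 0;
    return ok ? 0 : -1;
}

/* bgsave-like flow: the child persists, the parent keeps accepting writes. */
int background_snapshot(const char *path) {
    assert(path != NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: works on its own copy-on-write view of the memory. */
        _exit(snapshot_to_file(path) == 0 ? 0 : 1);
    }
    /* Parent: this write happens after the fork and is invisible to the child. */
    strcpy(dataset, "name=changed");
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After background_snapshot() returns, the file holds name=jackson even though the parent's dataset already reads name=changed. Calling snapshot_to_file() directly in the parent would correspond to save: nothing else can run until the write finishes.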

You can also configure Redis to trigger RDB persistence automatically once a given number of write changes has accumulated within a given period. Each rule has the form save <seconds> <changes>:

save 3600 1
save 300 100
save 60 10000
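
How such rules could be evaluated can be sketched in a few lines of C. This is an illustration under assumptions, not the Redis implementation: struct save_rule and should_snapshot are hypothetical names, dirty stands for the number of changes since the last save, and a rule fires when at least changes writes have accumulated and more than seconds have passed.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* One "save <seconds> <changes>" rule (hypothetical struct). */
struct save_rule {
    time_t seconds;
    long long changes;
};

/* Returns 1 if any rule is satisfied, 0 otherwise. */
int should_snapshot(const struct save_rule *rules, size_t nrules,
                    long long dirty, time_t last_save, time_t now) {
    assert(rules != NULL || nrules == 0);
    for (size_t i = 0; i < nrules; i++) {
        if (dirty >= rules[i].changes && now - last_save > rules[i].seconds)
            return 1;
    }
    return 0;
}
```

With the three rules above, 100 changes and 301 seconds since the last save satisfy save 300 100, while a single change after 301 seconds has to wait for the save 3600 1 rule.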

In addition, Redis triggers RDB persistence in other situations, such as a normal shutdown of Redis or a full data synchronization with a replica node.
Note that RDB persistence comes at a cost and should not be triggered frequently. Take the save command as an example: it blocks the main thread. With 4 GB of data and a disk write bandwidth of 100 MB/s, the save blocks for at least 40 seconds, during which Redis cannot process any other command. This is obviously unacceptable. And even though bgsave persists asynchronously, the main process is still blocked while it forks the child process, and the more data there is, the longer the fork takes. While the child process is persisting, a large number of writes causes a large number of copy-on-write operations, which seriously affects Redis performance.
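
The 40-second figure is simply data size divided by write bandwidth. A tiny sketch of that arithmetic, with min_save_seconds as a hypothetical helper and the simplifying assumption that the RDB file is about as large as the in-memory data (in practice it is usually smaller):

```c
#include <assert.h>
#include <stdint.h>

/* Lower bound on the blocking time of a synchronous save, in seconds. */
double min_save_seconds(uint64_t dataset_bytes, uint64_t disk_bytes_per_sec) {
    assert(disk_bytes_per_sec > 0);
    return (double)dataset_bytes / (double)disk_bytes_per_sec;
}
```

For 4 GiB at 100 MiB/s, min_save_seconds(4ULL << 30, 100ULL << 20) comes out at roughly 41 seconds.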

RDB file format

The dumped RDB file consists of three parts:

  • File header: magic number, RDB version, Redis version, creation time and other information
  • Database data: all key-value pairs of each database
  • End of file: terminator, checksum

Redis first writes the file header, which mainly includes the Redis version, the architecture Redis is running on (32-bit or 64-bit), the creation time, the amount of memory used, and so on.

Property      Example value  Description
Magic         REDIS0009      Magic number: "REDIS" plus the 4-digit RDB version
redis-ver     6.2.13         Redis version number
redis-bits    64             Architecture: 32 or 64 bits
ctime         1607211828     Creation time (Unix timestamp)
used-mem      82718          Used memory size in bytes
aof-preamble  1              Whether the RDB is written as an AOF preamble

Redis then writes all the key-value data of each database. Each database section consists of two parts:

  • RDB_OPCODE_SELECTDB: SELECT-DB operation code, determines which database subsequent key-value pairs belong to
  • several key-value pairs

The key-value pair itself has some additional information, such as expiration time, LRU/LFU information, etc. Redis defines a batch of operation codes to identify this information. The information of the key-value pair is as follows:

Attribute                 Opcode  Description
RDB_OPCODE_EXPIRETIME_MS  252     Expiration time (optional)
RDB_OPCODE_IDLE           248     LRU idle time (optional)
RDB_OPCODE_FREQ           249     LFU access frequency (optional)
ObjectType                0~7     Object type
Key Length                -       Length of the key
Key                       -       Key
Value                     -       Value (stored differently depending on the type)
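
The layout in the table can be sketched as a small encoder. This is a simplified illustration rather than Redis code: encode_string_pair and put_short_string are hypothetical helpers, only string values shorter than 64 bytes are handled, the LRU/LFU fields are left out, and the caller is assumed to supply a buffer of at least 140 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RDB_OPCODE_EXPIRETIME_MS 252
#define RDB_TYPE_STRING 0

/* Write a short string as a one-byte length followed by its bytes. */
static size_t put_short_string(unsigned char *out, const char *s) {
    size_t len = strlen(s);
    assert(len < 64); /* this sketch only covers the one-byte length form */
    out[0] = (unsigned char)len;
    memcpy(out + 1, s, len);
    return 1 + len;
}

/* Encode one string pair; expire_ms == -1 means the key has no expiry.
 * Returns the number of bytes written to out. */
size_t encode_string_pair(unsigned char *out, const char *key,
                          const char *value, long long expire_ms) {
    assert(out != NULL && key != NULL && value != NULL);
    size_t n = 0;
    if (expire_ms != -1) {
        out[n++] = RDB_OPCODE_EXPIRETIME_MS;
        for (int i = 0; i < 8; i++) /* 8-byte timestamp, little-endian */
            out[n++] = (unsigned char)(((uint64_t)expire_ms >> (8 * i)) & 0xff);
    }
    out[n++] = RDB_TYPE_STRING;
    n += put_short_string(out + n, key);
    n += put_short_string(out + n, value);
    return n;
}
```

encode_string_pair(buf, "name", "jackson", -1) yields the 14 bytes 00 04 6e 61 6d 65 07 6a 61 63 6b 73 6f 6e, which is the same sequence that represents the name/jackson pair in the hex dump later in this article.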

Finally, Redis writes the end of the file, which contains two parts:

  • EOF: the RDB end-of-file opcode
  • checksum: an 8-byte checksum used to detect a damaged or altered file
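
To my knowledge the checksum is a CRC-64 using the Jones polynomial, catalogued as CRC-64/REDIS. The bit-by-bit sketch below shows the idea; crc64_update is a hypothetical name, and real implementations use lookup tables for speed. Being a CRC, it catches accidental corruption but offers no protection against deliberate tampering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reversed form of the Jones polynomial 0xad93d23594c935a9. */
#define CRC64_POLY_REFLECTED 0x95ac9329ac4bc9b5ULL

/* Feed len bytes into a running CRC; start with crc = 0. */
uint64_t crc64_update(uint64_t crc, const unsigned char *data, size_t len) {
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (crc >> 1) ^ CRC64_POLY_REFLECTED : crc >> 1;
    }
    return crc;
}
```

Because the function takes the running value as its first argument, the checksum can be updated incrementally as each chunk of the file is written, and the final value is appended after the EOF opcode.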

Seeing is believing, let’s test it out and see what the RDB file looks like. First clear the database, then write a string:

127.0.0.1:6379[1]> flushall
OK
127.0.0.1:6379[1]> set name jackson
OK
127.0.0.1:6379[1]> save
OK

Then view the RDB file. Because it is binary, it cannot be read directly, so dump each byte in hexadecimal together with its ASCII character. In order, you can see the magic number, the Redis version number, the architecture information, the creation time, the memory usage, the AOF preamble flag, and then the key-value data, followed by the EOF opcode and the checksum.

od -A x -t x1c -v dump.rdb
0000000 52 45 44 49 53 30 30 30 39 fa 09 72 65 64 69 73
           R E D I S 0 0 0 9 372 \t r e d i s
0000010 2d 76 65 72 06 36 2e 32 2e 31 33 fa 0a 72 65 64
           - v e r 006 6 . 2 . 1 3 372 \n r e d
0000020 69 73 2d 62 69 74 73 c0 40 fa 05 63 74 69 6d 65
           i s - b i t s 300 @ 372 005 c t i m e
0000030 c2 a3 05 22 65 fa 08 75 73 65 64 2d 6d 65 6d c2
           £ ** 005 " e 372 \b u s e d - m e m 302
0000040 30 00 11 00 fa 0c 61 6f 66 2d 70 72 65 61 6d 62
           0 \0 021 \0 372 \f a o f - p r e a m b
0000050 6c 65 c0 00 fe 01 fb 01 00 00 04 6e 61 6d 65 07
           l e 300 \0 376 001 373 001 \0 \0 004 n a m e \a
0000060 6a 61 63 6b 73 6f 6e ff a1 ce 0b 3e b5 94 1c 18
           j a c k s o n 377 241 316 \v > 265 224 034 030
0000070
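
The bytes that sit between the readable strings are length and integer encodings. The top two bits of the first byte select the form: 00 is a 6-bit length, 01 a 14-bit length, 11 a special encoding such as a small integer. The decoder below is a sketch with a hypothetical name, rdb_decode_len, and it deliberately skips the 32/64-bit length forms and compressed strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the field starting at p. Stores the number in *value, sets *is_int
 * to 1 if it is an integer-encoded string rather than a length, and returns
 * the number of bytes consumed (0 for forms this sketch does not handle). */
size_t rdb_decode_len(const unsigned char *p, int64_t *value, int *is_int) {
    assert(p != NULL && value != NULL && is_int != NULL);
    *is_int = 0;
    switch (p[0] >> 6) {
    case 0: /* 00xxxxxx: length in the low 6 bits */
        *value = p[0] & 0x3f;
        return 1;
    case 1: /* 01xxxxxx: 14-bit length, high bits first */
        *value = ((p[0] & 0x3f) << 8) | p[1];
        return 2;
    case 3: /* 11xxxxxx: special encoding, integers are little-endian */
        *is_int = 1;
        switch (p[0] & 0x3f) {
        case 0:
            *value = (int8_t)p[1];
            return 2;
        case 1:
            *value = (int16_t)(p[1] | (p[2] << 8));
            return 3;
        case 2:
            *value = (int32_t)((uint32_t)p[1] | ((uint32_t)p[2] << 8) |
                               ((uint32_t)p[3] << 16) | ((uint32_t)p[4] << 24));
            return 5;
        default: /* compressed string: not handled here */
            *is_int = 0;
            return 0;
        }
    default: /* 10xxxxxx: 32/64-bit lengths, not handled here */
        return 0;
    }
}
```

Applied to the dump: 06 before 6.2.13 is a plain length of 6, c0 40 after redis-bits is the integer 64, c2 a3 05 22 65 after ctime is the timestamp 1696728483, and c2 30 00 11 00 after used-mem is 1114160 bytes.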

Source code

Redis defines a set of operation codes in src/rdb.h to distinguish the commands and attributes written to RDB files, such as selecting a database, recording an object's expiration time, and marking the end of the file.

#define RDB_OPCODE_MODULE_AUX 247 /* Module auxiliary data. */
#define RDB_OPCODE_IDLE 248 /* LRU idle time. */
#define RDB_OPCODE_FREQ 249 /* LFU frequency. */
#define RDB_OPCODE_AUX 250 /* RDB aux field. */
#define RDB_OPCODE_RESIZEDB 251 /* Hash table resize hint. */
#define RDB_OPCODE_EXPIRETIME_MS 252 /* Expire time in milliseconds. */
#define RDB_OPCODE_EXPIRETIME 253 /* Old expire time in seconds. */
#define RDB_OPCODE_SELECTDB 254 /* DB number of the following keys. */
#define RDB_OPCODE_EOF 255 /* End of the RDB file. */

A set of object types is also defined there, mapping Redis object types to RDB object types:

#define RDB_TYPE_STRING 0
#define RDB_TYPE_LIST 1
#define RDB_TYPE_SET 2
#define RDB_TYPE_ZSET 3
#define RDB_TYPE_HASH 4
#define RDB_TYPE_ZSET_2 5
#define RDB_TYPE_MODULE 6
#define RDB_TYPE_MODULE_2 7

The entry method for RDB persistence is rdbSave(), and the source code is in the src/rdb.c file.
Redis will first generate a temporary file based on the process ID, then start performing RDB persistence to write the temporary file, and finally replace the old file.

int rdbSave(char *filename, rdbSaveInfo *rsi) {
    char tmpfile[256];
    FILE *fp = NULL;
    rio rdb;
    int error = 0;

    // Generate a temporary file named after the process ID
    snprintf(tmpfile,256,"temp-%d.rdb", (int) getpid());
    fp = fopen(tmpfile,"w");
    if (!fp) return C_ERR;
    rioInitWithFile(&rdb,fp);
    // Write the RDB data to the temporary file
    if (rdbSaveRio(&rdb,&error,RDBFLAGS_NONE,rsi) == C_ERR) {
        errno = error;
        goto werr;
    }
    // Flush to disk and close the file
    if (fflush(fp)) goto werr;
    if (fsync(fileno(fp))) goto werr;
    if (fclose(fp)) { fp = NULL; goto werr; }
    fp = NULL;
    // Atomically replace the old RDB file
    if (rename(tmpfile,filename) == -1) {
        unlink(tmpfile);
        return C_ERR;
    }
    return C_OK;

werr:
    // Abridged error path: close and remove the temporary file
    if (fp) fclose(fp);
    unlink(tmpfile);
    return C_ERR;
}
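
The same write-to-temp-then-rename pattern works outside Redis too. The following is a standalone sketch for POSIX systems, with atomic_write_file as a hypothetical helper name. Because rename() swaps the file in a single step, a reader sees either the complete old file or the complete new one, never a half-written file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Write len bytes to path via a temporary file. Returns 0 on success. */
int atomic_write_file(const char *path, const void *data, size_t len) {
    assert(path != NULL && (data != NULL || len == 0));
    char tmpfile[256];
    FILE *fp = NULL;

    /* Temporary name derived from the process ID, as in rdbSave(). */
    snprintf(tmpfile, sizeof(tmpfile), "temp-%d.tmp", (int)getpid());
    fp = fopen(tmpfile, "w");
    if (!fp) return -1;

    if (len > 0 && fwrite(data, 1, len, fp) != len) goto werr;
    if (fflush(fp) != 0) goto werr;           /* stdio buffer -> kernel */
    if (fsync(fileno(fp)) != 0) goto werr;    /* kernel cache -> disk */
    if (fclose(fp) != 0) { fp = NULL; goto werr; }
    fp = NULL;

    if (rename(tmpfile, path) != 0) goto werr; /* atomic replacement */
    return 0;

werr:
    if (fp) fclose(fp);
    unlink(tmpfile);
    return -1;
}
```

If anything fails before the rename, the previous file at path is left untouched, which is why a crash in the middle of a save does not destroy the last good snapshot.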

The core method of persistence is rdbSaveRio(). The main steps are:

  • Write file header information
  • Traverse the database
    • Write SELECTDB opcode, database number
    • Write the RESIZEDB opcode, followed by the sizes of the global hash table and of the hash table of keys with an expiration time
    • Traverse the hash table and write information about each key-value pair
  • Write EOF opcode and checksum

int rdbSaveRio(rio *rdb, int *error, int rdbflags, rdbSaveInfo *rsi) {
    // Hash table iterator; each database records its key-value pairs in a global hash table
    dictIterator *di = NULL;
    // Hash table node pointer, used to traverse the key-value pairs
    dictEntry *de;
    // Magic number: "REDIS" + 4-digit RDB version + terminating NUL
    char magic[10];
    // File checksum, used to detect corruption
    uint64_t cksum;
    size_t processed = 0;
    int j;
    long key_count = 0;
    long long info_updated_time = 0;
    char *pname = (rdbflags & RDBFLAGS_AOF_PREAMBLE) ? "AOF rewrite" : "RDB";
    if (server.rdb_checksum)
        rdb->update_cksum = rioGenericUpdateChecksum;
    // Format the magic number into the buffer
    snprintf(magic,sizeof(magic),"REDIS%04d",RDB_VERSION);
    // Write the magic number to the RDB file
    if (rdbWriteRaw(rdb,magic,9) == -1) goto werr;
    // Write the other header fields: Redis version, creation time, memory size, etc.
    if (rdbSaveInfoAuxFields(rdb,rdbflags,rsi) == -1) goto werr;
    if (rdbSaveModulesAux(rdb, REDISMODULE_AUX_BEFORE_RDB) == -1) goto werr;
    // Traverse the databases
    for (j = 0; j < server.dbnum; j++) {
        redisDb *db = server.db+j;
        // The global hash table of the database, that is, all key-value pairs
        dict *d = db->dict;
        if (dictSize(d) == 0) continue; // No data, skip
        // Hash table iterator
        di = dictGetSafeIterator(d);
        // Write the SELECTDB opcode first
        if (rdbSaveType(rdb,RDB_OPCODE_SELECTDB) == -1) goto werr;
        // Then write the database number
        if (rdbSaveLen(rdb,j) == -1) goto werr;
        // Write the RESIZEDB opcode and the sizes of the global and expires hash tables
        uint64_t db_size, expires_size;
        db_size = dictSize(db->dict);
        expires_size = dictSize(db->expires);
        if (rdbSaveType(rdb,RDB_OPCODE_RESIZEDB) == -1) goto werr;
        if (rdbSaveLen(rdb,db_size) == -1) goto werr;
        if (rdbSaveLen(rdb,expires_size) == -1) goto werr;
        // Traverse the hash table and write each key-value pair
        while((de = dictNext(di)) != NULL) {
            // Take out the key and the value
            sds keystr = dictGetKey(de);
            robj key, *o = dictGetVal(de);
            long long expire;
            initStaticStringObject(key,keystr);
            expire = getExpire(db,&key);
            // Write the key-value pair
            if (rdbSaveKeyValuePair(rdb,&key,o,expire) == -1) goto werr;
            if (rdbflags & RDBFLAGS_AOF_PREAMBLE &&
                rdb->processed_bytes > processed+AOF_READ_DIFF_INTERVAL_BYTES)
            {
                processed = rdb->processed_bytes;
                aofReadDiffFromParent();
            }
            // Report progress to the parent at most once per second
            if ((key_count++ & 1023) == 0) {
                long long now = mstime();
                if (now - info_updated_time >= 1000) {
                    sendChildInfo(CHILD_INFO_TYPE_CURRENT_INFO, key_count, pname);
                    info_updated_time = now;
                }
            }
        }
        dictReleaseIterator(di);
        di = NULL; /* So that we don't release it again on error. */
    }
    // Save the cached Lua scripts as aux fields
    if (rsi && dictSize(server.lua_scripts)) {
        di = dictGetIterator(server.lua_scripts);
        while((de = dictNext(di)) != NULL) {
            robj *body = dictGetVal(de);
            if (rdbSaveAuxField(rdb,"lua",3,body->ptr,sdslen(body->ptr)) == -1)
                goto werr;
        }
        dictReleaseIterator(di);
        di = NULL;
    }
    if (rdbSaveModulesAux(rdb, REDISMODULE_AUX_AFTER_RDB) == -1) goto werr;
    // Write the EOF opcode 0xff
    if (rdbSaveType(rdb,RDB_OPCODE_EOF) == -1) goto werr;
    // Write the checksum (8 bytes, little-endian)
    cksum = rdb->cksum;
    memrev64ifbe(&cksum);
    if (rioWrite(rdb,&cksum,8) == 0) goto werr;
    return C_OK;
werr:
    if (error) *error = errno;
    if (di) dictReleaseIterator(di);
    return C_ERR;
}
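
Stripped of the aux fields, the key-value payload and the checksum, the framing that rdbSaveRio() produces for one database is only a handful of bytes. The sketch below is not Redis code: write_rdb_frame is a hypothetical function, it writes into a caller-supplied buffer of at least 15 bytes, and it only accepts values below 64 so that every length fits in one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define RDB_VERSION 9
#define RDB_OPCODE_RESIZEDB 251
#define RDB_OPCODE_SELECTDB 254
#define RDB_OPCODE_EOF 255

/* Write magic + SELECTDB + RESIZEDB + EOF. Returns the bytes written. */
size_t write_rdb_frame(unsigned char *out, unsigned dbnum,
                       unsigned db_size, unsigned expires_size) {
    assert(out != NULL);
    assert(dbnum < 64 && db_size < 64 && expires_size < 64);
    char magic[10];
    size_t n = 0;

    /* "REDIS" followed by the zero-padded 4-digit RDB version. */
    snprintf(magic, sizeof(magic), "REDIS%04d", RDB_VERSION);
    memcpy(out, magic, 9); /* the terminating NUL is not written */
    n = 9;

    out[n++] = RDB_OPCODE_SELECTDB;
    out[n++] = (unsigned char)dbnum;
    out[n++] = RDB_OPCODE_RESIZEDB;
    out[n++] = (unsigned char)db_size;
    out[n++] = (unsigned char)expires_size;
    /* The key-value pairs of this database would be written here. */
    out[n++] = RDB_OPCODE_EOF;
    return n;
}
```

write_rdb_frame(buf, 1, 1, 0) produces REDIS0009 followed by fe 01 fb 01 00 ff, which matches the opcodes visible around the key-value pair in the hex dump above.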

The method of writing key-value pairs is rdbSaveKeyValuePair(). The main steps are:

  • Write expiration time opcode and value (optional)
  • Write LRU opcode and idle time (optional)
  • Write LFU opcode and access frequency information (optional)
  • Write the type, the key, and the value of the key-value pair, in that order

int rdbSaveKeyValuePair(rio *rdb, robj *key, robj *val, long long expiretime) {
    int savelru = server.maxmemory_policy & MAXMEMORY_FLAG_LRU;
    int savelfu = server.maxmemory_policy & MAXMEMORY_FLAG_LFU;

    // Write the expiration time
    if (expiretime != -1) {
        if (rdbSaveType(rdb,RDB_OPCODE_EXPIRETIME_MS) == -1) return -1;
        if (rdbSaveMillisecondTime(rdb,expiretime) == -1) return -1;
    }

    // Write the LRU opcode and idle time
    if (savelru) {
        uint64_t idletime = estimateObjectIdleTime(val);
        idletime /= 1000; /* Using seconds is enough and requires less space.*/
        if (rdbSaveType(rdb,RDB_OPCODE_IDLE) == -1) return -1;
        if (rdbSaveLen(rdb,idletime) == -1) return -1;
    }

    // Write the LFU opcode and access frequency
    if (savelfu) {
        uint8_t buf[1];
        buf[0] = LFUDecrAndReturn(val);
        if (rdbSaveType(rdb,RDB_OPCODE_FREQ) == -1) return -1;
        if (rdbWriteRaw(rdb,buf,1) == -1) return -1;
    }

    // Write the object type, the key, and the value, in that order
    if (rdbSaveObjectType(rdb,val) == -1) return -1;
    if (rdbSaveStringObject(rdb,key) == -1) return -1;
    if (rdbSaveObject(rdb,val,key) == -1) return -1;
    // Optional artificial delay, used for testing
    if (server.rdb_key_save_delay)
        debugDelay(server.rdb_key_save_delay);
    return 1;
}

rdbSaveObjectType() writes the object type: Redis maps the type of the RedisObject to an RDB object type, which is a number. The key and value follow. Because the key is a string, it is written by rdbSaveStringObject(). That function checks whether the string object is integer-encoded. If so, it writes it as an integer, otherwise it writes a length-prefixed string.

ssize_t rdbSaveStringObject(rio *rdb, robj *obj) {
    // Prefer the integer encoding when the object is integer-encoded
    if (obj->encoding == OBJ_ENCODING_INT) {
        return rdbSaveLongLongAsStringObject(rdb,(long)obj->ptr);
    } else { // Otherwise write a raw string
        serverAssertWithInfo(NULL,obj,sdsEncodedObject(obj));
        return rdbSaveRawString(rdb,obj->ptr,sdslen(obj->ptr));
    }
}
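
The integer form saves space: a value that fits in 8, 16 or 32 signed bits is stored as one marker byte plus 1, 2 or 4 little-endian bytes instead of its decimal digits. The sketch below reflects my understanding of that encoding. encode_small_integer is a hypothetical name, and a return value of 0 means the caller should fall back to writing the number as an ordinary string.

```c
#include <assert.h>
#include <stddef.h>

#define RDB_ENCVAL 3     /* top two bits 11: special encoding */
#define RDB_ENC_INT8 0
#define RDB_ENC_INT16 1
#define RDB_ENC_INT32 2

/* Encode value into enc (at least 5 bytes). Returns the bytes used, or 0
 * if the value does not fit in 32 signed bits. */
size_t encode_small_integer(long long value, unsigned char *enc) {
    assert(enc != NULL);
    unsigned long long u = (unsigned long long)value; /* two's complement bits */
    if (value >= -(1 << 7) && value <= (1 << 7) - 1) {
        enc[0] = (RDB_ENCVAL << 6) | RDB_ENC_INT8;
        enc[1] = (unsigned char)(u & 0xff);
        return 2;
    } else if (value >= -(1 << 15) && value <= (1 << 15) - 1) {
        enc[0] = (RDB_ENCVAL << 6) | RDB_ENC_INT16;
        enc[1] = (unsigned char)(u & 0xff);
        enc[2] = (unsigned char)((u >> 8) & 0xff);
        return 3;
    } else if (value >= -(1LL << 31) && value <= (1LL << 31) - 1) {
        enc[0] = (RDB_ENCVAL << 6) | RDB_ENC_INT32;
        enc[1] = (unsigned char)(u & 0xff);
        enc[2] = (unsigned char)((u >> 8) & 0xff);
        enc[3] = (unsigned char)((u >> 16) & 0xff);
        enc[4] = (unsigned char)((u >> 24) & 0xff);
        return 5;
    }
    return 0;
}
```

This is where the c0 40 after redis-bits in the earlier hex dump comes from: encode_small_integer(64, enc) writes exactly those two bytes.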

The value is serialized differently depending on the object type. This is done by rdbSaveObject(), whose code is not reproduced here.

Summary

RDB persistence writes all the key-value pairs of the Redis database as of a certain moment to a file on disk. Because the file is in a compact binary format, recovery is very fast, which makes RDB well suited to data backup and master-slave replication. The dumped RDB file consists of a file header, the database data, and a file tail. The file header mainly records the Redis version, the architecture information, and so on. Redis then traverses the databases: for each one it first writes the SELECT-DB opcode and then traverses the hash table to write all key-value pairs. Finally it writes the terminator and the checksum. The main purpose of the checksum is to detect an RDB file that has been damaged or altered.