Redis: AOF and RDB persistence

AOF and RDB

Redis persistence stores Redis data on disk so that it is not lost when Redis crashes or the server restarts. There are two persistence methods: AOF, which saves operations, and RDB, which saves data.

AOF

AOF (append-only file) persistence records each write operation as a log entry appended to the end of the AOF file.
Redis does not enable AOF by default. When AOF is enabled, Redis re-executes the commands in the AOF file on restart to restore the data.
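The idea of restoring data by replaying a log can be sketched in a few lines of Python. This is a toy model, not Redis code: the `apply` and `replay` helpers and the tuple-based command format are assumptions made for illustration only.

```python
def apply(store, command):
    """Apply one write command, given as a tuple, to an in-memory dict."""
    op, key, *args = command
    if op == "SET":
        store[key] = args[0]
    elif op == "DEL":
        store.pop(key, None)
    else:
        raise ValueError(f"unsupported command: {op}")


def replay(log):
    """Rebuild the data from scratch by re-executing every logged command."""
    store = {}
    for command in log:
        apply(store, command)
    return store


# Normal operation: each write is applied and then appended to the log.
live = {}
aof_log = []
for cmd in [("SET", "key1", "test1"), ("SET", "key2", "test2"), ("DEL", "key1")]:
    apply(live, cmd)
    aof_log.append(cmd)

# "Restart": the in-memory data is gone, only the log remains.
restored = replay(aof_log)
```

After the replay, `restored` holds the same data as `live`, which is the property the AOF relies on.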

AOF problem

AOF writes the log entry after the command is executed. Why not log first and execute afterwards? Because Redis does not check the syntax of a command before recording it in the AOF. If commands were logged first, an invalid command could end up in the log and cause errors when Redis replays the log to restore data.
Because the log entry is written after the command is executed, the current write operation is not blocked. But there are two risks:

  • If the machine goes down after a command has been executed but before its log entry has been recorded, that command's data is lost.
  • AOF does not block the current command, but it may block the next one, because the log is written by the main thread.
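The execute-then-log ordering and its crash window can be illustrated with a small Python sketch. The class `WriteAfterLogStore` and its `crash_before_log` flag are hypothetical names used only for this example, and the toy parser accepts nothing but `SET key value`.

```python
class WriteAfterLogStore:
    """Toy store that executes a command first and logs it only on success."""

    def __init__(self):
        self.data = {}
        self.log = []

    def execute(self, command, crash_before_log=False):
        parts = command.split()
        # Validation happens during execution, so a bad command never reaches the log.
        if len(parts) != 3 or parts[0].upper() != "SET":
            raise ValueError(f"rejected: {command!r}")
        _, key, value = parts
        self.data[key] = value
        if crash_before_log:
            # Simulated crash: the change is in memory but was never logged.
            return False
        self.log.append(command)
        return True


store = WriteAfterLogStore()
store.execute("SET key1 test1")

try:
    store.execute("SETT key1 oops")  # invalid command
except ValueError:
    pass

store.execute("SET key2 test2", crash_before_log=True)
```

The invalid command is rejected before anything is logged, while `key2` exists in memory but not in the log, so it would be missing after a restart.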

AOF write-back strategy

The way to manage these two risks is to choose among the three write-back strategies of the AOF mechanism, configured with appendfsync.

  • always: synchronous write-back. After each command is executed, the log is written back to disk immediately.
  • everysec: after each command is executed, the log is first written to the AOF memory buffer, and the contents of the buffer are synced to disk once per second.
  • no: the log is only written to the AOF memory buffer, and the operating system decides when to write it to disk.

The always strategy can basically guarantee that no data is lost, at the cost of performance. The no strategy has high performance, but data may be lost. In general, everysec is a reasonable compromise.
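To make the trade-off concrete, here is a rough Python simulation of which writes survive a crash under each strategy. It is a simplified model under stated assumptions: syncs for everysec happen exactly on whole-second ticks, and for no the worst case is shown, where the operating system has not flushed anything yet. The function name `surviving_writes` is made up for this sketch.

```python
import math


def surviving_writes(policy, write_times, crash_time):
    """Return the timestamps of writes that are on disk when the crash happens.

    write_times: timestamps in seconds of each write command, in ascending order.
    """
    if policy == "always":
        synced_until = crash_time                # every write is synced immediately
    elif policy == "everysec":
        synced_until = math.floor(crash_time)    # last whole-second sync tick
    elif policy == "no":
        synced_until = float("-inf")             # worst case: nothing flushed yet
    else:
        raise ValueError(f"unknown policy: {policy}")
    return [t for t in write_times if t <= synced_until]


writes = [0.2, 0.7, 1.1, 1.6]
crash_at = 1.9

always_result = surviving_writes("always", writes, crash_at)
everysec_result = surviving_writes("everysec", writes, crash_at)
no_result = surviving_writes("no", writes, crash_at)
```

In this run always keeps all four writes, everysec loses the two writes made after the last one-second tick, and no may lose everything.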

AOF rewriting mechanism

As more and more commands are received, the AOF file keeps growing, and an overly large file causes performance problems. The answer is the AOF rewrite mechanism. Over time the AOF file accumulates redundant commands, such as commands that were later overwritten or commands for data that has expired. Rewriting replaces them with the minimal set of commands needed to rebuild the current data, which shrinks the file. For example, if a key is set several times, only the last set needs to be kept; if a key is set and then deleted, neither command needs to be kept.
Does AOF rewriting block? The AOF log is written by the main thread, but rewriting is different: it is carried out by a background child process, as with the BGREWRITEAOF command, so the main thread is not blocked while the file is rewritten.
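The effect of a rewrite can be sketched in Python as follows. Note the assumption: Redis builds the new file from the data currently in memory rather than by reading the old log, so the replay step here only stands in for "the current data". The function name `rewrite` and the tuple command format are illustrative.

```python
def rewrite(log):
    """Return a minimal log that rebuilds the same data as the original log."""
    state = {}
    for op, key, *args in log:
        if op == "SET":
            state[key] = args[0]
        elif op == "DEL":
            state.pop(key, None)
    # One SET per key that still exists; overwritten and deleted keys leave no trace.
    return [("SET", key, value) for key, value in state.items()]


old_log = [
    ("SET", "a", "1"),
    ("SET", "a", "2"),   # overwrites the first SET
    ("SET", "b", "x"),
    ("DEL", "b"),        # cancels the SET on b
]
new_log = rewrite(old_log)
```

Four commands shrink to one, and replaying either log produces the same data.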

AOF rewrite trigger timing:

  • Triggered when the BGREWRITEAOF command is executed to rewrite the AOF log
  • Triggered when the AOF log is enabled using the CONFIG SET command
  • Triggered automatically when the AOF log has grown by a certain percentage over its base size (controlled by the parameter auto-aof-rewrite-percentage) and is larger than auto-aof-rewrite-min-size.
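The automatic trigger can be approximated with the Python sketch below. It assumes the commonly shipped defaults of 100 for auto-aof-rewrite-percentage and 64mb for the companion parameter auto-aof-rewrite-min-size, and it treats a percentage of 0 as "disabled". The function name `should_auto_rewrite` is hypothetical and the real check inside Redis may differ in detail.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether the AOF has grown enough since the last rewrite.

    base_size is the AOF size right after the previous rewrite (or at startup).
    """
    if percentage == 0 or current_size <= min_size:
        return False
    base = base_size or 1                      # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage


doubled = should_auto_rewrite(200 * MB, 100 * MB)     # grew by 100%
half_more = should_auto_rewrite(150 * MB, 100 * MB)   # grew by only 50%
too_small = should_auto_rewrite(10 * MB, 1 * MB)      # large growth, tiny file
```

The minimum size keeps small files from being rewritten over and over even when they grow quickly in relative terms.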

AOF Summary

  • Advantages: better data consistency and integrity; with everysec, at most about one second of data is lost.
  • Disadvantages: for the same data set, the AOF file is larger than the RDB file, and data recovery is slower.

RDB

With AOF persistence, Redis recovery can be very slow if the operation log is large.
RDB (Redis DataBase) saves the in-memory data to disk as a snapshot. Unlike AOF, it records the data at a certain moment, not the operations.

What is a snapshot? Think of it as taking a photo of the data at the current moment and saving it.

RDB trigger mechanism

RDB persistence writes a snapshot of the in-memory dataset to disk when a specified number of write operations have occurred within a specified time interval. It is the default persistence method of Redis. When the snapshot completes, a dump.rdb file is generated in the configured directory, and when Redis restarts it restores the data by loading dump.rdb. Besides this automatic trigger, a snapshot can also be triggered manually with the bgsave command, described below.
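The "number of writes within a time interval" rule corresponds to the `save <seconds> <changes>` lines in the Redis configuration file. A minimal Python sketch of how such rules could be evaluated is shown below; the rule values are only examples, and `should_snapshot` is a made-up helper rather than Redis code.

```python
def should_snapshot(rules, seconds_since_last_save, changes_since_last_save):
    """Return True if any (seconds, changes) rule is satisfied."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


# Example rules: 1 change in 900 s, 10 changes in 300 s, 10000 changes in 60 s.
rules = [(900, 1), (300, 10), (60, 10000)]

slow_trickle = should_snapshot(rules, 901, 1)      # one write, long ago
too_early = should_snapshot(rules, 120, 5)         # few writes, short interval
write_burst = should_snapshot(rules, 61, 10000)    # heavy burst of writes
```

Several rules are usually combined so that busy periods are saved quickly while quiet periods are still saved eventually.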

RDB problem

bgsave

Use bgsave to trigger persistence manually. The bgsave command forks a child process to write the snapshot. This avoids blocking the main thread during persistence, but the fork itself blocks the main thread while the child process is being created.

Data modification during bgsave persistence

When persisting, can the data be modified? With the help of the operating system’s copy-on-write technology, Redis can process write operations normally while executing snapshots.
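The fork-based snapshot can be demonstrated with a short Python sketch on a POSIX system (it relies on `os.fork`, so it does not run on Windows). This only imitates the idea and is not how Redis is implemented: the child serializes the data it sees to a pipe, while the parent keeps modifying its own copy. The function name `snapshot_while_writing` is an assumption for this example.

```python
import json
import os


def snapshot_while_writing(data):
    """Fork a child that serializes `data` while the parent keeps writing to it."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: its memory is a frozen view of the parent at fork time.
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "w") as out:
                json.dump(data, out)
        finally:
            os._exit(0)
    # Parent: continues to accept writes while the child is saving.
    os.close(write_fd)
    data["key2"] = "test2"
    with os.fdopen(read_fd) as inp:
        snapshot = json.load(inp)
    os.waitpid(pid, 0)
    return snapshot


data = {"key1": "test1"}
snapshot = snapshot_while_writing(data)
```

The snapshot contains only `key1`, while the parent's data also contains `key2`: the write made during the snapshot succeeded without changing what the child saved.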

RDB Summary

  • Advantages: Compared with AOF, it is faster to restore large data sets, and it is suitable for large-scale data recovery scenarios, such as backup, full replication, etc.
  • Disadvantages: There is no way to achieve real-time persistence/second-level persistence.

How to choose AOF and RDB?

AOF appends every executed write command to the log, so it is relatively less efficient, but it ensures that data is basically not lost. Because it records operations one by one, data recovery is slower.
RDB backs up the current data, so both backup and recovery are faster, but the possibility of data loss is greater.

  • If data must not be lost, use RDB and AOF together. Redis 4.0 began to support hybrid persistence of RDB and AOF: RDB is used for snapshots, and AOF records the writes between two snapshots (preferably with the everysec write-back strategy). This approach preserves performance while ensuring that data is basically not lost.
  • If Redis is used only as a cache and a few minutes of data loss is acceptable, RDB alone is enough and gives the highest performance.

Use the redis client to experiment

RDB Backup and Recovery

# View the storage path of dump.rdb
config get dir

# Execute the backup command
set key1 'test1'
bgsave

# After a moment, the generated dump.rdb file appears in the RDB storage path.
# Once it exists, write another key to test that data written between two snapshots is lost when Redis goes down
set key2 'test2'

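# Force Redis to stop, then start it again. On startup Redis automatically reads dump.rdb to restore data.
# Note: a graceful stop can write a final snapshot if save rules are configured, in which case key2 would survive; to simulate a crash, kill the redis-server process instead.
service redis-server stop
service redis-server restart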

# keys now returns only key1; key2, written after bgsave, was lost when Redis went down
localhost:0>keys "*"
1) "key1"

AOF backup and restore

AOF persistence is turned off by default. To enable AOF, set appendonly yes in the Redis config file, and then restart Redis.

After AOF is enabled, redis will generate an appendonly.aof file in the storage path, which records the operation of redis.

Open the AOF file and look at the contents inside. It is indeed the operation rather than the data.

# Execute the command and observe that the modification time of appendonly.aof changes right away
set key1 'test1'
# Execute another command a minute later and observe that the modification time of appendonly.aof again changes right away. Because the current write-back strategy is 'everysec', commands are written back to the AOF file within about one second
set key2 'test2'

# Rename appendonly.aof to appendonly2.aof, and keep dump.rdb
# Force close redis, then reopen
# A new appendonly.aof is generated in the Redis storage directory. This shows that when AOF is enabled, Redis does not load dump.rdb on startup

# Force close redis
# Rename appendonly2.aof to appendonly.aof
# reopen redis
# With AOF enabled, Redis loads appendonly.aof on restart to restore the data
localhost:0>keys "*"
1) "key2"
2) "key1"

RDB and AOF mixed backup and recovery

Hybrid persistence was introduced in version 4.0, where it is disabled by default; it is controlled by the aof-use-rdb-preamble configuration parameter. From version 5.0 it is enabled by default.
Note that hybrid persistence requires both setting aof-use-rdb-preamble to yes and enabling AOF persistence.

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
# [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
aof-use-rdb-preamble yes

# According to the Redis configuration file, once mixed AOF and RDB persistence is enabled, the rewritten AOF file has the following format:
The first half is in RDB format
The second half is in AOF format

# You can use the BGREWRITEAOF command to manually trigger the rewriting of aof, and you can quickly see the changes in the aof file format.
BGREWRITEAOF

Then open the aof file
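Instead of opening the file in an editor, the format can also be checked from a script. The Python sketch below relies on the fact quoted in the configuration comments above, that a file with an RDB preamble starts with the string "REDIS". It assumes a single appendonly.aof file as used in this article; the helper `has_rdb_preamble` and the two sample files it creates are made up for the demonstration.

```python
import os
import tempfile


def has_rdb_preamble(path):
    """Return True if the file starts with the RDB magic string "REDIS"."""
    with open(path, "rb") as f:
        return f.read(5) == b"REDIS"


# Build two small sample files so the check can be tried without a Redis server.
tmp_dir = tempfile.mkdtemp()
hybrid_path = os.path.join(tmp_dir, "hybrid.aof")
plain_path = os.path.join(tmp_dir, "plain.aof")

with open(hybrid_path, "wb") as f:
    # Fake RDB header followed by a plain-text AOF tail.
    f.write(b"REDIS0009" + b"\x00" * 8 + b"*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n")

with open(plain_path, "wb") as f:
    # A pure AOF file starts directly with a command in the Redis protocol format.
    f.write(b"*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n")

hybrid_detected = has_rdb_preamble(hybrid_path)
plain_detected = has_rdb_preamble(plain_path)
```

Pointing `has_rdb_preamble` at the real appendonly.aof before and after BGREWRITEAOF should show the switch from the plain command format to the RDB preamble, provided aof-use-rdb-preamble is set to yes.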