Three persistence strategies and selection suggestions for Redis

Foreword

Redis is a high-performance, in-memory key-value database that supports three persistence strategies: RDB (snapshot), AOF (append-only file), and a hybrid of the two. Each strategy has its own advantages and disadvantages and needs to be selected and configured for the scenario at hand. This article describes the three strategies and offers suggestions for choosing among them.

RDB (snapshot)

Overview

The RDB persistence strategy saves the data in Redis memory to disk as a binary file at configured intervals. This file is a snapshot that records all of the data in memory at a single moment. RDB persistence can be triggered by the configuration file or by commands. The configuration file can hold several conditions, and a snapshot is taken as soon as any one of them is met, as follows:

# Snapshot if at least 1 write change was made within 900 seconds
save 900 1
# Snapshot if at least 10 write changes were made within 300 seconds
save 300 10
# Snapshot if at least 10000 write changes were made within 60 seconds
save 60 10000
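
To make the "any condition is met" rule concrete, here is a minimal Python sketch of how such rules could be evaluated. The function name should_snapshot and the way the counters are passed in are assumptions made for illustration; this is not Redis source code.

```python
from typing import List, Tuple

# Each rule is (seconds, changes), mirroring "save <seconds> <changes>".
SAVE_RULES: List[Tuple[int, int]] = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save: float,
                    changes_since_last_save: int,
                    rules: List[Tuple[int, int]] = SAVE_RULES) -> bool:
    """Return True when at least one rule is satisfied.

    A rule is satisfied when more than `seconds` have passed since the
    last snapshot and at least `changes` write changes were made.
    """
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


if __name__ == "__main__":
    print(should_snapshot(901, 1))    # one change, 15 minutes elapsed
    print(should_snapshot(100, 50))   # 50 changes, but no rule matches yet
```

The rules are independent of each other, so a busy server is caught by the short-interval rule and a quiet one by the long-interval rule.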

Two commands trigger a snapshot manually:

  • SAVE: not recommended, because it blocks the Redis server process until the RDB file has been created
  • BGSAVE: the parent process forks a child process that generates the RDB file, while the parent continues to handle client commands, so the service of the main process is not affected

Pros and cons

The advantages of the RDB persistence strategy are:

  • An RDB file is a compact binary file with a small footprint that transfers quickly, which makes it suitable for backup and disaster recovery
  • RDB files restore data faster than AOF, because the snapshot is loaded directly instead of replaying commands one by one
  • RDB persistence has little impact on the performance of the Redis server, because most of the work is done by a child process

The disadvantages of the RDB persistence strategy are:

  • An RDB file cannot reflect the data in Redis memory in real time or near real time, because snapshots are triggered periodically. If a failure occurs between snapshots, the writes made since the last snapshot are lost
  • Generating an RDB file can consume extra memory and CPU: the child process is created with fork, memory pages that the main process modifies during the snapshot have to be copied (copy-on-write), and the data is compressed as it is written

AOF (append-only file)

Overview

The AOF persistence strategy records every write command executed by the Redis server in a text file, the append-only file.

AOF offers three flush (fsync) policies that control when buffered commands are forced to disk, so you can pick the policy that suits the scenario.

Because it records every write command, the AOF file keeps growing over time. Eventually it occupies too much disk space and takes too long to replay when restoring data. To solve this problem, Redis provides an AOF rewrite mechanism that compacts the file.

Pros and cons

The advantages of the AOF persistence strategy are:

  • The AOF file records the data in Redis memory in real time or near real time, because it is synced to disk on every write command or once per second. If a failure occurs between syncs some data can still be lost, but far less than with RDB.
  • An AOF file is a text file that can be viewed and edited easily. Its commands are stored in the Redis protocol format and can be replayed through a Redis client.
  • AOF files can be rewritten automatically to remove redundant commands and reduce the file size. The rewrite does not interrupt the normal service of the Redis server and does not lose any data.
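
To illustrate the protocol format mentioned above, the following Python sketch encodes a command as an array of bulk strings and decodes it again. The helper names encode_command and decode_commands are hypothetical, and the decoder assumes a buffer that contains only complete commands.

```python
from typing import List


def encode_command(args: List[str]) -> bytes:
    """Encode one command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def decode_commands(buf: bytes) -> List[List[str]]:
    """Decode a buffer of complete RESP commands into argument lists."""
    commands = []
    pos = 0
    while pos < len(buf):
        if buf[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        line_end = buf.index(b"\r\n", pos)
        argc = int(buf[pos + 1:line_end])
        pos = line_end + 2
        args = []
        for _ in range(argc):
            if buf[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            line_end = buf.index(b"\r\n", pos)
            length = int(buf[pos + 1:line_end])
            start = line_end + 2
            args.append(buf[start:start + length].decode("utf-8"))
            pos = start + length + 2  # skip the payload and its trailing CRLF
        commands.append(args)
    return commands


if __name__ == "__main__":
    raw = encode_command(["set", "a", "1"])
    print(raw)
    print(decode_commands(raw))
```

Each argument is prefixed with its byte length, which is why the file stays readable as text while still being unambiguous to parse.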

The disadvantages of the AOF persistence strategy are:

  • AOF files are usually larger than RDB files and take up more disk space
  • AOF files restore data slower than RDB because all commands need to be re-executed
  • An AOF file can end up inconsistent if a write is interrupted, for example when only half of a command reaches the disk or a wrong command is written. In that case the file has to be repaired with the redis-check-aof tool

AOF flush policy

When Redis restarts, it recovers the data by re-executing the commands in the append-only file. AOF persistence is enabled in the configuration file with appendonly yes, and the appendfsync setting determines how often written commands are flushed from the buffer to disk. There are three options:

  • no: write to the buffer only; the operating system decides when to flush to disk
  • everysec: flush to disk once per second
  • always: flush to disk together with every write to the buffer (as soon as possible rather than strictly in real time)

Here is a comparison of the three strategies:

Policy     Data security     Performance
no         Low               High
everysec   Relatively high   Relatively high
always     High              Low
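
The trade-off in the table can be sketched with a toy append-only log in Python. The AppendLog class and its fsync_calls counter are hypothetical and exist only for illustration; in particular, this sketch checks the one-second deadline inline on each append, which is a simplification and not how Redis schedules its once-per-second sync.

```python
import os
import time


class AppendLog:
    """Toy append-only log illustrating the three appendfsync policies."""

    def __init__(self, path: str, policy: str = "everysec") -> None:
        if policy not in ("no", "everysec", "always"):
            raise ValueError("unknown policy: " + policy)
        self.policy = policy
        self.file = open(path, "ab")
        self.last_sync = time.monotonic()
        self.fsync_calls = 0  # counter kept for illustration only

    def _fsync(self) -> None:
        os.fsync(self.file.fileno())  # force the OS to write to the device
        self.last_sync = time.monotonic()
        self.fsync_calls += 1

    def append(self, record: bytes) -> None:
        self.file.write(record)
        self.file.flush()  # hand the data to the OS page cache
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec" and time.monotonic() - self.last_sync >= 1.0:
            self._fsync()
        # policy "no": the operating system decides when to flush

    def close(self) -> None:
        self.file.close()
```

With always, every append pays for a device sync; with no, an append costs only a buffered write, and whatever is still in the page cache is lost if the machine crashes.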

AOF rewrite

The AOF rewrite mechanism works as follows: Redis creates a new AOF file, generates the write commands needed to reproduce the current in-memory data, and writes them to the new file. The new AOF file therefore contains only the commands that produce the final data, with no invalid or redundant commands. For example:

# Original AOF file
set a 1
set b 2
incr a
del b
set c 3

# The rewritten AOF file
set a 2
set c 3

The listing above compares the file before and after rewriting. Because AOF is appended and replayed sequentially (ES works the same way), set a 1 followed by incr a collapses into set a 2, and set b 2 followed by del b disappears altogether. To make sure that writes arriving during the rewrite are not lost, Redis introduces an AOF rewrite buffer. Once a rewrite has started, every command received from a client is written to the original AOF buffer (and flushed to disk according to the policy described above) and, at the same time, to the AOF rewrite buffer.

[Figure: AOF rewrite process (redis_aof_rewrite.png)]

Once the child process has finished rewriting the AOF file, it sends a signal to the parent process. On receiving the signal, the parent process blocks (it executes no commands during this period) and does two things:

  • Flush the contents of the AOF rewrite buffer to the new AOF file
  • Rename the new AOF file so that it atomically replaces the old one

When both steps are done, the AOF rewrite is complete and the parent process resumes handling commands normally.
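
The rewrite and the role of the rewrite buffer can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it supports only set, incr, and del, it builds the in-memory state by replaying the old log (Redis reads its live memory instead), and the helper names replay and rewrite are hypothetical.

```python
from typing import Dict, List


def replay(commands: List[str]) -> Dict[str, str]:
    """Apply a tiny subset of commands (set, incr, del) to an empty store."""
    data: Dict[str, str] = {}
    for line in commands:
        parts = line.split()
        name = parts[0].lower()
        if name == "set":
            data[parts[1]] = parts[2]
        elif name == "incr":
            data[parts[1]] = str(int(data.get(parts[1], "0")) + 1)
        elif name == "del":
            for key in parts[1:]:
                data.pop(key, None)
        else:
            raise ValueError("unsupported command: " + line)
    return data


def rewrite(state: Dict[str, str], rewrite_buffer: List[str]) -> List[str]:
    """Build the new AOF: one set per live key, then the buffered commands."""
    compacted = ["set %s %s" % (key, value) for key, value in state.items()]
    return compacted + list(rewrite_buffer)


old_aof = ["set a 1", "set b 2", "incr a", "del b", "set c 3"]
snapshot = replay(old_aof)  # state at the moment the rewrite starts
new_aof = rewrite(snapshot, rewrite_buffer=["incr c"])  # "incr c" arrived mid-rewrite
print(new_aof)  # ['set a 2', 'set c 3', 'incr c']
```

Replaying new_aof yields the same data as replaying the old file plus the command that arrived during the rewrite, which is the property the rewrite buffer exists to guarantee.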

An AOF rewrite can be triggered automatically or manually:

  • Automatic trigger: controlled by the following parameters.
# Growth of the AOF file, as a percentage of its size after the last rewrite, that triggers a new rewrite. Default: 100
# With the default, a rewrite is triggered once the file has grown to twice its size after the last rewrite
auto-aof-rewrite-percentage 100
# Minimum AOF file size for an automatic rewrite. Default: 64 MB
# This prevents rewrites of files that meet the percentage but are still very small
auto-aof-rewrite-min-size 64mb
  • Manual trigger: execute the BGREWRITEAOF command.
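
How the two automatic-trigger parameters interact can be sketched as a small Python predicate. The function name should_auto_rewrite and its argument names are assumptions for illustration; the logic follows the description of the two settings rather than quoting Redis source code.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size: int,
                        base_size: int,
                        percentage: int = 100,
                        min_size: int = 64 * MB) -> bool:
    """Decide whether an automatic AOF rewrite is due.

    current_size: size of the AOF file now, in bytes
    base_size:    size of the AOF file after the last rewrite, in bytes
    percentage:   auto-aof-rewrite-percentage (0 disables the feature)
    min_size:     auto-aof-rewrite-min-size, in bytes
    """
    if percentage == 0:
        return False
    if current_size <= min_size:
        return False  # file is still too small to be worth rewriting
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100  # growth in percent
    return growth >= percentage


if __name__ == "__main__":
    print(should_auto_rewrite(128 * MB, 64 * MB))  # doubled since last rewrite
    print(should_auto_rewrite(100 * MB, 64 * MB))  # grew by about 56 percent
```

Both conditions must hold: a file that has doubled from 1 MB to 2 MB meets the percentage but stays below the minimum size, so it is not rewritten.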

Choose the correct persistence strategy

Redis currently has three persistence strategies:

  • AOF
  • RDB
  • AOF mixed with RDB

Each has its own advantages and disadvantages and must be weighed against the application scenario. This section first covers the choice between AOF and RDB and then the hybrid mode.

Selection of AOF and RDB

In Redis, the AOF and RDB persistence methods each have their own advantages and disadvantages. Generally, the choice comes down to the following aspects:

  • Data security: if data must not be lost, AOF is recommended
    • AOF can sync once per second or on every write operation to protect the data
      • With the once-per-second policy, at most about one second of data is lost
      • With the every-write policy, safety is at its maximum, but performance suffers
    • RDB takes full snapshots only at intervals, so the data of the last few minutes may be lost (depending on the configured snapshot rules)
  • Data recovery speed: if fast data recovery is required, RDB is recommended
    • AOF has to re-execute all write commands, so recovery takes longer
    • RDB is a full binary file that only needs to be loaded into memory when restoring
  • Data backup and migration: if convenient backup and migration are required, RDB is recommended
    • AOF files can be large and slow to transfer
    • An RDB file is a compact binary file that takes up little space and transfers quickly
  • Data readability: if you need to view and modify the data easily, AOF is recommended
    • AOF is a readable text file that records all write commands and can be used for disaster recovery or data analysis
    • RDB is a binary file that is not easy to view or modify

       Data security   Data recovery speed   Data backup and migration   Data readability
AOF    High            Low                   Low                         High
RDB    Low             High                  High                        Low

Mixed mode of AOF and RDB

Based on the previous section, a suitable persistence method can be chosen for each scenario and requirement. In practice, however, you do not have to pick just one: AOF and RDB can be used together. AOF then protects against data loss and is the first choice for data recovery, while RDB provides cold backups at various intervals. When the AOF file is lost or damaged, the RDB snapshot can be used to restore the data quickly.

To sum up, the hybrid mode combines the fast restart of RDB with the low data-loss risk of AOF. With the hybrid mode enabled, an AOF rewrite proceeds as follows:

  1. When BGREWRITEAOF is triggered, the child process writes the current data set to the new AOF file in RDB format, as BGSAVE would
  2. Write commands that arrive during the rewrite are appended to the file in AOF format
  3. The new file, which contains both RDB and AOF data, replaces the old AOF file (at this point the first part of the file is RDB and the rest is AOF)

AOF file for mixed mode:

REDIS0008?redis-ver4.0.1?redis-bits around?ctime 聮~`?used-memaof-preamblerepl-id(6c3378899b63bc4ebeaafaa09c27902d514eeb1f?repl-offset? e k1v1 Yi Hip S[zb*2
$6
SELECT
$1
0
*3
$4
sadd
$8
game disk
$4
nioh
*3
$4
sadd
$8
game disk
$4
tomb

If you want to enable mixed mode, configure it in redis.conf:

aof-use-rdb-preamble yes
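
One way to tell whether an existing AOF file was written in mixed mode is to look at its first bytes: an RDB payload starts with the magic string REDIS, as in the sample file above, while a plain AOF starts with a protocol array marker. The Python sketch below assumes a single AOF file; the function name has_rdb_preamble is hypothetical. For Redis versions that split the AOF into several files, the same check would apply to the base file, which is stated here as an assumption.

```python
def has_rdb_preamble(path: str) -> bool:
    """Return True if the file begins with the RDB magic string 'REDIS'."""
    with open(path, "rb") as f:
        return f.read(5) == b"REDIS"
```

A plain AOF begins with a line such as *2, so the first five bytes are enough to distinguish the two layouts.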

When AOF and RDB are used together, a few issues need attention:

  • AOF rewriting and RDB persistence may be triggered at the same time, which would increase memory, CPU, and disk consumption. Redis coordinates the two to avoid this. For details, see the section below (Conflict between AOF rewriting and RDB persistence)
  • AOF files can become very large, leading to insufficient disk space or long recovery times. Redis provides the AOF rewrite mechanism to compact them. For details, see the earlier section (AOF rewrite)
  • AOF files may be damaged or lost, leaving the data unrecoverable. Redis checks the AOF file when loading it to detect whether it is complete. For details, see the section below (AOF verification mechanism)

Conflict between AOF rewriting and RDB persistence

In Redis, an AOF rewrite and an RDB snapshot can be requested at nearly the same time, which raises some potential conflicts. For example:

  • Both AOF rewriting and RDB persistence fork a child process. If two child processes existed at the same time, memory consumption and system load would increase.
  • Both write to disk. Writing two large files at the same time would increase disk pressure and I/O overhead.
  • Both have to report completion to the main process, which would then have to handle two results at once.

To avoid these problems, Redis runs at most one persistence child process at a time:

  • If both are triggered together, only one child process is created. The RDB snapshot runs first, and a BGREWRITEAOF requested while the snapshot is being written is scheduled to start after the snapshot child has finished. Two child processes never exist at the same time.
  • If an AOF rewrite is in progress and an RDB snapshot is requested, the snapshot does not start until the rewrite has completed: automatic snapshots wait, a plain BGSAVE returns an error, and BGSAVE SCHEDULE queues the snapshot for the next opportunity. Two files are never written at the same time.
  • Because only one child runs at a time, the main process handles one completion at a time, in the order in which the jobs ran.

In short, Redis coordinates AOF rewriting and RDB persistence through priority, delay, and ordering to protect data integrity and consistency. The following table gives a brief summary.

Scenario                                          Strategy
AOF rewrite and RDB snapshot triggered together   RDB first, AOF rewrite scheduled after it
AOF rewrite in progress                           AOF first, RDB snapshot waits
Completion of both jobs                           Handled one at a time, in the order they ran
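
The coordination can be pictured as a small state machine. The Python sketch below is a simplified model written for this article, based on the documented behavior of the BGSAVE and BGREWRITEAOF commands (a rewrite requested during a snapshot is scheduled; BGSAVE during a rewrite fails unless the SCHEDULE option is given). The class PersistenceModel and its return strings are hypothetical, and automatic triggers are left out.

```python
from typing import List, Optional


class PersistenceModel:
    """Simplified model: at most one background persistence child at a time."""

    def __init__(self) -> None:
        self.active: Optional[str] = None  # "rdb", "aof", or None
        self.pending: List[str] = []       # jobs waiting for the child to exit

    def bgrewriteaof(self) -> str:
        if self.active is None:
            self.active = "aof"
            return "started"
        if self.active == "aof":
            return "error: rewrite already in progress"
        # A snapshot is running: queue the rewrite to start after it
        if "aof" not in self.pending:
            self.pending.append("aof")
        return "scheduled"

    def bgsave(self, schedule: bool = False) -> str:
        if self.active is None:
            self.active = "rdb"
            return "started"
        if self.active == "aof" and schedule:
            if "rdb" not in self.pending:
                self.pending.append("rdb")
            return "scheduled"
        return "error: another child is running"

    def child_exited(self) -> Optional[str]:
        """Call when the current child finishes; starts the next pending job."""
        self.active = self.pending.pop(0) if self.pending else None
        return self.active


if __name__ == "__main__":
    model = PersistenceModel()
    print(model.bgsave())        # started
    print(model.bgrewriteaof())  # scheduled
    print(model.child_exited())  # aof
```

The single active slot is what rules out two forked children, two concurrent disk writers, and two completions arriving together.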

AOF verification mechanism

The AOF verification mechanism refers to the checks Redis performs on the AOF file when it starts, to determine whether the file is complete and free of damaged data. If the file has a problem that Redis cannot safely work around, Redis refuses to start and reports an error.

A plain AOF file is a sequence of commands in the Redis protocol format and does not carry a checksum of its own. Verification is therefore structural: Redis parses the file command by command while loading it, and any content that does not parse as a complete, well-formed command marks the file as damaged. A CRC64 checksum does exist on the RDB side: an RDB file ends with a 64-bit checksum that is verified on load (controlled by the rdbchecksum setting).

Specifically, the process is as follows:

  • When Redis starts with AOF enabled, it reads the AOF file from the beginning
  • If the file starts with an RDB preamble (hybrid mode), Redis first loads that part with the RDB loader
  • Redis then parses the remaining commands one by one and re-executes them
  • If every command parses correctly, the file is considered intact, and Redis finishes loading the data and starts serving
  • If the file ends in the middle of a command, it is treated as truncated, and the aof-load-truncated setting decides whether Redis loads the valid part or refuses to start
  • If malformed content appears elsewhere in the file, Redis refuses to start with an error similar to Bad file format reading the append only file, and the file has to be repaired with redis-check-aof

In this way, Redis can detect a structurally damaged AOF file at startup and avoid loading broken or incomplete data. This mechanism has some limitations:

  • The check runs only when Redis starts. If the AOF file is modified or damaged while Redis is running, Redis cannot find it in time.
  • The check only detects whether the AOF file is well-formed, not whether its contents are correct. For example, if someone maliciously modifies commands or parameters in the AOF file in a way that still parses, causing logical errors in the data, Redis cannot recognize it.
  • The check adds to the startup time, because the entire AOF file has to be read and replayed. This can be slow if the AOF file is large.

In short, the AOF verification mechanism is a simple and effective way to check the integrity of the AOF file when Redis starts, but it has limitations and costs that need to be weighed in practice.

Suggestions for choosing the three modes

Specific selection suggestions are as follows:

  • If the data integrity requirements are not high, you can use RDB alone, or use AOF with the sync frequency set to once per second
  • If you want to avoid data loss as far as possible, use AOF alone and set it to sync on every write operation
  • If both data integrity and performance matter, use AOF and RDB together and set AOF to sync once per second. This protects the data and still allows RDB to be used for fast recovery
  • If you want to save disk space and speed up data recovery, use RDB alone and tune the snapshot frequency
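
The suggestions above can be condensed into a small helper. The Python sketch below is only one possible reading: the function suggest_config, its two parameters, and the 60-second threshold are illustrative assumptions, not recommendations taken from Redis documentation. The returned keys are real redis.conf directives.

```python
from typing import Dict


def suggest_config(max_loss_seconds: float, fast_restart: bool) -> Dict[str, str]:
    """Map two requirements to redis.conf directives.

    max_loss_seconds: how many seconds of writes may be lost on a crash
    fast_restart:     whether restart time matters

    The 60-second threshold below is an illustrative assumption.
    """
    if max_loss_seconds >= 60:
        # Losing up to minutes of data is acceptable: RDB alone is enough,
        # with the existing save rules left as configured.
        return {"appendonly": "no"}
    fsync = "always" if max_loss_seconds < 1 else "everysec"
    config = {"appendonly": "yes", "appendfsync": fsync}
    if fast_restart:
        config["aof-use-rdb-preamble"] = "yes"  # hybrid mode
    return config


if __name__ == "__main__":
    print(suggest_config(max_loss_seconds=1, fast_restart=True))
```

Treat the output as a starting point for a configuration review rather than a final answer, since disk space and write throughput also enter the decision.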

AOF and RDB each have their own advantages and disadvantages and need to be selected and configured for the specific scenario and requirements. When choosing, consider the following factors:

  • Data Integrity: the risk and acceptable range of data loss
  • Data recovery speed: the time required to restore from persistent files to memory
  • Disk space occupied: the disk space occupied by persistent files
  • Write performance: the impact of persistence operations on the write performance of the Redis server

Note:
When the AOF policy is always or everysec and a BGSAVE or BGREWRITEAOF child is doing heavy disk I/O, the fsync call in the main process may block.
You can set no-appendfsync-on-rewrite yes to alleviate this problem. With that setting, while a child process is saving, Redis offers the same durability as appendfsync no. In the worst case (with default Linux settings), up to 30 seconds of log data can be lost.

Common problems and solutions of persistence strategy

AOF file is too large

An AOF file that is too large occupies disk space, affects write performance, and can make Redis restarts very slow or even fail. Run the BGREWRITEAOF command, or configure auto-aof-rewrite-percentage and auto-aof-rewrite-min-size, to trigger an AOF rewrite that compacts the file into a minimal set of commands

# Growth of the AOF file, as a percentage of its size after the last rewrite, that triggers a new rewrite. Default: 100
# With the default, a rewrite is triggered once the file has grown to twice its size after the last rewrite
auto-aof-rewrite-percentage 100
# Minimum AOF file size for an automatic rewrite. Default: 64 MB
# This prevents rewrites of files that meet the percentage but are still very small
auto-aof-rewrite-min-size 64mb

AOF file is damaged

When the AOF file is damaged, Redis cannot start normally or restore the data. You can repair the file with the redis-check-aof tool (redis-check-aof --fix), or restore the data from a backup RDB file

AOF files may be truncated

While loading the AOF data back into memory at startup, Redis may find that the AOF file is truncated at the end. The aof-load-truncated setting controls what happens next:

  • aof-load-truncated yes: Redis loads the truncated AOF file and writes a log entry about it
  • aof-load-truncated no: the server refuses to start with an error, and the AOF file must be repaired with redis-check-aof before the server is started again

It can be configured in redis.conf:

aof-load-truncated yes
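
To show what "truncated at the end" means in terms of the file format, the following Python sketch scans a buffer of protocol-encoded commands and returns the offset just past the last complete command. It is a conceptual illustration in the spirit of what a repair tool has to find, not the algorithm of redis-check-aof. The function name last_complete_offset is hypothetical, and the sketch assumes a plain AOF without an RDB preamble.

```python
def last_complete_offset(buf: bytes) -> int:
    """Return the byte offset just past the last complete command in buf."""
    pos = 0
    good = 0  # end of the last command that parsed completely
    try:
        while pos < len(buf):
            if buf[pos:pos + 1] != b"*":
                break  # not the start of a command
            end = buf.index(b"\r\n", pos)  # raises ValueError if missing
            argc = int(buf[pos + 1:end])
            pos = end + 2
            for _ in range(argc):
                if buf[pos:pos + 1] != b"$":
                    raise ValueError("missing bulk string header")
                end = buf.index(b"\r\n", pos)
                length = int(buf[pos + 1:end])
                pos = end + 2 + length + 2  # payload plus trailing CRLF
                if pos > len(buf) or buf[pos - 2:pos] != b"\r\n":
                    raise ValueError("truncated argument")
            good = pos
    except ValueError:
        pass  # stop at the first incomplete or malformed command
    return good


if __name__ == "__main__":
    complete = b"*2\r\n$3\r\nget\r\n$1\r\na\r\n"
    damaged = complete + b"*3\r\n$3\r\nset\r\n$1\r\na"
    print(last_complete_offset(damaged), len(complete))
```

Everything before the returned offset can be replayed safely; the bytes after it are the incomplete tail that would be discarded.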

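Timestamps can be recorded to help restore data

Redis can record timestamp annotations in the AOF to support restoring the data to a specific point in time. Recording them changes the AOF format in a way that may be incompatible with existing AOF parsers, so the feature is disabled by default

Configuration in redis.conf: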

aof-timestamp-enabled no

RDB file missing

When the RDB file is lost, Redis cannot recover the data from it. To handle this, restore the data from a backup AOF file or from the RDB file of another node. Increasing the snapshot frequency of RDB also reduces the amount of data at risk

RDB file is damaged

When the RDB file is damaged, Redis cannot recover the data from it. To handle this, use the redis-check-rdb tool to check the RDB file, and restore the data from a backup AOF file or from the RDB file of another node

Summary

This article introduced the three persistence strategies of Redis: RDB, AOF, and the hybrid mode

  • RDB saves periodic binary snapshots. The files are compact and fast to load, which suits backup, migration, and fast recovery, but writes made after the last snapshot can be lost
  • AOF appends every write command to a log. Depending on the flush policy (no, everysec, always) it loses little or no data, but its files are larger, recovery is slower, and it relies on rewriting to keep the file size in check
  • The hybrid mode writes an RDB preamble followed by AOF commands during a rewrite, combining the fast recovery of RDB with the low data-loss risk of AOF

Which strategy to choose depends on the acceptable data loss, the required recovery speed, the available disk space, and the write performance the application needs

Series article directory

Redis memory optimization – Introduction to the String type and detailed explanation of underlying principles
Redis memory optimization – Introduction to the Hash type and detailed explanation of underlying principles
Redis memory optimization – Introduction to the List type and detailed explanation of underlying principles
Redis memory optimization – Introduction to the Set type and detailed explanation of underlying principles
Redis memory optimization – Introduction to the ZSet type and detailed explanation of underlying principles
Redis memory optimization – Introduction to the Stream type and detailed explanation of underlying principles
Redis memory optimization – Detailed explanation of the Hyperloglog, GEO, Bitmap, and Bitfield types
Three persistence strategies and selection suggestions for Redis