Redis persistence methods: RDB and AOF

Introduction to Persistence
- What is persistence
- Why persistence
- Persistent applications
Two persistence schemes for Redis
- Method 1: RDB (Redis DataBase)
- Method 2: AOF (Append Only File)
- Comparison of advantages and disadvantages of the two methods
- The default selection of Redis and the selection in the actual scene

Introduction to Persistence

What is persistence

Transient state
Data that cannot be stored permanently. For example, data in memory is lost on a power failure or server downtime.
Persistent state
Data that can be stored for a long time. For example, data in a database can still be found after a restart, even if the power was cut or the server went down.

Persistence is the mechanism for converting program data between the transient state and the persistent state. In layman’s terms, it means saving transient data (such as data in memory) as persistent data (such as data on disk). Persistence is a matter of time: data that stays unchanged over a period of time is persistent. The data in a database is persistent data, as long as you don’t delete or modify it.
Persistence can be understood at two levels:

Application layer: after the application is closed and restarted, the previous data still exists
System layer: after the system (computer) is shut down and restarted, the previous data still exists

Why persistence

Data can be lost under special circumstances such as a power failure, and persistence is how we guard against that loss.
For example, data in memory is lost once the power is cut, but some objects must never be lost, such as bank account numbers. Unfortunately, nobody can guarantee that memory will never lose power. Memory is also expensive: compared with external storage such as hard disks, tapes, and CDs, its price is 2 to 3 orders of magnitude higher, and its maintenance cost is high too, since at the very least it must stay powered all the time. Therefore, even objects that do not need to be stored permanently cannot always stay in memory, because memory capacity is limited, and they have to be persisted to external storage.

Persistent applications

K8S: When we do a system migration, we need to migrate the original services to K8S, and the MySQL database used by the system must be migrated as well. K8S runs workloads as Pods and manages them automatically: when one Pod goes down, another Pod is started immediately. If the Pod running MySQL goes down and is immediately replaced, will the data stored in the original Pod still exist? Will the new Pod recover the data? The answer is: NO! Without persistent storage, brother, you have really gone from dropping the database to running away! From such a real scenario, we should realize the importance of persistent storage in K8S. It can be said that without persistence technology, K8S would have no prospects for development!
Redis: Redis uses a persistence mechanism to enhance data security. Simply put, it saves the data in memory to the hard disk as a file, so that when Redis restarts or the server fails, the data can be restored from the persisted file.
The rest of this article explains Redis persistence in detail.

Two persistence schemes for Redis

For the specific configuration of RDB and AOF, see “Redis RDB and AOF common configuration parameters”.

Method 1: RDB (Redis DataBase)

RDB is the snapshot mode and the default persistence method of Redis. It saves a snapshot of the database in the binary file dump.rdb; you can think of the rdb file as a backup of the in-memory data.
Because a backup has to be written, file I/O cannot be avoided. If Redis had to process online requests and perform this I/O at the same time, request handling would suffer. Instead, Redis calls the operating system’s fork() to create a child process; the child process writes the backup, and the parent process continues to respond to clients. We can regard RDB as a scheduled task that checks at regular intervals whether the persistence conditions are met.
RDB trigger mechanism

save (manually triggered)
Command: save
The new dump.rdb file replaces the old one.
The save command blocks the Redis server process until the dump.rdb file has been created. During this time, the server cannot process any command requests.
bgsave (asynchronous)
Command: bgsave
bgsave calls fork() to create a child process that performs persistence asynchronously, so client requests are still served. The server blocks only briefly, during the fork() call itself.

A comparison between save and bgsave

Command	save	bgsave
IO type	Synchronous	Asynchronous
Blocking	Yes	Only during fork()
Complexity	O(n)	O(n)
Advantages	No additional memory consumption	Does not block client commands
Disadvantages	Blocks client commands	Consumes additional memory
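
The fork-based idea behind bgsave can be sketched in Python. This is only an illustration under assumptions, not how Redis is implemented: the helper name background_save, the JSON file format, and the file name dump.json are all made up for the example, and os.fork() is available on Unix-like systems only.

```python
import json
import os
import tempfile


def background_save(data, path):
    """Snapshot `data` to `path` in a child process (hypothetical helper).

    Returns the child's PID so the caller can wait for it later.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: it sees the data exactly as it was at fork time.
        status = 1
        try:
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
            # Swap the finished snapshot in for the old one atomically.
            os.replace(tmp_path, path)
            status = 0
        finally:
            # Exit immediately without running the parent's cleanup code.
            os._exit(status)
    return pid


def demo():
    data = {"counter": 1}
    path = os.path.join(tempfile.mkdtemp(), "dump.json")
    pid = background_save(data, path)
    # The parent keeps changing its own copy while the child writes.
    data["counter"] = 2
    os.waitpid(pid, 0)
    with open(path) as f:
        return json.load(f), data
```

In demo(), the change made by the parent after the fork does not appear in the snapshot, because the child works on the memory image it inherited at fork time. That inherited copy is also why the asynchronous variant can consume additional memory.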

automatic trigger
The following is a save configuration. Each rule “save <time> <change>” means: if at least <change> modifications were made within <time> seconds, persistence is triggered automatically.

Configuration	Time (seconds)	Changes
save	900	1
save	300	10
save	60	10000
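
The way these rules are checked can be sketched as a small Python function. This is a simplified model and not the Redis source: the function name should_snapshot and its parameters are assumptions made for illustration.

```python
# Each rule is (seconds, changes), mirroring "save <seconds> <changes>".
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save,
                    rules=SAVE_RULES):
    """Return True if any save rule is satisfied (hypothetical helper)."""
    return any(
        seconds_since_last_save >= seconds
        and changes_since_last_save >= changes
        for seconds, changes in rules
    )
```

With the rules in the table, a single change is enough once 900 seconds have passed, while a busy server with 10000 changes triggers a snapshot after only 60 seconds. If nothing has changed, no rule fires no matter how much time has passed.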

Method 2: AOF (Append Only File)

With AOF, whenever a command that modifies the database is executed, the server writes the command to the appendonly.aof file, which therefore stores every modification command the server has executed. As long as the server re-executes the commands in the .aof file, the data can be restored. This process is vividly called “command replay”.

write mechanism
After Redis receives a modification command from the client, it first validates the command. If there is no problem, the server executes the command in memory and appends it to the .aof file so that it is saved on disk. In this way, even after a sudden outage, you only need to “replay” the commands stored in the .aof file to restore the state from before the outage.
A very important step in this process is the writing of commands, which is an I/O operation. To improve write efficiency, Redis does not write each command directly to the disk. It first places the command in a memory buffer, and the buffered content is then written and synchronized to disk according to the configured synchronization policy (the appendfsync setting: always, everysec, or no).
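
The buffer, flush, and replay steps can be sketched with a toy store in Python. Everything here is an assumption made for illustration: the class TinyAofStore, the two supported commands (SET and DEL), and the one-command-per-line file format are not the format Redis actually uses.

```python
import os
import tempfile


class TinyAofStore:
    """Toy key-value store with an append-only command log (illustrative)."""

    def __init__(self, aof_path):
        self.aof_path = aof_path
        self.data = {}
        self.buffer = []  # commands accepted but not yet written to disk

    def _apply(self, command):
        op, key, *rest = command.split(" ", 2)
        if op == "SET":
            self.data[key] = rest[0]
        elif op == "DEL":
            self.data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {command}")

    def execute(self, command):
        self._apply(command)
        self.buffer.append(command)

    def flush(self):
        """Append buffered commands to the log and force them to disk."""
        with open(self.aof_path, "a") as f:
            for command in self.buffer:
                f.write(command + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.buffer.clear()

    @classmethod
    def replay(cls, aof_path):
        """Rebuild the state by re-executing every logged command in order."""
        store = cls(aof_path)
        if os.path.exists(aof_path):
            with open(aof_path) as f:
                for line in f:
                    store._apply(line.rstrip("\n"))
        return store


def demo():
    path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
    store = TinyAofStore(path)
    store.execute("SET a 1")
    store.execute("SET b 2")
    store.execute("DEL a")
    store.flush()
    store.execute("SET c 3")  # still in the buffer: lost if we crash now
    recovered = TinyAofStore.replay(path)
    return store.data, recovered.data
```

In demo(), the replayed store contains only what was flushed, which illustrates the trade-off: the less often the buffer is synchronized to disk, the faster the writes, and the more recent commands can be lost in a crash.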
rewrite mechanism
As Redis runs for a long time, the aof file grows larger and larger. If the machine goes down and restarts, “replaying” the entire aof file is very time-consuming, leaving Redis unable to serve requests for a long time. The aof file therefore needs to be “slimmed down”. To keep its size within a reasonable range, Redis provides an AOF rewrite mechanism.

For example, suppose the members of the set animals were added by four separate SADD commands. Without a rewrite, the AOF file keeps all four SADD commands. After an AOF rewrite, only the following command is kept in the AOF file:

sadd animals "dog" "tiger" "panda" "lion" "cat"
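
The effect of the rewrite on this example can be sketched in Python. The four commands in history are hypothetical, since the original four are not listed here, and the function rewrite_sadd_log is an illustrative name. Redis itself builds the new file from the data currently in memory rather than by parsing the old log; the sketch replays a log only to stay self-contained.

```python
def rewrite_sadd_log(commands):
    """Collapse a log of SADD commands into one SADD per key (illustrative)."""
    members_by_key = {}
    for command in commands:
        op, key, *members = command.split()
        if op.lower() != "sadd":
            raise ValueError(f"only SADD is handled in this sketch: {command}")
        # A dict keeps insertion order and drops duplicate members.
        seen = members_by_key.setdefault(key, {})
        for member in members:
            seen[member] = None
    return [
        "sadd " + key + " " + " ".join(members)
        for key, members in members_by_key.items()
    ]


# Hypothetical history: these four commands are assumed for the example.
history = [
    'sadd animals "dog"',
    'sadd animals "tiger" "panda"',
    'sadd animals "lion"',
    'sadd animals "cat" "dog"',
]
rewritten = rewrite_sadd_log(history)
```

For this assumed history, rewritten holds the single command shown above: the duplicate "dog" is dropped, and replaying one command rebuilds the same set that the four original commands produced.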

A comparison of the advantages and disadvantages of the two methods

RDB	AOF
An RDB file is a very compact single file that holds the Redis data at a certain point in time, which makes it well suited for backups. You can archive RDB files at chosen points in time so that the data can easily be restored to different versions when needed.	AOF persistence offers several synchronization frequencies. Even with the default frequency of one synchronization per second, Redis loses at most about 1 second of data.
RDB is very suitable for disaster recovery, because a single file can easily be transferred to a remote server.	The AOF file is built by appending Redis commands, so even if only a fragment of a command was written to the AOF file, the file is easy to repair with the redis-check-aof tool.
RDB performance is very good. When persistence is required, the main process forks a child process and hands the persistence work to it, so the main process itself performs no related I/O operations.	The AOF file format is more readable, which gives users more flexible ways to handle it. For example, if we run the FLUSHALL command by mistake, then as long as the file has not yet been rewritten we can manually remove the final FLUSHALL command and use the AOF file to restore the data.

The default selection of Redis and the selection in the actual scene

1. If Redis is only used as a cache server, we don’t need any persistence.

2. In general, we enable both persistence methods. If both a dump.rdb file and an appendonly.aof file exist during data recovery, the data should be restored from appendonly.aof first, which protects the data to the greatest extent.

3. In a master-slave setup, RDB serves as our backup data and is enabled only on the slave node. The snapshot interval can be set a little longer; the single rule (save 900 1) is enough.

4. When AOF is enabled, it inevitably costs I/O performance on top of master-slave synchronization. We can increase the value of auto-aof-rewrite-min-size, for example to 5GB, to reduce how often rewrites happen and therefore the I/O they cause.

5. If AOF is not enabled, the I/O cost is saved. Master-slave replication then relies on RDB persistence alone, but if the master and the slave go down at the same time, the impact is greater.