Redis master-slave synchronization mechanism explained in one article

Main functions

The main functions of Redis master-slave synchronization are as follows:

Data redundancy: Hot backup of data is achieved through master-slave synchronization, which is a data redundancy method in addition to persistence.
Failure recovery: When a problem occurs on the master node, the slave node can continue to provide services to achieve rapid recovery from the failure.
Load balancing: Master-slave synchronization combined with read-write separation lets the master node serve writes while the slave nodes serve reads, spreading the server load. In read-heavy, write-light scenarios, spreading reads across multiple slave nodes can greatly increase the concurrency that Redis can handle. Note: replication is asynchronous, so data on a slave node may lag behind the master node. Reads with strict real-time requirements should go to the master node.

Implementation principle

When the Redis master-slave server first establishes a connection, full replication is performed; after full replication is completed, incremental replication is performed. A slave node can initiate full replication at any time.

There are two main types of Redis master-slave synchronization: full synchronization and incremental synchronization.

Full synchronization

When a new slave node joins the cluster, the first master-slave synchronization is full synchronization.

Process

Redis full synchronization is divided into three stages: establishing a connection and negotiating synchronization, synchronizing and loading the RDB file, and synchronizing and loading new write commands.

Main steps for full synchronization:

1. Establish a connection and negotiate synchronization:
- 1) The replicaof command (slaveof before version 5.0) is executed on the slave node to establish the master-slave relationship with the master node.
- 2) The slave node sends the psync ? -1 command to the master node.
- 3) The master node receives the psync command and responds to the slave node with +FULLRESYNC {runId} {offset}. The slave node records the master node's runId and offset from the response.
2. Synchronize and load RDB files
- 1) The master node executes the bgsave command, forking a child process that reads the in-memory data and generates the corresponding RDB file. Write commands that clients send from this point on are recorded in the replication buffer (replication_buffer) for that slave node.
- 2) Send the generated RDB file to the slave node.
- 3) The slave node receives the RDB file from the master node, stores it on the local disk, clears the data in the local memory, and executes the load command to load the data in the snapshot file into the local memory.
- 4) After the loading is completed, send a confirmation message to the master node to notify the master node that the RDB file is loaded.
3. Synchronize and load the new write command:
- 1) The master node receives the confirmation message from the slave node and sends the write commands recorded in the replication buffer to the slave node.
- 2) The slave node receives the new write commands from the master node's replication buffer and applies them to local memory.

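Note: During full replication the master node keeps accepting writes (they are recorded in the replication buffer, as described above), but the fork can briefly block the main thread, and generating and transferring the RDB file consumes CPU, memory, and network bandwidth. Therefore, when the amount of data is large, full replication may take a long time and affect the performance of the master node.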
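
The three stages above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the Master and Replica classes and their method names are hypothetical, a dict copy stands in for the RDB file, and no network or fork is involved.

```python
# Toy model of full synchronization: snapshot first, then buffered writes.
# Master and Replica are illustrative classes, not Redis APIs.

class Master:
    def __init__(self):
        self.data = {}
        self.replication_buffer = []  # writes received after the snapshot
        self.snapshotting = False

    def write(self, key, value):
        self.data[key] = value
        if self.snapshotting:
            # The master keeps serving writes and buffers them for the replica.
            self.replication_buffer.append((key, value))

    def start_snapshot(self):
        # Stands in for bgsave: a point-in-time copy of the dataset.
        self.snapshotting = True
        return dict(self.data)

    def drain_buffer(self):
        # Hand over the buffered writes and stop buffering.
        pending, self.replication_buffer = self.replication_buffer, []
        self.snapshotting = False
        return pending


class Replica:
    def __init__(self):
        self.data = {"stale": "old"}

    def load_snapshot(self, snapshot):
        # Discard the old data, then load the snapshot.
        self.data = dict(snapshot)

    def apply(self, commands):
        for key, value in commands:
            self.data[key] = value


master, replica = Master(), Replica()
master.write("a", 1)
snapshot = master.start_snapshot()     # stage 2: generate the "RDB"
master.write("b", 2)                   # arrives while the snapshot is in flight
replica.load_snapshot(snapshot)
replica.apply(master.drain_buffer())   # stage 3: replay the buffered writes
print(replica.data == master.data)
```

The write to "b" is not in the snapshot, so the replica only converges with the master after the buffered commands are replayed.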

psync command

psync command format:

psync {runId} {offset}

where:

runId is the run ID of the master node that the slave node is replicating from.
offset is the replication offset up to which the slave node has already replicated data.

psync execution flow:

The slave node sends psync {runId} {offset} to the master node. runId is the run ID of the target master node. On the first replication the slave node does not yet know the master node's runId, so it sends ? instead. offset is the replication offset saved by the slave node. On the first replication it is -1 (indicating full replication).

The master node will return the result based on the runId and offset carried in the slave node request:

If the reply is +FULLRESYNC {runId} {offset}, the slave node triggers the full replication process.
If the reply is +CONTINUE, the slave node triggers incremental replication.
If the reply is an error (-ERR), the master node does not support the psync command, and the slave node falls back to the sync command to perform full replication.
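
A minimal sketch of the slave node's side of this exchange, assuming the replies arrive as plain text lines. The function names build_psync and handle_psync_reply and the returned dictionaries are hypothetical and only mirror the three cases described above.

```python
def build_psync(run_id=None, offset=None):
    """Return the psync line a slave node would send."""
    if run_id is None or offset is None:
        # First replication: the master's run ID is unknown.
        return "PSYNC ? -1"
    return f"PSYNC {run_id} {offset}"


def handle_psync_reply(reply):
    """Map the master's reply to the slave node's next action."""
    parts = reply.strip().split()
    if len(parts) == 3 and parts[0] == "+FULLRESYNC":
        # Remember the master's run ID and offset for later reconnects.
        return {"action": "full", "run_id": parts[1], "offset": int(parts[2])}
    if parts and parts[0] == "+CONTINUE":
        return {"action": "incremental"}
    # Anything else (for example an error reply): fall back to sync.
    return {"action": "fallback_sync"}


print(build_psync())
print(handle_psync_reply("+FULLRESYNC abc123 1000"))
print(handle_psync_reply("+CONTINUE"))
```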

Master-slave node replication offset

1) The master and slave nodes participating in replication will maintain their own replication offsets.
2) After the master node processes the write command, it will make a cumulative record of the byte length of the command. The statistical information is in the master_repl_offset indicator in info replication.
3) The slave node reports its own replication offset (slave_repl_offset) to the master node every second, and the master node saves the slave node’s replication offset.
4) After receiving the command sent by the master node, the slave node will accumulate its own offset. The statistical information is in the slave_repl_offset indicator in info replication.
5) By comparing the replication offsets of the master and slave nodes, you can determine whether the data of the master and slave nodes are consistent.
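
As a sketch of point 5, the offset gap per slave node can be computed from the text returned by info replication on the master node. The sample text below is made-up illustrative data, and replica_gaps is a hypothetical helper name.

```python
# Illustrative sample only; real output contains more fields.
SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=1180,lag=0
master_repl_offset:1200
"""


def replica_gaps(info_text):
    """Return {slave_name: bytes the slave is behind the master}."""
    master_offset = None
    offsets = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key == "master_repl_offset":
            master_offset = int(value)
        elif key.startswith("slave") and "=" in value:
            # Lines such as slave0:ip=...,port=...,offset=...,lag=...
            fields = dict(item.split("=", 1) for item in value.split(","))
            offsets[key] = int(fields["offset"])
    if master_offset is None:
        raise ValueError("master_repl_offset not found")
    return {name: master_offset - off for name, off in offsets.items()}


print(replica_gaps(SAMPLE_INFO))
```

A gap of 0 means the slave node has received everything the master node has written so far. A persistently growing gap suggests the slave node is falling behind.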

Master node replication backlog buffer

1) The replication backlog buffer (repl_backlog_buffer) is a fixed-length first-in-first-out queue stored on the master node, with a default size of 1MB.
2) The replication backlog buffer is created when the slave node is connected. When the master node processes the write command, it will send the write command to the slave node, and at the same time, the write command will also be written to the replication backlog buffer.
3) The replication backlog buffer is mainly used for incremental synchronization, and related information can be viewed through info replication.
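
The fixed-length, first-in-first-out behavior can be sketched as follows. This is a simplified model, not the Redis implementation: ReplBacklog and its methods are hypothetical names, and an offset here means the number of bytes the slave node has already received.

```python
class ReplBacklog:
    """Fixed-size buffer that keeps only the newest `size` bytes."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0  # total bytes ever written

    def feed(self, data: bytes):
        self.buf += data
        self.master_offset += len(data)
        if len(self.buf) > self.size:
            # The oldest bytes fall out of the queue.
            del self.buf[: len(self.buf) - self.size]

    @property
    def first_offset(self):
        # Offset of the oldest byte still held in the buffer.
        return self.master_offset - len(self.buf)

    def read_from(self, offset):
        """Return the bytes after `offset`, or None if they were overwritten."""
        if offset < self.first_offset or offset > self.master_offset:
            return None
        return bytes(self.buf[offset - self.first_offset:])


backlog = ReplBacklog(size=10)
backlog.feed(b"SET a 1;")
backlog.feed(b"SET b 2;")
print(backlog.read_from(8))  # still inside the backlog
print(backlog.read_from(2))  # too old, already overwritten
```

A slave node whose offset has fallen out of the buffer gets None here, which corresponds to the case where incremental synchronization is no longer possible.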

Master node runId

1) When each Redis node starts, it generates a 40-character runId (a random hexadecimal string).
2) The main function of runId is to uniquely identify a running Redis node. If nodes were identified only by IP + port, then after the master node restarts with modified RDB/AOF data, it would be risky for the slave node to keep replicating from its pre-restart offset. Therefore, when the runId changes, the slave node performs full replication (after the master node restarts, the slave node performs full replication by default).
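
To illustrate only the format of such an identifier, the sketch below produces a random 40-character hexadecimal string. It is not how Redis generates its runId internally, and generate_run_id is a hypothetical name.

```python
import secrets


def generate_run_id():
    # 20 random bytes rendered as 40 hexadecimal characters.
    return secrets.token_hex(20)


run_id = generate_run_id()
print(len(run_id))
```

Because the value is regenerated on every start, a restarted node is distinguishable from its previous run even though its IP and port are unchanged.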

Important parameters

repl-timeout: The data replication timeout. The default is 60s. During full synchronization, if the RDB file is very large (for example, more than 6GB) and the transfer takes longer than the time set by repl-timeout, full replication fails. In that case, increase the repl-timeout value appropriately.
client-output-buffer-limit: The client output buffer limit. A value of 0 means no limit. If a non-zero value is set, the connection is closed and its memory released once the threshold is reached.
Configuration rule description:
- client-output-buffer-limit normal 0 0 0: Indicates that for ordinary clients, this parameter limit is turned off.
- client-output-buffer-limit slave 256MB 64MB 60: Indicates that for the slave node client, if the output cache memory usage reaches 256M or exceeds 64M for 60s, the slave node connection will be closed.
- client-output-buffer-limit pubsub 32mb 8mb 60: Indicates that for the Pub/Sub client, if the output cache memory usage reaches 32M or exceeds 8M for 60s, the client connection will be closed.
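
The hard limit / soft limit / soft seconds rule can be sketched as a small check over buffer usage samples. This is a simplified model of the rule as described above. The function name first_disconnect_time and the sample data are hypothetical.

```python
MB = 1024 * 1024


def first_disconnect_time(samples, hard_limit, soft_limit, soft_seconds):
    """samples: time-ordered (timestamp_seconds, buffer_bytes) pairs.

    Return the timestamp at which the connection would be closed,
    or None if it never would be. A limit of 0 disables that check.
    """
    soft_since = None  # when usage first went over the soft limit
    for ts, used in samples:
        if hard_limit and used >= hard_limit:
            return ts  # hard limit: close immediately
        if soft_limit and used >= soft_limit:
            if soft_since is None:
                soft_since = ts
            elif ts - soft_since >= soft_seconds:
                return ts  # over the soft limit for long enough
        else:
            soft_since = None  # usage dropped, reset the timer
    return None


# Limits from the slave rule above: 256MB hard, 64MB soft for 60s.
slow_growth = [(0, 10 * MB), (10, 70 * MB), (40, 80 * MB), (70, 90 * MB)]
print(first_disconnect_time(slow_growth, 256 * MB, 64 * MB, 60))
```

In the sample, usage crosses 64MB at t=10 and is still above it at t=70, so the soft rule fires there even though the 256MB hard limit is never reached.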

Command propagation

After the master-slave replication is completed, every write operation received by the master node will be sent to the slave node through the replication buffer (replication_buffer) to ensure that the data of the master and slave nodes are consistent.

Incremental synchronization

In versions prior to Redis 2.8, when the network connection between the master and slave nodes is disconnected, even if only a small amount of data is not synchronized to the slave node, full replication will be triggered after the slave node reconnects to the master node. Starting from Redis version 2.8, incremental data transmission is supported. When the slave node reconnects to the master node, the master node only needs to send the commands executed during the disconnection to the slave node, without the need for full replication.

Process

Redis master-slave incremental synchronization is divided into two stages: establishing a connection and negotiating synchronization, and synchronizing and loading new write commands.

The main steps of incremental synchronization:

1. Establish a connection and negotiate synchronization:
- 1) The replicaof command is executed on the slave node to establish the master-slave relationship with the master node.
- 2) The slave node sends the psync {runId} {offset} command to the master node.
- 3) The master node receives the psync command and decides between incremental and full replication based on the runId and slave_repl_offset it carries. If the runId sent by the slave node matches the runId of the current master node, and the data after slave_repl_offset is still held in the master node's replication backlog buffer (repl_backlog_buffer), the master node responds with +CONTINUE and incremental replication starts. Otherwise full replication is performed.
2. Synchronize and load the new write command:
- 1) Using the slave_repl_offset carried in the psync command, the master node reads the write commands the slave node missed from the replication backlog buffer (repl_backlog_buffer) and sends them to the slave node through its replication buffer (replication_buffer).
- 2) The slave node receives the new write commands, applies them to local memory, and updates its locally stored offset to the latest offset.

Note: repl_backlog_buffer is consulted only when a slave node reconnects after a disconnection, to decide whether incremental synchronization is possible (that is, whether slave_repl_offset is still within repl_backlog_buffer). The write commands themselves reach the slave node through the replication buffer.
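
The master node's decision in step 1 can be sketched as a single function. This is a simplified model under assumptions: choose_sync and its parameters are hypothetical names, and backlog_first_offset stands for the oldest offset still held in the replication backlog buffer.

```python
def choose_sync(master_run_id, backlog_first_offset, master_offset,
                req_run_id, req_offset):
    """Return the reply the master would send for psync {runId} {offset}."""
    same_master = req_run_id == master_run_id
    # The requested offset must still be covered by the backlog.
    in_backlog = backlog_first_offset <= req_offset <= master_offset
    if same_master and in_backlog:
        return "+CONTINUE"
    return f"+FULLRESYNC {master_run_id} {master_offset}"


# Backlog currently covers offsets 500..1200 on master "m1".
print(choose_sync("m1", 500, 1200, "m1", 900))   # short disconnection
print(choose_sync("m1", 500, 1200, "m1", 100))   # offset already overwritten
print(choose_sync("m1", 500, 1200, "?", -1))     # first replication
```

Only the first call satisfies both conditions. The other two fall through to full replication, matching the rule in step 1.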

The difference between the replication buffer and the replication backlog buffer

Replication buffer (replication_buffer) and replication backlog buffer (repl_backlog_buffer) are concepts related to master-slave synchronization, but their functions and implementation methods are different:

The replication buffer is an output buffer that the master node keeps for each connected slave node. Data to be replicated is written into this buffer first and then sent to that slave node.
The replication backlog buffer is a single fixed-size buffer on the master node that retains the most recent write commands. When a slave node disconnects and later reconnects, the master node uses it to resend the commands the slave node missed, avoiding a full replication.

Both serve the reliability and data integrity of master-slave synchronization. If the network disconnection between the slave node and the master node lasts too long, the commands the slave node needs may be overwritten in the replication backlog buffer by newly written commands. The slave node then cannot perform incremental replication and must perform full replication. To avoid this, increase the size of the replication backlog buffer through the repl-backlog-size parameter. The size of each slave node's replication buffer is bounded by the client-output-buffer-limit parameter.
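
One way to reason about how much to increase the backlog is to multiply the master node's write rate by the longest disconnection to be tolerated. The sketch below is a rule of thumb, not an official formula. The function name and the safety_factor of 2 are assumptions.

```python
MB = 1024 * 1024


def recommended_backlog_bytes(write_bytes_per_sec, max_disconnect_sec,
                              safety_factor=2.0):
    """Estimate a backlog size that survives a disconnection of the given length."""
    return int(write_bytes_per_sec * max_disconnect_sec * safety_factor)


# Assumed workload: 2MB/s of write traffic, tolerate a 60s disconnection.
size = recommended_backlog_bytes(2 * MB, 60)
print(size // MB, "MB")
```

With these assumed numbers the default 1MB backlog would be overwritten in about half a second of write traffic, so any longer disconnection would end in full replication.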

Diskless synchronization

During full replication, the master node by default saves the data to an RDB file on disk and then sends it to the slave node. If disk space on the master node is limited or the disk is relatively slow, this puts considerable pressure on the master node. Since Redis 2.8.18, diskless replication can be used to avoid this: the master node's child process writes the RDB data directly to the slave node's socket, without using the disk as intermediate storage.

Diskless replication is generally used when disk space is limited but the network is in good condition.

Diskless replication parameters:

repl-diskless-sync: Whether to enable diskless replication.
repl-diskless-sync-delay: The default is 5 seconds. Wait for a certain period of time before starting replication. The purpose is to wait for more slave nodes to connect.

Master-slave heartbeat mechanism

After the master-slave node establishes a connection, it will maintain each other’s heartbeat through a long connection.

Master-slave node heartbeat detection mechanism:

1) The master and slave nodes each have a heartbeat detection mechanism, and each acts as a client of the other. Replication-related client information can be viewed with the client list command: the master node's connection shows flags=M and the slave node's connection shows flags=S.
2) The master node sends a ping command to the slave node every 10 seconds by default to determine the survival status of the slave node. The sending frequency can be controlled through the parameter repl-ping-slave-period.
3) The slave node sends the replconf ack {offset} command every 1 second in the main thread to report its current replication offset to the master node. The role of the replconf command:
- It monitors the network status between the master and slave nodes, reports the slave node's replication offset, and checks whether replicated data has been lost. If the slave node has lost data, the missing data is pulled from the master node's replication backlog buffer. The minimum number of active slave nodes and the maximum lag allowed for an active slave node are set through the min-slaves-to-write and min-slaves-max-lag parameters:
  - min-slaves-to-write 2: When fewer than 2 slave nodes are active, the master node rejects write commands.
  - min-slaves-max-lag 10: A slave node counts as active only if its lag is at most 10s. Slave nodes with a larger lag are not counted.
- The master node judges whether a slave node has timed out based on the replconf command, which is reflected in the lag field of the info replication statistics. lag is the number of seconds since the last communication with the slave node. It should normally be 0 or 1. If it exceeds the value of the repl-timeout parameter (default 60s), the slave node is judged to be offline and its replication client connection is closed. If the slave node recovers after being judged offline, heartbeat detection resumes.
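
How the two min-slaves parameters combine can be sketched as a check over the lag values reported for each slave node. This is a simplified model. master_accepts_writes is a hypothetical name, and treating a value of 0 as "check disabled" is how the sketch models an unset parameter.

```python
def master_accepts_writes(replica_lags, min_slaves_to_write, min_slaves_max_lag):
    """replica_lags: lag in seconds for each connected slave node."""
    if min_slaves_to_write == 0 or min_slaves_max_lag == 0:
        return True  # the check is disabled
    # A slave counts as active only if its lag is within the allowed maximum.
    active = sum(1 for lag in replica_lags if lag <= min_slaves_max_lag)
    return active >= min_slaves_to_write


print(master_accepts_writes([0, 1], 2, 10))    # two healthy slaves
print(master_accepts_writes([0, 15], 2, 10))   # one slave lags too far
```

In the second call only one slave node is within the 10s limit, so under these settings the master node would refuse writes until a second slave node catches up.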

Advantages and disadvantages of master-slave synchronization

Advantages of master-slave synchronization:

Data backup and recovery: The slave node can be used as a data backup to ensure rapid recovery when the master node goes down or loses data.
Read and write separation: The slave node can process read requests, reduce the pressure on the master node, and improve the service read and write performance.
High availability: When the master node goes down, the sentinel mechanism automatically promotes a slave node to master so that service continues.
Scalability: By adding new slave nodes, horizontal expansion can be achieved and the cluster processing capacity can be increased.

Disadvantages of master-slave synchronization:

Delay: When the master node processes data, it needs to broadcast data to all slave nodes, and there will be a certain network transmission delay.
Consistency issue: Redis uses an asynchronous replication mechanism. When the master node modifies the data but has not synchronized it to the slave node, if the master node goes down at this time, data inconsistency may occur.