Redis persistence and cluster

Redis provides persistence by saving data to disk, preventing large-scale data loss when the server goes down. But persistence only guarantees that data survives as long as the disk is intact; if the disk is damaged, the data is still lost. Master-slave replication was born to solve this problem.

Master-slave replication means that within a group of Redis instances, one Redis is the master and the others are slaves. The master serves both reads and writes and periodically copies its data to the slave nodes; the slave nodes serve reads only. This way multiple hosts store the same data: if the master node fails, the slaves' data is unaffected and can still serve reads, so read-write separation is achieved and concurrency is improved. But once the master goes down, a slave must be manually promoted to master, so the availability of the system is not high.

Sentinel mode improves the availability of the system: when the master node goes down, a slave node is automatically promoted to master without manual intervention. However, there is still only one master node, which must periodically copy data to every slave; if there are too many slave nodes, the load on the master becomes heavy, so horizontal scalability is still weak.

Redis Cluster solves the problem of poor horizontal scalability. It is a decentralized system with multiple master nodes, each of which has its own slave nodes. The masters handle reads and writes, and written data is distributed across the masters by a hashing algorithm; the slaves simply replicate their master's content. Once a master goes down, one of its slaves is automatically promoted to master.

One: Redis persistence

All redis data lives in memory; on a sudden shutdown, all of it would be lost. Persistence is therefore needed to ensure Redis data is not lost on failure: when redis restarts, it can reload the persisted file to restore the data.

Redis has four persistence methods: aof, aof rewrite, rdb and hybrid persistence. By default, only rdb persistence is enabled.

1 aof (append only file)
The aof log stores the sequence of instructions received by the Redis server, recording only the instructions that modify memory.


Redis restores the in-memory data structures of the instance by replaying the instruction sequence in the aof log.
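The append-and-replay idea can be sketched with a toy key-value store (an illustrative model I made up, not Redis internals): every modifying command is appended to a log file, and on restart the log is replayed in order to rebuild the in-memory state.

```python
import os

class ToyAof:
    """Toy key-value store with an append-only log (illustrative, not real Redis)."""

    def __init__(self, path):
        self.data = {}
        self.log = open(path, "a", encoding="utf-8")

    def set(self, key, value):
        self.data[key] = value
        self.log.write(f"SET {key} {value}\n")  # record the modifying command
        self.log.flush()
        os.fsync(self.log.fileno())             # like "appendfsync always"

    def close(self):
        self.log.close()

def replay(path):
    """Rebuild the in-memory state by re-executing logged commands in order."""
    data = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            op, key, value = line.split()
            if op == "SET":
                data[key] = value
    return data
```

After a "restart", `replay` returns exactly the state that was logged, last write per key winning.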

The disk-flush (fsync) strategies are:

# 1. fsync after every command
# appendfsync always
# 2. fsync once per second
# appendfsync everysec
# 3. leave flushing to the operating system
# appendfsync no

These policies are set in the configuration file (redis.conf).

Disadvantages of this persistence method:
As time goes on, the aof log grows longer and longer. If redis restarts, replaying the entire aof log is very time-consuming, leaving redis unable to serve requests for a long time.

2 aof rewrite

The aof persistence strategy will persist all modification commands; many commands in it can actually be merged or deleted; aof rewrite is to simplify the commands in the aof file.

aof rewrite builds on aof: when a configured threshold is met, Redis forks a child process that serializes the current memory state into a minimal series of redis commands written to a new aof log file; the incremental aof entries generated while the rewrite is running are then appended to the new file, which finally replaces the old aof log file. This achieves the goal of slimming down the aof log.

Note: The prerequisite for enabling aof rewrite is to enable aof.
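Continuing the toy model from above (a sketch of the idea only; real Redis serializes from the forked child's memory image, not by re-reading the old file), a rewrite amounts to replaying the old log into its final state and emitting just one command per key into a fresh, smaller log:

```python
def rewrite_aof(old_path, new_path):
    """Compact a toy SET-only log: keep just the last write for each key."""
    data = {}
    with open(old_path, encoding="utf-8") as f:
        for line in f:                        # replay the old log into memory
            op, key, value = line.split()
            if op == "SET":
                data[key] = value
    with open(new_path, "w", encoding="utf-8") as f:
        for key, value in data.items():       # one command per key suffices
            f.write(f"SET {key} {value}\n")
```

Three logged writes touching two keys compact down to two commands, yet replaying either file yields the same state.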


Configuration method:

# enable aof
appendonly yes
# enable aof rewrite
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# turn off hybrid persistence
aof-use-rdb-preamble no
# turn off rdb
save ""

Strategy:

# 1. Redis records the aof size after the last rewrite; when the file grows by
# this percentage over that base size, a rewrite is triggered
auto-aof-rewrite-percentage 100
# 2. To avoid frequent rewrites while the data set is still small, rule 1 only
# takes effect once the aof file also exceeds 64mb
auto-aof-rewrite-min-size 64mb
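The two rules combine roughly as follows (a sketch mirroring the config semantics; the function name and exact arithmetic are my own simplification):

```python
def should_rewrite(current_size, base_size,
                   percentage=100, min_size=64 * 1024 * 1024):
    """Decide whether an aof rewrite should trigger.

    current_size: current aof file size in bytes
    base_size:    aof size recorded after the last rewrite
    """
    if current_size < min_size:       # auto-aof-rewrite-min-size gate
        return False
    if base_size == 0:                # no previous rewrite recorded
        return True
    growth = (current_size - base_size) * 100 / base_size
    return growth >= percentage       # auto-aof-rewrite-percentage gate
```

So a 100 MB aof with a 64 MB base does not trigger (only ~56% growth), while 130 MB does.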

Disadvantages: aof rewrite slims the log down, but the rewritten aof is still a command log and can remain large, so loading it is still slow.

3 rdb

Given the shortcomings of aof and the rewritten aof file, rdb takes a different approach: snapshot persistence, which is also the default persistence method of Redis. The main process forks a child process, and the child persists the in-memory key-value pairs to the rdb file. rdb stores compressed binary data.


Configuration:

# turn off aof (aof rewrite is disabled along with it)
appendonly no
# turn off aof rewrite
auto-aof-rewrite-percentage 0
# turn off hybrid persistence
aof-use-rdb-preamble no
# to enable rdb, comment out save "" and uncomment the save rules
# save ""
# save 3600 1
# save 300 100
# save 60 10000

Strategy:

# The default strategy of redis is as follows:
# Note: when multiple save rules are configured, satisfying any one of them
# triggers rdb persistence
# at least 1 modification within 3600 seconds
save 3600 1
# at least 100 modifications within 300 seconds
save 300 100
# at least 10000 modifications within 60 seconds
save 60 10000
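The "any rule satisfied" check is easy to express as a small helper (illustrative only; the function name is mine, and Redis evaluates this periodically in its server cron):

```python
def should_snapshot(save_points, elapsed_seconds, changes):
    """True if any (seconds, changes) save rule is satisfied.

    save_points: list of (seconds, changes) pairs from the save directives
    """
    return any(elapsed_seconds >= s and changes >= c
               for s, c in save_points)
```

With the default rules, 150 changes in 301 seconds triggers a snapshot (rule `save 300 100`), while 5 changes in 30 seconds does not.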

Disadvantages: with rdb persistence, once redis goes down, it loses all writes since the last snapshot. RDB also needs to fork a child process frequently to save the data set to disk; when the data set is large, the fork itself is time-consuming and may leave Redis unable to respond to clients for milliseconds, or even around a second if the data set is huge and CPU performance is poor. AOF also needs to fork (for rewrites), but you can tune the rewrite frequency to trade off durability against cost.

4 Hybrid Persistence

From the above we know that the rdb file is small and loads fast but loses more data, while the aof file is large and loads slowly but loses less. Hybrid persistence combines the advantages of both: during an aof rewrite, the bulk of the persisted content is an rdb snapshot, and the data modified during the rewrite is appended to the end of the file in aof format. Hybrid persistence is an optimization built on aof rewrite, so aof rewrite must be enabled first.


Configuration:

# enable aof
appendonly yes
# enable aof rewrite
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# enable hybrid persistence
aof-use-rdb-preamble yes
# turn off rdb
save ""
# save 3600 1
# save 300 100
# save 60 10000

Two: Redis master-slave replication

Master-slave replication is the foundation of Redis clustering; it improves data reliability and enables read-write separation. It is a data replication and synchronization mechanism that copies data from one Redis instance (the master node) to one or more other instances (the slave nodes). The master handles write operations and propagates their results to the slaves; each slave receives the data sent by the master and applies it locally, keeping its data consistent with the master's.

The main purposes of master-slave replication are redundant data backup, read-write separation, and horizontal expansion, preventing permanent data loss when the disk holding the master redis is damaged. Replication between master and slave is asynchronous, and the slave databases are read-only.

# redis.conf
replicaof 127.0.0.1 7002  # point this replica at the target master node

Data synchronization in master-slave replication is divided into full data synchronization and incremental data synchronization:

  • Full data synchronization: the initial stage when master-slave replication is first established. The master sends its complete data set to the slave, ensuring the slave ends up with the same data as the master. Full data synchronization can be achieved in the following ways:

    • Snapshot: the master generates a snapshot file of its current data set and sends it to the slave; on receipt, the slave loads the file and restores the data. Its advantage is efficiency; its disadvantages are higher latency and heavy demands on network bandwidth and storage space.
    • Command Replication (Command Replication): The master node will record the write operation command executed on itself and send it to the slave node. After the slave node receives the commands, it executes these commands in order to restore the data. Its advantages are good real-time performance and low data delay, but its disadvantages are low efficiency, high complexity, and the need to deal with issues such as command concurrency and consistency.

    Which synchronization method to choose is set through the configuration file.

  • Incremental sync: after full data synchronization completes, master and slave switch to incremental synchronization: the master sends its write commands to the slaves, which apply them in real time to stay consistent with the master. Incremental synchronization is asynchronous; the master sends each command to the slaves, but how quickly a slave executes it can be affected by factors such as network delay. If the connection between a slave and the master is lost during incremental synchronization, the slave will try to reconnect and catch up on the missed data.

Flow chart of full data synchronization:


Flow chart of incremental data synchronization:


Server RUN ID: both the master and the slave have their own RUN ID, generated automatically at startup and consisting of 40 random hexadecimal characters. When a slave replicates a master for the first time, the master transmits its RUN ID to the slave, and the slave saves it.

When the slave library disconnects and reconnects to the master library, the slave library will send the previously saved RUN ID to the master library.

  • If the saved RUN ID matches the master's RUN ID, the master the slave was replicating before the disconnect is the current master, and the master attempts an incremental synchronization;
  • If they do not match, the master the slave was replicating before the disconnect is not the current master, and the master performs a full synchronization with the slave.

Replication offset: a value that tracks replication progress, i.e. the data synchronization position between master and slave. Both sides maintain a replication offset. When a slave connects to a master for replication, it sends a replication request carrying its own offset, and the master sends the slave the data after that offset.

The offsets advance as follows:

  • When the master sends N bytes of data to the slave, it adds N to its own replication offset.
  • When the slave receives N bytes of data from the master, it adds N to its own replication offset.

By comparing the master and slave offsets, we can tell whether their data is consistent: equal offsets mean the data is consistent; different offsets mean it is not.

Ring buffer (replication backlog buffer): essentially a fixed-length first-in-first-out queue.
Stored content: when a slave is disconnected from the master for some reason (network jitter or a slave crash), the master keeps accumulating its write operations in this ring buffer so that a full resync can be avoided on reconnection. When the slave reconnects, it sends its replication offset to the master, and the master compares the master and slave offsets:

  • If the slave's offset is still covered by the replication backlog buffer, incremental synchronization is performed;
  • Otherwise, the master library will perform full synchronization on the slave library;
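The reconnect decision above can be modeled with a toy backlog (names, sizes, and the byte-level model are illustrative, not Redis internals):

```python
from collections import deque

class ReplBacklog:
    """Toy fixed-size replication backlog (illustrative model)."""

    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)  # ring: oldest bytes fall out
        self.master_offset = 0             # total bytes ever written

    def feed(self, data: bytes):
        """Master writes data: buffer it and advance the master offset."""
        self.buf.extend(data)
        self.master_offset += len(data)

    def sync_from(self, slave_offset):
        """Return the bytes the slave is missing, or None for a full resync."""
        first_offset = self.master_offset - len(self.buf)
        if slave_offset < first_offset:
            return None                    # fell out of the ring -> full sync
        missing = self.master_offset - slave_offset
        if missing == 0:
            return b""                     # already caught up
        return bytes(list(self.buf)[-missing:])
```

A slave whose offset is still inside the ring gets just the tail bytes; a slave that has fallen too far behind gets `None`, i.e. a full synchronization.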

Three: Redis sentinel mode

Redis sentinel mode mainly solves the high-availability problem of a master-slave deployment: it automatically switches over a downed master node instead of requiring manual promotion.

Sentinel mode is a sentinel system composed of one or more sentinel instances; sentinel and redis are two independent programs. The system can monitor any number of master libraries and the slave libraries belonging to them. When a master goes offline, one of its slaves is automatically promoted to be the new master.

When a client connects to the cluster, it first connects to a sentinel, queries the sentinel for the master's address, and then connects to the master for data interaction. When the master fails, the client asks sentinel for the master's address again, and sentinel returns the latest one. In this way the client switches nodes automatically, without a restart.

The several elections involved in sentinel mode use the leader-election method of the Raft algorithm.

Schematic:


Configuration:

# sentinel.conf
# sentinel only needs to be pointed at the master; the slaves are discovered
# automatically through the master. The final 2 is the quorum.
sentinel monitor mymaster 127.0.0.1 6379 2
# time after which a silent node is judged subjectively offline
sentinel down-after-milliseconds mymaster 30000
# how many slaves may resynchronize with the new master at the same time during
# a failover; smaller means a longer total failover but less network load
sentinel parallel-syncs mymaster 1
# milliseconds allowed for a failover; beyond this the failover is considered
# failed. The default is 3 minutes
sentinel failover-timeout mymaster 180000

Failure detection:
Subjective offline:
Sentinel sends a ping message to every node (other sentinels, the master, and the slaves) once per second and judges whether a node is offline from the replies: a node that does not reply within the time configured by down-after-milliseconds is judged subjectively offline.

Objective offline:
When a sentinel node judges a master as subjectively offline, it asks the other sentinel nodes to confirm whether the master is really down. If enough offline replies arrive (the configured quorum), sentinel marks the master as objectively offline, and the leading sentinel node performs failover on it.
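The subjective-to-objective promotion is essentially a quorum check; the quorum is the last argument of the `sentinel monitor` line. A sketch (function and parameter names are mine):

```python
def is_objectively_down(reports, quorum):
    """Decide objective offline from subjective-offline reports.

    reports: mapping sentinel-id -> True if that sentinel currently sees the
             master as subjectively down
    quorum:  number of agreeing sentinels required (from `sentinel monitor`)
    """
    agreeing = sum(1 for down in reports.values() if down)
    return agreeing >= quorum
```

With three sentinels and a quorum of 2, two subjective-down reports are enough to mark the master objectively down.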

Failover:
After the master is judged objectively offline, a leader sentinel election starts, which requires the support of more than half of the sentinels. Once the leader sentinel is elected, it begins failing over the master node:

  • Elect one of the slave nodes as the new master node
  • Tell the other slave nodes to replicate from the new master node
  • If the failed master reconnects, it becomes a slave of the new master node

How to use:

  1. Connect a sentinel node and get the master node information;
    SENTINEL GET-MASTER-ADDR-BY-NAME
  2. Verify the currently acquired master node.
    ROLE or INFO REPLICATION
  3. On the currently connected sentinel node, add a publish/subscribe ( PUB/SUB ) connection and subscribe to the +switch-master channel.

Shortcomings:
Redis uses asynchronous replication, so when the master node dies, the slaves may not have received all of the replicated writes, and that unsynchronized part is lost. If the master-slave lag is particularly large, the loss may be particularly large too. Sentinel cannot guarantee zero loss, but it can be configured to lose as little as possible:

# The master must have at least one slave replicating normally, otherwise it
# stops accepting writes, sacrificing availability
min-slaves-to-write 1
# Defines what counts as normal replication: if no feedback is received from a
# slave within 10s, that slave's synchronization is considered abnormal
min-slaves-max-lag 10

At the same time, its fatal shortcoming is that it cannot scale horizontally: all write operations must go through the single master node, and adding more Redis nodes only drives the master's load higher.

Four: Redis Cluster

Redis Cluster divides all data into 16384 (2^14) slots, and each redis node is responsible for a portion of the slots. Cluster is a decentralized clustering approach.

As shown in the figure, the cluster consists of three redis nodes, each responsible for part of the cluster's data, and the data each node is responsible for may differ. The three nodes connect to one another, forming a peer-to-peer cluster, and exchange cluster information through a special binary protocol.

When a redis cluster client connects to the cluster, it obtains the cluster's slot configuration. This way, when the client looks up a key, it can locate the target node directly.

To locate the node holding a particular key directly (the key is hashed with crc16 and the result is taken modulo 2^14), the client needs to cache the slot information so it can find the corresponding node quickly and accurately. Because the slot information cached by the client may become inconsistent with the server's, a correction mechanism is needed: when a node replies with -MOVED 3999 127.0.0.1:6479, the client must immediately update its local slot mapping table, achieving verification and adjustment of the slot information.
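The slot computation is CRC16 (the CCITT/XMODEM variant) modulo 16384; if the key contains a {...} hash tag, only the tag's content is hashed, which is how related keys can be pinned to the same slot. A sketch in Python:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0), the variant redis cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of the 16384 cluster slots, honouring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:              # non-empty tag between { and }
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

`{user1000}.following` and `{user1000}.followers` hash to the same slot, so multi-key commands on them can be served by one node.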

In addition, each redis cluster node persists the cluster's configuration information to its configuration file. This requires the configuration file to be writable, and you should avoid editing it by hand.


Data Migration
redis cluster provides a tool, redis-trib, that lets operations staff manually adjust slot allocation. It is developed in the ruby language and implemented on top of native redis cluster commands. In the figure: A is the source node of the migration and B is the target node.


Process:
As shown in the figure above, the unit of migration is the slot, and redis migrates one slot at a time. While a slot is being migrated it is in an intermediate transition state: it is marked migrating on the source node and importing on the target node, indicating that data is flowing from source to target.

The migration tool redis-trib first sets this intermediate state on the source and target nodes, then fetches the key list of the slot from the source (all at once or in batches) and migrates the keys one by one. For each key, the source node executes the dump command to obtain its serialized content; a restore command is then sent to the target node, which deserializes the content, applies it, and replies +OK; on receiving the reply, the source node deletes the key. These steps repeat until every key to be migrated has been moved.

Note: the migration process is synchronous; while a key is being moved, the source node's main thread is blocked until the key is deleted. If a network failure occurs mid-migration, the two nodes remain in the intermediate state; after a restart, redis-trib can resume the migration.

Because redis-trib migrates key by key, a key with a large value will block for longer and affect normal client access.
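The per-key dump → restore → delete sequence can be sketched with plain dicts standing in for the two nodes (json stands in for redis's binary serialization; this is a toy model of the protocol, not the real implementation):

```python
import json

def migrate_keys(source: dict, target: dict, keys):
    """Move keys one by one: DUMP on source, RESTORE on target, then delete."""
    for key in list(keys):
        payload = json.dumps(source[key])   # DUMP: serialize on the source
        target[key] = json.loads(payload)   # RESTORE: apply on the target
        del source[key]                     # source deletes after the +OK reply
```

After migrating, the source no longer holds the keys and the target holds identical values.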

Replication and failover
The nodes in the cluster cluster are divided into master nodes and slave nodes. The master node is used to process slots, while the slave node is used to replicate the master node and continue to process command requests instead of the master node when the master node goes offline.

Failure Detection
Each node in the cluster will periodically send ping messages to other nodes in the cluster. If the node that receives the ping message does not reply to the pong message within the specified time, the node that does not reply to the pong message will be marked as PFAIL (probable fail).

Nodes in the cluster exchange each other's status information through these messages. If, within the cluster, more than half of the master nodes responsible for slots report some master node A as suspected offline, node A is marked as offline (FAIL); the master node that marked A offline broadcasts this message, and the other nodes (including A's slave nodes) also mark node A as FAIL.

Failover
When a slave node finds that its master has entered the FAIL state, it begins failover of the offline master:

  1. The slave node with the most up-to-date data is elected as the new master node;
  2. That slave executes the replicaof no one command, becoming the new master node;
  3. The new master revokes all slot assignments of the offline master and assigns those slots to itself;
  4. The new master broadcasts a pong message to the cluster, letting the other nodes know immediately that it has changed from a slave into a master and has taken over the previously offline master;
  5. The new master begins accepting command requests for the slots it is responsible for; the failover is complete.

Cluster configuration method:
Create a folder:

# create 6 folders
mkdir -p 7001 7002 7003 7004 7005 7006
cd 7001
vi 7001.conf
# The content in 7001.conf is as follows

The configuration method of a single Redis in the cluster:

pidfile "/home/chengjun/redis-data/7001/7001.pid"
logfile "/home/chengjun/redis-data/7001/7001.log"
dir /home/chengjun/redis-data/7001/
port 7001
daemonize yes
cluster-enabled yes
cluster-config-file nodes-7001.conf
cluster-node-timeout 15000

Copy the configuration to multiple copies, and start multiple Redis processes.

Create a cluster manually:

# make the nodes meet each other
cluster meet ip port
# allocate slots
cluster addslots <slot>
# assign master and slave roles
cluster replicate <node-id>

Intelligent cluster creation:

redis-cli --cluster help
# the value after --cluster-replicas is how many slave databases each master gets
redis-cli --cluster create host1:port1 ... hostN:portN --cluster-replicas <arg>
redis-cli --cluster create 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006 --cluster-replicas 1
