7 approaches to using Redis distributed locks correctly

Foreword

In day-to-day development, business scenarios such as flash sales and grabbing red envelopes require distributed locks, and Redis is well suited to implementing them. This article walks through seven solutions to discuss how to use Redis distributed locks correctly. If you spot a mistake, please point it out so we can learn and improve together.

  • What is a distributed lock

  • Solution 1: SETNX + EXPIRE

  • Solution 2: SETNX + value is (system time + expiration time)

  • Solution 3: Use Lua script (including SETNX + EXPIRE two instructions)

  • Solution 4: SET extended command (SET EX PX NX)

  • Solution 5: SET EX PX NX + verify the unique random value, and then release the lock

  • Solution 6: Open source framework: Redisson

  • Solution 7: Redlock, a distributed lock implemented across multiple machines

What is a distributed lock

A distributed lock is a lock implementation that controls how different processes in a distributed system access shared resources. When different systems, or different hosts of the same system, share a critical resource, mutual exclusion is usually needed to prevent interference and keep the data consistent.

Let’s first look at the characteristics of a reliable distributed lock:

  • Mutual exclusion: Only one client can hold the lock at any time.
  • Lock timeout release: The lock is released automatically after a timeout, which prevents deadlock and wasted resources.
  • Reentrancy: A thread that already holds the lock can acquire it again.
  • High performance and high availability: Locking and unlocking should cost as little as possible, and the lock service must stay highly available so that the distributed lock does not fail.
  • Security: The lock can only be deleted by the client that holds it, never by other clients.

Redis distributed lock solution 1: SETNX + EXPIRE

When Redis distributed locks come up, many people immediately think of the setnx + expire commands: first grab the lock with setnx, then use expire to give the lock an expiration time so that it is still released if the holder forgets to release it.

SETNX is short for SET if Not eXists. The usual command format is SETNX key value. If the key does not exist, SETNX sets it and returns 1; if the key already exists, it does nothing and returns 0.

Suppose a product on an e-commerce website is in a flash sale. The key can be set to key_resource_id, and the value can be anything. The pseudocode is as follows:

if (jedis.setnx(key_resource_id, lock_value) == 1) { // acquire the lock
    jedis.expire(key_resource_id, 100); // set the expiration time (seconds)
    try {
        // do something: business request
    } finally {
        jedis.del(key_resource_id); // release the lock
    }
}

However, setnx and expire are two separate commands, so together they are not an atomic operation. If the process crashes or is restarted for maintenance after setnx succeeds but before expire runs, the lock becomes "immortal" and no other thread can ever acquire it.
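
To make the gap between the two commands concrete, here is a minimal sketch that replaces Redis with a small in-memory stand-in. The classes FakeRedis and SetnxExpireGap and all their methods are hypothetical, written only for this illustration; they mimic the return values of SETNX, EXPIRE and TTL and are not the Jedis API. The stand-in only records the time to live and does not count it down.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for Redis; only mimics SETNX, EXPIRE and TTL.
class FakeRedis {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> ttlSeconds = new HashMap<>();

    // Like SETNX: returns 1 if the key was set, 0 if it already existed.
    long setnx(String key, String value) {
        if (values.containsKey(key)) return 0;
        values.put(key, value);
        return 1;
    }

    // Like EXPIRE: records a time to live for an existing key.
    long expire(String key, long seconds) {
        if (!values.containsKey(key)) return 0;
        ttlSeconds.put(key, seconds);
        return 1;
    }

    // Like TTL: -2 if the key is missing, -1 if it has no expiration.
    long ttl(String key) {
        if (!values.containsKey(key)) return -2;
        return ttlSeconds.getOrDefault(key, -1L);
    }
}

class SetnxExpireGap {
    // Client A takes the lock; if crashBeforeExpire is true it "dies" before EXPIRE.
    static void clientA(FakeRedis redis, boolean crashBeforeExpire) {
        if (redis.setnx("key_resource_id", "a") == 1) {
            if (crashBeforeExpire) return; // simulated crash: EXPIRE never runs
            redis.expire("key_resource_id", 100);
        }
    }

    public static void main(String[] args) {
        FakeRedis redis = new FakeRedis();
        clientA(redis, true);
        System.out.println(redis.ttl("key_resource_id"));             // -1: the key never expires
        System.out.println(redis.setnx("key_resource_id", "b") == 1); // false: client B is locked out
    }
}
```

With the simulated crash the key is left with no expiration, so every later SETNX on the same key returns 0.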

Redis distributed lock solution 2: SETNX + value is (system time + expiration time)

To address the case in solution 1 where the lock can never be released, some people suggest storing the expiration time in the value passed to setnx. If locking fails, read the value back and check whether the lock has expired. The locking code is as follows:

// The method name and signature are illustrative
public boolean tryLock(Jedis jedis, String key_resource_id, long expireTime) {
    long expires = System.currentTimeMillis() + expireTime; // system time + configured expiration time
    String expiresStr = String.valueOf(expires);

    // If the lock does not exist yet, locking succeeds
    if (jedis.setnx(key_resource_id, expiresStr) == 1) {
        return true;
    }
    // The lock already exists, so read its expiration time
    String currentValueStr = jedis.get(key_resource_id);

    // If the stored expiration time is earlier than the current system time, the lock has expired
    if (currentValueStr != null && Long.parseLong(currentValueStr) < System.currentTimeMillis()) {

        // The lock has expired: write our expiration time and get the previous one back in one step (see the Redis GETSET command)
        String oldValueStr = jedis.getSet(key_resource_id, expiresStr);

        if (oldValueStr != null && oldValueStr.equals(currentValueStr)) {
            // Under concurrency, only the thread whose GETSET returned the value it read earlier gets the lock
            return true;
        }
    }

    // In all other cases, locking fails
    return false;
}

The advantage of this solution is that it does away with the separate expire call by storing the expiration time in the value passed to setnx, which solves the problem in solution 1 where the lock cannot be released after a crash. But it has other disadvantages:

  • The expiration time is generated by the client itself (System.currentTimeMillis() is the client's current system time), so in a distributed environment the clocks of all clients must be synchronized.
  • If several clients request the lock at the moment it expires, they all execute jedis.getSet(). Only one client ends up holding the lock, but its expiration time may be overwritten by the other clients.
  • The lock does not store a unique identifier for its holder, so it may be released by other clients.
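
The second disadvantage is easier to see when the race is replayed step by step. The sketch below is only an illustration under stated assumptions: a plain HashMap stands in for Redis, Map.put plays the role of GETSET because it also returns the previous value, and the class name GetSetRace and the timestamps are made up.

```java
import java.util.HashMap;
import java.util.Map;

class GetSetRace {
    // Replays the race on a plain map; Map.put returns the previous value, like GETSET.
    static String simulate() {
        Map<String, String> store = new HashMap<>(); // hypothetical stand-in for Redis
        String key = "key_resource_id";
        store.put(key, "1000"); // an old lock whose expiration time has already passed

        // Both clients read the same expired value before either one calls GETSET.
        String seenByA = store.get(key);
        String seenByB = store.get(key);

        String oldForA = store.put(key, "5000"); // client A writes its expiration time
        String oldForB = store.put(key, "5003"); // client B writes a slightly later one

        boolean aLocked = seenByA.equals(oldForA); // true: A got back the value it read
        boolean bLocked = seenByB.equals(oldForB); // false: B got back A's value
        return "A=" + aLocked + " B=" + bLocked + " stored=" + store.get(key);
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // A=true B=false stored=5003
    }
}
```

Client A holds the lock, yet the stored expiration time is the one client B wrote, which is exactly the overwrite described in the list.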

Redis distributed lock solution 3: use a Lua script (containing the two commands SETNX + EXPIRE)

We can also use a Lua script to make the two commands setnx and expire execute atomically. The Lua script is as follows:

if redis.call('setnx',KEYS[1],ARGV[1]) == 1 then
   redis.call('expire',KEYS[1],ARGV[2])
   return 1
else
   return 0
end;

The lock code is as follows:

import java.util.Arrays;
import java.util.Collections;

// ARGV[1] is the lock value, ARGV[2] is the expiration time in seconds
String lua_scripts = "if redis.call('setnx',KEYS[1],ARGV[1]) == 1 then " +
            "redis.call('expire',KEYS[1],ARGV[2]) return 1 else return 0 end";
Object result = jedis.eval(lua_scripts, Collections.singletonList(key_resource_id), Arrays.asList(lock_value, "100"));
// Check whether locking succeeded
return result.equals(1L);

This solution still has shortcomings. Try to work out what they are yourself, and think about how it compares with solution 2.

Redis distributed lock solution 4: SET extended command (SET EX PX NX)

Besides using a Lua script to make SETNX + EXPIRE atomic, we can also use the extended options of the Redis SET command (SET key value [EX seconds] [PX milliseconds] [NX|XX]), which is atomic as well.

SET key value [EX seconds] [PX milliseconds] [NX|XX]

  • NX: Set the key only if it does not exist. This guarantees that only the first client request obtains the lock; other clients can obtain it only after the lock is released or expires.
  • EX seconds: Set the expiration time of the key in seconds.
  • PX milliseconds: Set the expiration time of the key in milliseconds.
  • XX: Set the value only if the key already exists.
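
The combined effect of NX and EX described above can be modelled in a few lines. This is a sketch under stated assumptions, not Redis itself: the class SetNxExModel, its manual clock and its method names are hypothetical, and only the "OK"/null results follow what SET with NX returns.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of SET key value NX EX seconds, driven by a manual clock.
class SetNxExModel {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    private long nowSeconds = 0;

    // Moves the manual clock forward.
    void advance(long seconds) {
        nowSeconds += seconds;
    }

    // Returns "OK" if the key was absent or already expired, otherwise null.
    // Setting the value and its deadline happens in a single call, never in two steps.
    String setNxEx(String key, String value, long seconds) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= nowSeconds) { // drop an expired key lazily
            values.remove(key);
            expiresAt.remove(key);
        }
        if (values.containsKey(key)) return null;
        values.put(key, value);
        expiresAt.put(key, nowSeconds + seconds);
        return "OK";
    }

    public static void main(String[] args) {
        SetNxExModel redis = new SetNxExModel();
        System.out.println(redis.setNxEx("key_resource_id", "a", 100)); // OK
        System.out.println(redis.setNxEx("key_resource_id", "b", 100)); // null
        redis.advance(100);
        System.out.println(redis.setNxEx("key_resource_id", "b", 100)); // OK
    }
}
```

Because the value and the expiration are written together, there is no window in which the key exists without a deadline, unlike solution 1.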

The pseudocode demo is as follows:

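// Jedis 2.x signature: returns "OK" if the key was set, null otherwise
if ("OK".equals(jedis.set(key_resource_id, lock_value, "NX", "EX", 100))) { // acquire the lock, expires after 100 seconds
    try {
        // do something: business processing
    } finally {
        jedis.del(key_resource_id); // release the lock
    }
}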

However, this solution may still have problems:

  • Problem 1: The lock expires and is released before the business logic has finished. Suppose thread a acquires the lock and starts executing the critical section, but has not finished after 100s. The lock has expired by then, so when thread b requests it, thread b acquires the lock and starts executing the critical section too. The business code in the critical section is therefore no longer executed strictly serially.
  • Problem 2: The lock is deleted by mistake by another thread. Suppose thread a finishes and releases the lock, unaware that the lock may now be held by thread b (the lock expired before thread a released it, and thread b acquired it in the meantime). Thread a then releases thread b's lock, even though thread b may not have finished its critical section.
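
Both problems can be replayed on a timeline. The following is a minimal sketch, assuming an in-memory store with a manual clock in place of Redis; ExpiringLockStore, LockTimeline and the chosen timestamps are hypothetical and exist only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory lock store with a manual clock, used only to replay the timeline.
class ExpiringLockStore {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    long nowSeconds = 0;

    private void dropIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= nowSeconds) {
            values.remove(key);
            expiresAt.remove(key);
        }
    }

    // Models SET key value NX EX seconds; true means the lock was acquired.
    boolean setNxEx(String key, String value, long seconds) {
        dropIfExpired(key);
        if (values.containsKey(key)) return false;
        values.put(key, value);
        expiresAt.put(key, nowSeconds + seconds);
        return true;
    }

    String get(String key) {
        dropIfExpired(key);
        return values.get(key);
    }

    // Models an unconditional DEL.
    void del(String key) {
        values.remove(key);
        expiresAt.remove(key);
    }
}

class LockTimeline {
    static String replay() {
        ExpiringLockStore redis = new ExpiringLockStore();
        String key = "key_resource_id";

        boolean aLocked = redis.setNxEx(key, "a", 100); // t=0: thread a locks for 100s
        redis.nowSeconds = 101;                         // a is still working, but the lock has expired
        boolean bLocked = redis.setNxEx(key, "b", 100); // problem 1: b enters the critical section too
        redis.nowSeconds = 120;                         // a finishes its work
        String holderBeforeDel = redis.get(key);        // the lock now belongs to b
        redis.del(key);                                 // problem 2: a's DEL removes b's lock
        boolean cLocked = redis.setNxEx(key, "c", 100); // a third thread gets in while b is still running
        return aLocked + " " + bLocked + " " + holderBeforeDel + " " + cLocked;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // true true b true
    }
}
```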

Redis distributed lock solution 5: SET EX PX NX + verify the unique random value, then release the lock

Since the lock may be deleted by mistake by another thread, we set the value to a unique random string that identifies the current thread, and check it before deleting. The pseudocode is as follows:

// uni_request_id is a unique random value per request (a UUID would be one option)
if ("OK".equals(jedis.set(key_resource_id, uni_request_id, "NX", "EX", 100))) { // acquire the lock
    try {
        // do something: business processing
    } finally {
        // Release the lock only if it was added by the current thread
        if (uni_request_id.equals(jedis.get(key_resource_id))) {
            jedis.del(key_resource_id); // release the lock
        }
    }
}

Here, checking whether the lock belongs to the current thread and releasing it are two separate steps, not an atomic operation. Between the check and the call to jedis.del(), the lock may expire and be acquired by another client, so jedis.del() would release a lock that now belongs to someone else.

To be more rigorous, a Lua script is generally used instead. The Lua script is as follows:

if redis.call('get',KEYS[1]) == ARGV[1] then
   return redis.call('del',KEYS[1])
else
   return 0
end;
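
As an in-process analogy for what the script above does, the sketch below uses ConcurrentMap.remove(key, value), which compares and removes in a single atomic step. It is only an illustration: the map stands in for Redis, and the class CompareAndDelete and the request ids are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class CompareAndDelete {
    // Deletes the key only if it still holds requestId; returns 1 if deleted, 0 otherwise,
    // mirroring the return values of the Lua script.
    static long release(ConcurrentMap<String, String> locks, String key, String requestId) {
        // remove(key, value) compares and removes in one atomic step
        return locks.remove(key, requestId) ? 1 : 0;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> locks = new ConcurrentHashMap<>(); // stand-in for Redis
        locks.put("key_resource_id", "request-b"); // the lock now belongs to client b

        System.out.println(release(locks, "key_resource_id", "request-a")); // 0: not a's lock
        System.out.println(release(locks, "key_resource_id", "request-b")); // 1: b releases its own lock
    }
}
```

Since the comparison and the removal cannot be separated, a client can never delete a lock that changed hands between its check and its delete.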

Redis distributed lock solution 6: open source framework Redisson

Solution 5 still has the problem that the lock may expire and be released before the business logic completes. Some people think it is enough to set a slightly longer expiration time. A better idea is to start a scheduled daemon thread for the thread that acquires the lock. It checks at intervals whether the lock still exists and, if so, extends its expiration time so that the lock is not released early.

The open source framework Redisson solves this problem. Let's look at how Redisson works under the hood:

Once a thread acquires the lock, a watchdog is started. It is a background thread that checks every 10 seconds; if thread 1 still holds the lock, it keeps extending the lifetime of the lock key. This is how Redisson solves the problem of the lock expiring and being released before the business logic completes.
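
The watchdog idea can be sketched with the JDK scheduler. This is not Redisson's implementation: LeasedLock and WatchdogSketch are hypothetical names, the lock is a plain in-memory object, and the 30-second lease is an assumption made for the example. Only the 10-second check interval comes from the text.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical lock record: who holds it and when it expires (milliseconds).
class LeasedLock {
    String owner;
    long expiresAtMillis;
}

class WatchdogSketch {
    static final long LEASE_MILLIS = 30_000; // assumed lock lifetime
    static final long CHECK_MILLIS = 10_000; // check interval from the text

    // One watchdog tick: extend the lease only if ownerId still holds an unexpired lock.
    static boolean renewIfHeld(LeasedLock lock, String ownerId, long nowMillis) {
        synchronized (lock) {
            if (ownerId.equals(lock.owner) && lock.expiresAtMillis > nowMillis) {
                lock.expiresAtMillis = nowMillis + LEASE_MILLIS;
                return true;
            }
            return false;
        }
    }

    // Starts a daemon thread that runs the tick periodically and stops after the first failed renewal.
    static ScheduledExecutorService start(LeasedLock lock, String ownerId) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "lock-watchdog");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleAtFixedRate(() -> {
            if (!renewIfHeld(lock, ownerId, System.currentTimeMillis())) {
                timer.shutdown();
            }
        }, CHECK_MILLIS, CHECK_MILLIS, TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

In this sketch the lock holder would call start(...) right after acquiring the lock and shutdownNow() on the returned scheduler when it releases the lock, so a finished task stops renewing.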

Redis distributed lock solution 7: Redlock + Redisson, a distributed lock implemented across multiple machines

The previous six solutions were discussed only for a standalone Redis instance, so they are not perfect yet. In practice, Redis is generally deployed as a cluster:

Suppose thread 1 acquires the lock on the Redis master node, but the lock key has not yet been synchronized to the slave node. If the master node fails at this point, a slave node is promoted to master. Thread 2 can then acquire the lock for the same key even though thread 1 already holds it, so the safety of the lock is lost.
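
The failover scenario can be replayed with two maps. This is a deliberately simplified sketch: the maps stand in for the master and slave nodes, putIfAbsent plays the role of SET with NX, and the class FailoverSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class FailoverSketch {
    // Returns true if both threads end up believing they hold the same lock.
    static boolean bothHoldLock() {
        Map<String, String> master = new HashMap<>();
        Map<String, String> slave = new HashMap<>(); // would receive the key later, asynchronously

        // Thread 1 locks on the master.
        boolean thread1Locked = master.putIfAbsent("key_resource_id", "thread-1") == null;

        // The master fails before the key is copied, so the slave is promoted while still empty.
        Map<String, String> newMaster = slave;

        // Thread 2 now locks the same key on the promoted node.
        boolean thread2Locked = newMaster.putIfAbsent("key_resource_id", "thread-2") == null;
        return thread1Locked && thread2Locked;
    }

    public static void main(String[] args) {
        System.out.println(bothHoldLock()); // true: mutual exclusion is broken
    }
}
```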

To solve this problem, Redis author antirez proposed an advanced distributed lock algorithm: Redlock. Its core idea is as follows:

Deploy multiple Redis masters so that they do not all go down at the same time. These master nodes are completely independent of each other, with no data synchronization between them. On each master instance, acquire and release the lock in the same way as on a single Redis instance.

We assume that there are currently 5 Redis master nodes, running on 5 servers.

The Redlock implementation steps are as follows:

  • 1. Get the current time in milliseconds.
  • 2. Request the lock from the five master nodes in sequence. The client sets a connection and response timeout that is less than the lock expiration time. (If the lock expires automatically after 10 seconds, the timeout is generally between 5 and 50 milliseconds; let's assume 50ms.) If a request times out, skip that master node and try the next one as soon as possible.
  • 3. The client subtracts the start time recorded in step 1 from the current time to get the time spent acquiring the lock. The lock is considered acquired if and only if a majority of the Redis master nodes (N/2 + 1; here 5/2 + 1 = 3 nodes) granted it and the time spent is less than the lock expiration time. (For example, 10s > 30ms + 40ms + 50ms + 40ms + 50ms.)
  • 4. If the lock is acquired, the real validity time of the key changes: the time spent acquiring the lock must be subtracted from it.
  • 5. If acquiring the lock fails (fewer than N/2 + 1 master instances granted it, or the time spent exceeded the validity time), the client must unlock all master nodes, including those where locking did not appear to succeed, so that no lock is left behind.

The simplified steps are:

  • Request the lock from the 5 master nodes in sequence.
  • Skip a master node if the request exceeds the configured timeout.
  • If at least three nodes are locked successfully and the time spent is less than the validity time of the lock, the lock is acquired.
  • If acquiring the lock fails, unlock all nodes.
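
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Redisson implementation: the Node interface, MemoryNode and the fake clock are hypothetical, the in-memory nodes do not model key expiration, and the latencies are example values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

class RedlockSketch {
    // Hypothetical view of one independent master node.
    interface Node {
        boolean tryLock(String key, String value, long ttlMillis); // false on failure or timeout
        void unlock(String key, String value);
    }

    // Returns the remaining validity time in milliseconds, or -1 if locking failed.
    static long lock(List<Node> nodes, String key, String value, long ttlMillis, LongSupplier clock) {
        long start = clock.getAsLong();                  // step 1: record the start time
        int acquired = 0;
        for (Node node : nodes) {                        // step 2: ask every master in sequence
            if (node.tryLock(key, value, ttlMillis)) acquired++;
        }
        long elapsed = clock.getAsLong() - start;        // step 3: time spent acquiring
        int quorum = nodes.size() / 2 + 1;
        if (acquired >= quorum && elapsed < ttlMillis) {
            return ttlMillis - elapsed;                  // step 4: real validity time
        }
        for (Node node : nodes) node.unlock(key, value); // step 5: unlock everywhere on failure
        return -1;
    }

    // In-memory node for the demo: each request advances a shared fake clock by its latency.
    static class MemoryNode implements Node {
        private final AtomicLong clock;
        private final long latencyMillis;
        private String holder;

        MemoryNode(AtomicLong clock, long latencyMillis) {
            this.clock = clock;
            this.latencyMillis = latencyMillis;
        }

        public boolean tryLock(String key, String value, long ttlMillis) {
            clock.addAndGet(latencyMillis);
            if (holder != null) return false;
            holder = value;
            return true;
        }

        public void unlock(String key, String value) {
            if (value.equals(holder)) holder = null;
        }
    }

    // Five masters answering in 30, 40, 50, 40 and 50 ms; the lock expires after 10 s.
    static long[] demo() {
        AtomicLong clock = new AtomicLong();
        List<Node> nodes = new ArrayList<>();
        for (long latency : new long[] {30, 40, 50, 40, 50}) {
            nodes.add(new MemoryNode(clock, latency));
        }
        long first = lock(nodes, "key_resource_id", "client-1", 10_000, clock::get);
        long second = lock(nodes, "key_resource_id", "client-2", 10_000, clock::get);
        return new long[] {first, second};
    }

    public static void main(String[] args) {
        long[] result = demo();
        System.out.println(result[0] + " " + result[1]); // 9790 -1
    }
}
```

The first client spends 210 ms collecting all five locks, so 9790 ms of validity remain; the second client reaches no quorum and gets -1 after cleaning up.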

Redisson implements a Redlock version of the lock. If you are interested, take a look.