What are the problems with Redis distributed locking and how to solve them?

Suppose you are buying a ticket through a ticketing app, but only one or a few tickets are left, while dozens of people are trying to buy at the same time. Ignoring concurrency for the moment, the normal logic is: first check whether any tickets remain; if so, complete the purchase and deduct the inventory; otherwise report that the tickets are insufficient and the purchase fails. The pseudocode is as follows:

void buyTicket() {
    int stockNum = byTicketMapper.selectStockNum();
    if (stockNum > 0) {
        //TODO ticket buying process....
        byTicketMapper.reduceStock(); // Reduce inventory
    } else {
        log.info("======>Tickets sold out<====");
    }
}

There is no logical problem with this code, but under concurrency it can go badly wrong. Suppose only one ticket is left and two users, A and B, click the purchase button at the same time. User A passes the check that the inventory is greater than 0 and starts executing the purchase logic, but for some reason user A's purchasing thread is briefly blocked.

While A is blocked, user B sends a purchase request, also passes the inventory-greater-than-0 check, completes the whole purchase flow, and deducts the inventory. The remaining inventory is now exactly 0 and no further purchase requests arrive. Then user A's blocked request wakes up; since it already passed the inventory check earlier, it finishes the purchase flow and deducts the inventory once more. The inventory is now -1, which is the well-known overselling problem.


To avoid this problem, we can guarantee concurrency safety with a lock. However, the JVM's built-in synchronized lock and JUC's ReentrantLock only guarantee safety within a single process. In practice, services are rarely deployed on a single node; they usually run on multiple nodes, and in a clustered deployment these two locks lose their meaning. In that case Redis can be used to implement a distributed lock.

setnx

In a clustered deployment, Redis is commonly used to implement distributed locks. Redis provides the setnx (SET if Not eXists) command, which sets a key only when that key does not already exist, which is exactly the behavior needed for acquiring a lock.

The code above can be reworked with Redis as follows: the purchasing thread first tries to acquire the lock; if it succeeds, it runs the whole purchase flow, deducts the inventory, and finally releases the lock; if it fails to acquire the lock, a friendly prompt is returned.

void buyTicket() {
    // Acquire the lock; setIfAbsent only succeeds if the key does not exist yet
    Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", "1");
    if (Boolean.TRUE.equals(lock)) {
        int stockNum = byTicketMapper.selectStockNum();
        if (stockNum > 0) {
            //TODO ticket buying process....
            byTicketMapper.reduceStock(); // Reduce inventory
        } else {
            log.info("======>Tickets sold out<====");
        }
        // Release the lock
        redisTemplate.delete("lock");
    } else {
        log.info("======>The system is busy, please wait!<====");
    }
}

Problem 1: Deadlock

After the changes above, is this enough? Not quite. Imagine that after thread A successfully acquires the lock, an exception is thrown while executing the purchase logic. The lock is never released, other threads can never acquire it, and the result is a serious deadlock.

To avoid this deadlock, wrap the business logic in try/finally and release the lock in the finally block, so the lock is released whether the business logic succeeds or fails.

void buyTicket() {
    Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", "1");
    if (Boolean.TRUE.equals(lock)) {
        try {
            int stockNum = byTicketMapper.selectStockNum();
            if (stockNum > 0) {
                //TODO ticket buying process....
                byTicketMapper.reduceStock(); // Reduce inventory
            } else {
                log.info("======>Tickets sold out<====");
            }
        } finally {
            redisTemplate.delete("lock"); // Release the lock even if the business logic throws
        }
    } else {
        log.info("======>The system is busy, please wait!<====");
    }
}

Is that the end of deadlocks? Not yet. If the Redis service crashes just as the program is about to execute the release logic, the release fails; when the Redis service restarts and restores its data, the lock key is still there, and the deadlock reappears.

To avoid this, give the lock an expiration time so that even if Redis restarts and restores the key, it will expire soon after. Note, however, that setting the key and its expiration must be done atomically; otherwise the deadlock can still occur.

// Not atomic: a deadlock is still possible.
Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", "1");
// If the process or Redis dies before the next line runs, the lock above never expires.
redisTemplate.expire("lock", Duration.ofSeconds(5L));

// Atomic: the key and its expiration are set by a single command.
Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", "1", Duration.ofSeconds(5L));
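For reference, the atomic form corresponds to a single Redis command of the shape SET lock 1 NX EX 5 (set only if the key is absent, with a 5-second expiration), rather than a SETNX followed by a separate EXPIRE.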

Problem 2: The lock is released by another thread

With the fixes above, the deadlock problem is avoided and the code can run safely under high concurrency. But suppose the lock expiration time is set to 5 seconds: thread A acquires the lock for its purchase request, yet the purchase flow takes 6 seconds, so thread A's lock expires while it is still running.

At that moment thread B acquires the lock and starts its own purchase flow, but thread A finishes first. When thread A releases the lock, the problem appears: thread A's own lock has already expired, so the key it deletes is actually thread B's lock. In other words, thread B's lock is released by thread A.


To deal with this, give each lock a unique identifier, such as a UUID or the thread ID. When releasing the lock, first check that the value stored in Redis matches your own identifier, and only then delete the key.

public void buyTicket() {
    String uuid = UUID.randomUUID().toString();
    // Set a unique identifier as the lock value
    Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", uuid, Duration.ofSeconds(5L));
    if (Boolean.TRUE.equals(lock)) {
        try {
            int stockNum = byTicketMapper.selectStockNum();
            if (stockNum > 0) {
                //TODO ticket buying process....
                byTicketMapper.reduceStock(); // Reduce inventory
            } else {
                log.info("======>Tickets sold out<====");
            }
        } finally {
            String lockValue = redisTemplate.opsForValue().get("lock");
            if (uuid.equals(lockValue)) { // Only delete the lock if it is still our own
                redisTemplate.delete("lock");
            }
        }
    } else {
        log.info("======>The system is busy, please wait!<====");
    }
}

Problem 3: Lock renewal

When setnx is used for distributed locking, one problem cannot be avoided: the lock may expire before the thread has finished its work. Even the code above, which guards against the lock being deleted by another thread, does not solve this completely. The problem lies in the following snippet:

String lockValue = redisTemplate.opsForValue().get("lock");
if (uuid.equals(lockValue)) { // Thread A's lock expires right after this check passes
    redisTemplate.delete("lock"); // Thread A ends up deleting thread B's lock
}

If thread A passes the if check but its lock expires just before the delete executes, and thread B acquires the lock in that window, then the key thread A deletes actually belongs to thread B.

To solve this completely, the lock can be renewed: a separate thread periodically checks whether the lock is still held and, if so, extends its expiration time so that it never expires while the work is still running. Ready-made middleware such as Redisson already provides this automatic renewal.
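As a side note, the check-and-delete window itself can be closed by doing the comparison and the deletion in a single Lua script (Redisson's own unlock also relies on Lua internally). Below is a minimal sketch using Spring Data Redis, assuming a String-serialized redisTemplate (e.g. StringRedisTemplate) and reusing the uuid from the example above; it removes the non-atomic release but still does not renew the lock:

// Compare-and-delete in one Lua script, so the check and the delete cannot be
// interleaved with another thread acquiring the lock in between.
// Requires: org.springframework.data.redis.core.script.DefaultRedisScript, java.util.Collections
String script =
        "if redis.call('get', KEYS[1]) == ARGV[1] then " +
        "    return redis.call('del', KEYS[1]) " +
        "else " +
        "    return 0 " +
        "end";
DefaultRedisScript<Long> releaseScript = new DefaultRedisScript<>(script, Long.class);
Long released = redisTemplate.execute(releaseScript, Collections.singletonList("lock"), uuid);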
Redisson

The most common use of Redisson is distributed locking. It not only guarantees thread safety in concurrent scenarios but also solves the lock renewal problem, and it is straightforward to use. Taking version 3.5.7 as an example, first configure the Redisson client, choosing the configuration that matches your Redis deployment mode. Once configured, the ticket-purchasing method above can be rewritten.

@Bean
public RedissonClient redissonClient() {
    Config config = new Config();
    // Stand-alone configuration
    config.useSingleServer().setAddress("redis://127.0.0.1:3306").setDatabase(0);
    //Master-slave configuration
    // config.useMasterSlaveServers().setMasterAddress("").addSlaveAddress("","");
    // Sentinel configuration
    // config.useSentinelServers().addSentinelAddress("").setMasterName("");
    // Cluster configuration
    //config.useClusterServers().addNodeAddress("");
    return Redisson.create(config);
}

Using Redisson is also straightforward: obtain an RLock object via getLock, lock with RLock's tryLock or lock method (implemented with Lua scripts under the hood), and after the lock is acquired and the inventory deducted, release it with unlock.

void buyTicket() {
    RLock lock = redissonClient.getLock("lock");
    if (lock.tryLock()) { // Get the lock
        try {
            int stockNum = byTicketMapper.selectStockNum();
            if (stockNum > 0) {
                //TODO ticket buying process....
                byTicketMapper.reduceStock(); // Reduce inventory
            } else {
                log.info("======>Tickets sold out<====");
            }
        } finally {
            lock.unlock(); //Release the lock
        }
    } else {
        log.info("======>The system is busy, please wait!<====");
    }
}
Watchdog mechanism

So how does Redisson implement lock renewal? Internally, Redisson has a watchdog mechanism, but the watchdog is not started every time a lock is acquired. Note that if you lock with tryLock(long waitTime, long leaseTime, TimeUnit unit) or lock(long leaseTime, TimeUnit unit) and pass a leaseTime other than -1, the watchdog does not take effect.

Once the watchdog is started, it periodically checks whether the thread holding the lock is still running; if so, every 10 seconds a Lua script extends the lock's expiration back to 30 seconds. The default watchdog timeout is 30 seconds and can be changed through the Config (lockWatchdogTimeout).
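As a reference, here is a small sketch (the wait and lease times are illustrative only) of which calls leave the watchdog active:

// Illustrative only: which Redisson locking calls start the watchdog.
// Assumes imports of org.redisson.api.RLock and java.util.concurrent.TimeUnit.
void watchdogExamples(RedissonClient redissonClient) throws InterruptedException {
    RLock lock = redissonClient.getLock("lock");

    // No leaseTime: the key gets the default 30-second TTL and the watchdog keeps renewing it.
    lock.lock();
    lock.unlock();

    // Explicit leaseTime: the key simply expires after 10 seconds, the watchdog is NOT started.
    lock.lock(10, TimeUnit.SECONDS);
    lock.unlock();

    // waitTime only (leaseTime stays -1 internally): the watchdog IS started.
    if (lock.tryLock(5, TimeUnit.SECONDS)) {
        lock.unlock();
    }

    // waitTime plus an explicit leaseTime: no watchdog, the lock expires after 10 seconds.
    if (lock.tryLock(5, 10, TimeUnit.SECONDS)) {
        lock.unlock();
    }
}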

private RFuture<Boolean> tryAcquireOnceAsync(long leaseTime, TimeUnit unit, final long threadId) {
    if (leaseTime != -1L) { // If leaseTime is not -1, then the watchdog cannot be used
        return this.tryLockInnerAsync(leaseTime, unit, threadId, RedisCommands.EVAL_NULL_BOOLEAN);
    } else {
        RFuture<Boolean> ttlRemainingFuture = this.tryLockInnerAsync(this.commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(), TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_NULL_BOOLEAN);
        ttlRemainingFuture.addListener(new FutureListener<Boolean>() {
            public void operationComplete(Future<Boolean> future) throws Exception {
                if (future.isSuccess()) {
                    Boolean ttlRemaining = (Boolean)future.getNow();
                    if (ttlRemaining) {
                        // Watchdog: schedule periodic expiration renewal
                        RedissonLock.this.scheduleExpirationRenewal(threadId);
                    }

                }
            }
        });
        return ttlRemainingFuture;
    }
}
private long lockWatchdogTimeout = 30000L; //default 30 seconds
private void scheduleExpirationRenewal(final long threadId) {
    if (!expirationRenewalMap.containsKey(this.getEntryName())) {
        //Execute renewal every 10 seconds
        Timeout task = this.commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
            public void run(Timeout timeout) throws Exception {
            // Renew the lock through LUA script
                RFuture<Boolean> future = RedissonLock.this.commandExecutor.evalWriteAsync(RedissonLock.this.getName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN, "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then redis.call('pexpire', KEYS[1], ARGV[1]); return 1; end; return 0;", Collections.singletonList(RedissonLock.this.getName()), new Object[]{RedissonLock.this.internalLockLeaseTime, RedissonLock.this.getLockName(threadId)});
                future.addListener(new FutureListener<Boolean>() {
                    public void operationComplete(Future<Boolean> future) throws Exception {
                        RedissonLock.expirationRenewalMap.remove(RedissonLock.this.getEntryName());
                        if (!future.isSuccess()) {
                            RedissonLock.log.error("Can't update lock " + RedissonLock.this.getName() + " expiration", future.cause());
                        } else {
                            if ((Boolean)future.getNow()) {
                                RedissonLock.this.scheduleExpirationRenewal(threadId);
                            }

                        }
                    }
                });
            }
        }, this.internalLockLeaseTime / 3L, TimeUnit.MILLISECONDS); // Executed every 10 seconds
        if (expirationRenewalMap.putIfAbsent(this.getEntryName(), task) != null) {
            task.cancel();
        }

    }
}

Problem 4: Lock loss caused by master-slave failover

Although Redisson solves the lock renewal problem, in a master-slave Redis architecture replication lag can cause another problem in extreme cases: a thread acquires the lock and the lock key is written to the master node, but the master goes down before the key has been replicated to the slave. When failover promotes the slave to master, the lock that the thread had acquired is gone.

To address this, Redis introduced the RedLock algorithm. Similar to the quorum mechanisms used by much distributed middleware, RedLock uses a majority vote to decide whether the locking operation succeeded.

RedLock
  • Lock

RedLock does not rely on any Redis cluster architecture, whether master-slave, sentinel, or cluster: each Redis service is independent and acts as its own master node.

While locking, RedLock records the time when acquisition starts and the time when it succeeds; the difference is the time it took to lock that instance. For example, suppose 5 Redis services are running and thread A sets the lock timeout to 5 seconds. Locking the first Redis service takes 1 second, and locking the second one takes another second.

After the second instance, 2 seconds have already elapsed, but only 2 of the 5 instances are locked, which is not yet a majority; at least one more must be locked for the acquisition to count as successful. Locking the third instance takes another second, so the total acquisition time is 3 seconds and the lock's remaining validity is only 2 seconds.
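In the official Redlock description, the remaining validity is computed as roughly: validity time ≈ lock TTL − time spent acquiring the majority − a small clock-drift allowance, which in the example above gives about 5 s − 3 s = 2 s.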

One thing needs special attention: when connecting to each Redis service, a timeout must be set so the client does not keep waiting for a response from an instance that is down. The official recommendation for this timeout is 5-50 milliseconds. When a connection times out, the client moves on and tries the next node.
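With Redisson, such client-side timeouts can be set per instance when the client is configured; a minimal sketch with illustrative values (tune them for your own environment):

// Illustrative only: small connect/response timeouts on one of the independent instances,
// in the spirit of the 5-50 ms recommendation above.
Config config = new Config();
config.useSingleServer()
        .setAddress("redis://192.168.36.128:6379")
        .setConnectTimeout(50) // connection establishment timeout, in milliseconds
        .setTimeout(50);       // command response timeout, in milliseconds
RedissonClient client = Redisson.create(config);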


  • Lock failed

If lock acquisition fails for some reason (a majority of instances could not be locked, or acquiring them took longer than the lock's validity time), the client should send an unlock to all Redis instances, even the ones on which it never managed to lock successfully.

  • Retry on failure

In concurrent scenarios RedLock can run into the following situation. Suppose three threads try to acquire the lock for the same ticket at the same time against five instances: thread A locks redis-1 and redis-2, thread B locks redis-3 and redis-4, and thread C locks redis-5. None of them holds a majority, there are no instances left to lock, and so no client can obtain the lock on this round.


When a client fails to obtain the lock, it should wait a random delay before retrying, so that multiple clients do not keep contending for the same resource in lockstep.
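A minimal sketch of this retry-with-random-delay idea, assuming the RedissonRedLock instance built in the example further below (the attempt count and delay bounds are illustrative only):

// Try to acquire the red lock a few times, sleeping a random interval between attempts
// so that competing clients do not keep colliding in lockstep.
// Assumes imports of org.redisson.RedissonRedLock and java.util.concurrent.ThreadLocalRandom.
boolean acquireWithJitter(RedissonRedLock redLock) throws InterruptedException {
    for (int attempt = 0; attempt < 3; attempt++) {
        if (redLock.tryLock()) {
            return true; // locked a majority of the instances
        }
        Thread.sleep(50 + ThreadLocalRandom.current().nextInt(150)); // random back-off
    }
    return false; // give up; the caller can show the "system is busy" prompt
}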

  • Release lock

Releasing the lock is simple: send the release command to all Redis instances, without worrying about whether the lock was successfully acquired on each of them.

Having understood RedLock, let's finally rework the ticket-purchasing logic. First, define a RedissonClient bean for each Redis instance; there should be at least three instances.

@Bean
public RedissonClient redissonClient() {
    Config config = new Config();
    // Stand-alone configuration
    config.useSingleServer().setAddress("redis://192.168.36.128:3306").setDatabase(0);
    return Redisson.create(config);
}

@Bean
public RedissonClient redissonClient2() {
    Config config = new Config();
    // Stand-alone configuration
    config.useSingleServer().setAddress("redis://192.168.36.130:3306").setDatabase(0);
    return Redisson.create(config);
}

@Bean
public RedissonClient redissonClient3() {
    Config config = new Config();
    // Stand-alone configuration
    config.useSingleServer().setAddress("redis://192.168.36.131:3306").setDatabase(0);
    return Redisson.create(config);
}

After the configuration is complete, acquire the same-named lock from each instance, then use the tryLock and unlock methods provided by RedissonRedLock to lock and unlock.

void buyTicket(){
    RLock lock = redissonClient.getLock("lock");
    RLock lock2 = redissonClient2.getLock("lock");
    RLock lock3 = redissonClient3.getLock("lock");
    RedissonRedLock redLock = new RedissonRedLock(lock, lock2, lock3); // Combine the locks from the three independent instances into one red lock
    if (redLock.tryLock()) {
        try {
            int stockNum = byTicketMapper.selectStockNum();
            if (stockNum > 0) {
                //TODO ticket buying process....
                byTicketMapper.reduceStock(); // Reduce inventory
            } else {
                log.info("======>Tickets sold out<====");
            }
        } finally {
            redLock.unlock(); //Release the lock
        }
    } else {
        log.info("======>The system is busy, please wait!<====");
    }
}

Summary

Using Redis for distributed locking is not as simple as it might seem. In high-concurrency scenarios you have to consider deadlocks, locks being deleted by other threads, lock renewal, lock loss on failover, and so on, and address each with the corresponding solution to keep the system safe. There may still be omissions or errors in this article, which will be followed up on.

Source: juejin.cn/post/7174312109017137165
