Only changing one line of statement, so many locks?

Introduction

In this article, I want to talk about how MySQL locks are added. Why do I want to talk about this? Mainly because we use locks more or less in our business. After all, locks are the key to ensuring the security of our data. However, due to lack of understanding of the principle, we may often cause potential performance problems in “deliberate” or “unintentional” usage scenarios, which may lead to a reduction in processing capabilities, or may bring down our DB, so we need to Only by having a comprehensive understanding of the principles and usage scenarios of locks can we better control them and avoid unnecessary business risks.

Table of contents

1 When to add

2 How to add

3. When to add and when not to add?

We mainly discuss this issue from three aspects:

When will it be added?

How to add?

When should it be added and when should it not be added?

01

When to add

1.1 Display Lock

MySQL’s locking can be divided into explicit locking and implicit locking. Explicit locking is easier for us to identify because it is often directly reflected in SQL. Common explicit locking statements mainly include:

select … for update;

select … in share mode;

The difference between the two is that the former adds an exclusive lock, and the latter adds a shared lock. After adding an exclusive lock, subsequent writing and reading operations on the range of data will be blocked. Another shared lock will not block reading, but block writing, but this often causes some problems, such as e-commerce scenarios. When updating inventory, we often need to lock the product data first to ensure consistent data updates. If two threads update inventory concurrently at this time, it may cause data update abnormalities.

Therefore, we often use select … for update to lock data in business. There are also some less commonly used locking methods, such as:

Global lock: Flush tables with read lock, mainly used when performing logical backup

Table lock: lock tables … read/write

1.2 Implicit lock

Implicit locks are what we need to pay special attention to. Many “pits” are caused by the existence of implicit locks. Invisible locks are often the most fatal.

In addition to table locks, table-level locks also include metadata locks:

MDL read lock will be added when adding, deleting, modifying and checking;

When making changes to the table structure, an MDL write lock will be added;

The problem this will bring is that when we want to add indexes to the table or modify the table structure, due to the MDL write lock, it will block our normal online read and write requests. At this time, the upstream failure retry mechanism may be triggered. , then it is likely that there will be an avalanche of requests and the DB will be hung up.

The other thing is row locks and gap locks that are closely related to our daily business. When we are adding, deleting, or modifying, we will add row locks or gap locks according to the current isolation level. At this time, we need to pay attention to whether it will affect the reading and writing of normal business. Performance, another risk is that the lock range may be too large, blocking requests, and triggering upstream retries, causing service avalanche and DB hangs.

1.3 Will it be locked?

Speaking of this, some students may have questions. If your additions, deletions and modifications are locked, then the performance will be very poor when I read. Especially in business scenarios with many reads and many writes, when my read request comes in, DB Weren’t you checked by me every minute? In fact, the innodb engine here uses a mvcc technology, that is, multi-version concurrency control. The principle is to record the updated transaction ID and corresponding data in the undolog while the data is updated, and maintain an active transaction ID of Readview, so that when a transaction When executing, it is easy to know what data you can see and what data you cannot see. At this time, reading the data will naturally not be affected by the lock and can be read normally.

02

How to add

The discussion here about how to add locks is actually to understand the type and scope of locks, that is, what locks are used and where are they added? Before discussing this issue, let’s take a look at the transaction isolation level:

Read uncommitted;

Read submitted;

Repeatable reading;

Serialization;

Why do you say this? Because the isolation level also affects our locking, read committed solves the problem of dirty reads, but does not solve the problem of phantom reads; repeatable reads solve the problem of phantom reads by introducing gap locks, which means that different isolation levels are used The locks are not the same, but one thing is clear: the higher the isolation level, the more strict the use of locks. Repeatable read is the default transaction isolation level, but the isolation level set online is often read committed, mainly because this level is sufficient and can have better concurrency performance. The scope of our discussion next is mainly Read Committed (RC) and Repeatable Read (RR).

Here is a detailed analysis based on the corresponding rules:

Principle 1: The basic unit of locking is next-key lock. I hope you still remember that next-key lock is a front-open and back-close interval.

Principle 2: Only objects accessed during the search process will be locked.

Optimization 1: For equivalent queries on the index, when locking the unique index, the next-key lock degenerates into a row lock.

Optimization 2: When traversing to the right and the last value does not meet the equality condition for equivalent queries on the index, the next-key lock degenerates into a gap lock.

A bug: a range query on a unique index will access the first value that does not meet the condition.

There are two other points to note:

The lock is added to the index;

The gap lock is shared rather than exclusive.

2.1 RC

Next, we will discuss them separately, which may be a bit lengthy and require your patience to read them.

The first is the RC level. The locking rules at this level are relatively simple because they only involve row locks. First, we design a table.

CREATE TABLE `t_db_lock` (
  `id` int(11) NOT NULL,
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `a` (`a`)
) ENGINE=InnoDB;


insert into t_db_lock values(0,0,0),(5,5,5),(10,10,10);

2.2 Main equivalent exists

sessionA	sessionB	sessionC
begin;update t_db_lock set id=id + 1 where id = 0;
	insert into t_db_lock values(1,1,1) [block]
		update t_db_lock set id=id + 1 where id = 5;[success]

You can see that sessionA is updating the data on the primary key at this time, updating the primary key value of the current record to 1. At this time, the db will add row locks on id=1 and 0, that is, for the id at this time Updates will be blocked;

Therefore, when sessionB wants to insert the record with id=1, it will be blocked;

But since sessionC updates the record with id=5, it can be executed successfully.

2.3 Non-unique equivalents

sessionA	sessionB	sessionC
begin;update t_db_lock set b=b + 1 where a = 0;
	update t_db_lock set b=b + 1 where id = 0; [block]
		update t_db_lock set b=b + 1 where b = 0;[block]

sessionA updates data according to the judgment conditions of ordinary indexes. Since the row lock is added to the index, the index data related to column a is locked at this time;

But why is my update of the data with id=0 blocked at this time? Because at this time, in addition to adding the index on a, there is also a table update operation. The index on the primary key accessed at this time will also be locked. Because it is the same row, the update at this time is also blocked;

In the same way, when we update the data of b=0, the primary key index corresponding to it is the same piece of data, so the update is also blocked at this time, but if we update the data of b=5 at this time, can be updated successfully.

2.4 Primary key equivalent does not exist

sessionA	sessionB	sessionC
begin;select * from t_db_lock where id = 2 for update;
	update t_db_lock set b=b + 1 where a = 0; [success]
		update t_db_lock set b=b + 1 where b = 0;[success]

SessionA added a lock with id 2. At this time, this row record does not exist, and the row lock was not successfully added, so it will not block requests from other sessions;

sessionB was executed successfully;

sessionC was executed successfully.

2.5 No index equivalent value does not exist

sessionA	sessionB	sessionC
begin;select * from t_db_lock where b=3 for update;
	update t_db_lock set b=b + 1 where a = 0; [success]
		update t_db_lock set b=b + 1 where b = 5;[success]

This situation is inconsistent with the primary key and other values. Since the corresponding lock record is not found, subsequent update operations can be executed successfully.

2.6 Primary key range

sessionA	sessionB	sessionC
begin;select * from t_db_lock where id >= 0 and id <= 5 for update;
	update t_db_lock set b=b + 1 where a = 0; [block]
		insert into t_db_lock values(1,1,1) [success]

sessionA is locked based on the range, and the two rows of data id=0 and 5 are locked;

sessionB is blocked due to updating the locked data of the id=0 row;

sessionC Since the record with id=1 does not exist before, it can be inserted normally. Is this scene a bit familiar? It is what we call phantom reading. If you execute select * from t_db_lock where id > in sessionA at this time, = 0 and id <= 5, you will find one more piece of data;

2.7 RR

The repeatable read level here mainly discusses the gap lock locking scenario. This locking situation will be much more complicated than the read committed isolation level; set session transaction isolation level repeatable read.

2.8 Primary key equivalent values exist

sessionA	sessionB	sessionC
begin;select * from t_db_lock where id = 5 for update;
	insert into t_db_lock values(2,2,2); [success]
		insert into t_db_lock values(6,6,6); [success]

sessionA locks the existing id=5 row. According to the locking rules, the unique index will degenerate into a row lock, so it only locks the id=5 row. In fact, this is easy to understand, since it is already a unique index. , then there will be no phantom reading, so the phantom reading only depends on whether the row exists, so I only need to lock the row to ensure that it will no longer be written;

If sessionB and sessionC are not within the lock range, the insertion is successful.

2.9 Non-unique equivalents

sessionA	sessionB	sessionC
begin;select * from t_db_lock where a = 5 for update;
	insert into t_db_lock values(6,11,6); [block]
		insert into t_db_lock values(11,10,6); [success]

sessionA locks the existing a=5 row record. Since it is a non-unique index, according to the locking rules, first scan the a index and add next-key lock (0,5], and then traverse right to the first does not meet the conditions (according to rule 5, the range query on the unique index will access the first value that does not meet the conditions), and degenerates into a gap lock, so the locking range is (5,10), and the overall lock is The range is (0,10); and for update, a lock will be added to the index range of the primary key, that is, (0,10);

sessionB is within the lock range of the primary key index, so it is blocked;

sessionC is not in the range of ordinary index and primary key index at this time, so the execution is successful;

It can be seen here that for non-unique equal value queries, the locking range is larger than the primary key equivalent value, so we need to pay attention to this range when locking non-unique indexes.

2.10 Primary key equivalent does not exist

sessionA	sessionB	sessionC
begin;select * from t_db_lock where id = 3 for update;
	insert into t_db_lock values(2,2,2); [block]
		insert into t_db_lock values(6,6,6); [success]

sessionA has added a row lock to the record with id=3 at this time, but since the record of row 3 does not exist at this time, this range will be locked. According to the locking principle, traverse to the right and the last value does not satisfy Equivalent conditions, next-key lock degenerates into gap lock, and the lock range is (0,5) at this time;

sessionB is within the locking range and is therefore blocked;

sessionC is not within this locking range, and the locking is successful.

Why is a range lock added here? In fact, the main solution is to solve the phantom read problem. Assuming that there is no lock in this range, then sessionB is executed successfully at T1 time, and select * from t_db_lock where id = 3 is executed again at T2 time. , you will find that the results that were not queried before can now be queried, just like an illusion; in order to avoid this phantom reading, it is necessary to lock in this range.

2.11 Non-unique equivalent value does not exist

sessionA	sessionB	sessionC
begin;select * from t_db_lock where a = 3 for update;
	insert into t_db_lock values(3,5,5); [block]
		insert into t_db_lock values(6,5,5); [success]

sessionA locks the row a=3. Since this row does not exist in the db, next-key lock will also be added. And because the locks are all added to the index, the a will be added to the index. Range lock of (0,5). But there is a strange phenomenon here. When a=5, if id<5 it will block, if id>5 it will succeed. From the results, it seems that the lock on a is biased at this time and is not strict. In the sense of a=5, the corresponding inserted record will be locked.

2.12 Primary key range

sessionA	sessionB	sessionC
select * from t_db_lock where id >= 5 and id < 6 for update;
	insert into t_db_lock values(3,3,3); [success]
		insert into t_db_lock values(10,10,10); [block]

SessionA performs range query locking, which is semantically equivalent to select * from t_db_lock where id = 5 for update, but the actual locking situation is still very different. First, id >= 5 is queried according to the equivalent query. The row =5 is locked as (0,5]. Since it is a unique index, it degenerates into a row lock. Therefore, a lock is added on the row id=5. Then, query to the right to find the first value that does not meet the condition. That is, for the line id=10, add next-key lock(5,10]. Since this is not an equivalent query, there will be no degeneration into gap lock, so the overall lock range is [5,10];

sessionB is not within the lock range and the insertion is successful;

sessionC is in the lock and the insertion fails. Note that it is blocked instead of reporting a primary key conflict.

2.13 Non-unique range

sessionA	sessionB	sessionC
begin;select * from t_db_lock where a >= 5 and a < 6 for update;
	insert into t_db_lock values(3,3,3); [block]
		insert into t_db_lock values(10,10,10); [block]

The main difference between the sessionA locking range and the primary key index is that it does not degenerate into row locks in the range of (0, 5], so the overall locking range is (0, 10]

2.14 No index equivalent value does not exist

sessionA	sessionB	sessionC
begin;select * from t_db_lock where b = 6 for update;
	insert into t_db_lock values(3,3,3); [block]
		insert into t_db_lock values(10,10,10); [block]

The locked record in sessionA is b=6. Since b has not created an index, all records on the b index will be locked. Since it is locked for update, it is considered to be updated on the main table, so the main table The related records of the table are also locked, which will cause the table to be locked during the locking period, and any update operation will not be successful. This will be a very dangerous operation online and may cause the DB to be destroyed.

03

When should I add it and when should I not add it?

Through the above analysis, we should have a general understanding of the types of locks and the scope of locks in statements. We can know that pessimistic locks need to be used with caution, because it is likely that simple SQL will drag down the performance of the DB and affect the line. On the quality of service, when should it be added and when should it not be added?

I think for db concurrency scenarios, we can think about it this way:

Give priority to using optimistic locking as much as possible;

If you need to use pessimistic locking, you must add an index to the locked key;

Confirm the isolation level of the db, analyze possible causes of conflicts or deadlocks in SQL, and avoid SQL being blocked for a long time;

In fact, there is no silver bullet for the mutual exclusion scheme of db. Solutions must be formulated according to specific business scenarios. However, we can identify some pitfalls that may arise in advance, avoid low-level errors, and have the ability to optimize. He, this is a good way to keep improving yourself.

-End-

Original author｜Li Bin

Is there anything else you want to talk about about the MySQL lock mechanism? Comments welcome. We will select the most valuable comment and give away a QQ Family plush doll (see picture below). The lottery will be drawn at 12 noon on November 13th.

Welcome to join the Tencent Cloud Developer Community. Exclusive community coupons, celebrity communication circles, first-hand event notifications, and limited edition Goose Factory peripherals are waiting for you~

(Long press the picture to scan the code immediately)