Multi-version Concurrency Control (MVCC)
Article directory
- Multi-version Concurrency Control (MVCC)
-
- 1 Overview
- 2. Snapshot reading and current reading
-
- 2.1 Snapshot reading
- 2.2 Current reading
- 3. MVCC implementation principle of ReadView
-
- 3.1 ReadView Overview
- 3.2 Design ideas
- 3.3 ReadView rules
- 3.4 MVCC overall operation process
- 4. Give examples
-
- 4.1 READ COMMITTED isolation level
- 4.2 REPEATABLE READ isolation level
- 5. Summary
1. Overview
MVCC (Multiversion Concurrency Control), multi-version concurrency control. As the name suggests, MVCC implements database concurrency control
through multiple version management of data rows. This technology ensures that consistent read
operations are performed under InnoDB’s transaction isolation level. In other words, it is to query some rows that are being updated by another transaction, and you can see the values before they were updated, so that you do not have to wait for another transaction to release the lock when doing the query.
2. Snapshot reading and current reading
The implementation of MVCC in MySQL InnoDB is mainly to improve database concurrency performance and use a better way to handle read-write conflicts
, so that even when there are read-write conflicts, No locking
, non-blocking concurrent reading, and this reading refers to snapshot reading
, not current reading
. The current read is actually a locking operation and an implementation of pessimistic locking. The essence of MVCC is a way of using optimistic locking thinking.
2.1 Snapshot reading
Snapshot read, also called consistent read, reads snapshot data. Simple SELECTs without locks are snapshot reads, that is, non-blocking reads without locks; for example:
SELECT * FROM player WHERE ...
The premise of snapshot reading is that the isolation level is not the serial level. Snapshot reading at the serial level will degenerate into current reading.
2.2 Current Reading
The current read is the latest version of the record (the latest data, not the historical version of the data). When reading, it must also ensure that other concurrent transactions cannot modify the current record, and the read record will be locked. A locked SELECT, or addition, deletion, or modification of data will result in current reading. for example:
SELECT * FROM student LOCK IN SHARE MODE; # Shared lock SELECT * FROM student FOR UPDATE; # Exclusive lock INSERT INTO student values ... # Exclusive lock DELETE FROM student WHERE ... # Exclusive lock UPDATE student SET ... # Exclusive lock
3. MVCC implementation principle of ReadView
The implementation of MVCC relies on: Hidden Field, Undo Log, Read View.
3.1 ReadView Overview
In the MVCC mechanism, multiple transactions updating the same row record will generate multiple historical snapshots, and these historical snapshots are saved in the Undo Log. If a transaction wants to query this row record, which version of the row record needs to be read? At this time, ReadView needs to be used, which helps us solve the row visibility problem.
ReadView is the read view generated by transaction A when using the MVCC mechanism to perform snapshot read operations. When a transaction is started, a current snapshot of the database system will be generated. InnoDB constructs an array for each transaction to record and maintain the ID of the current active transaction
of the system (“active” refers to, Started but not submitted yet).
3.2 Design ideas
For transactions using the READ UNCOMMITTED
isolation level, since records modified by uncommitted transactions can be read, it is enough to directly read the latest version of the record.
For transactions using the SERIALIZABLE
isolation level, InnoDB requires locking to access records.
Transactions using the READ COMMITTED
and REPEATABLE READ
isolation levels must ensure that records modified by the committed
transaction are read. If another transaction has modified the record but has not yet submitted it, it cannot directly read the latest version of the record. The core problem is to judge which version in the version chain is visible to the current transaction. This is ReadView The main problem to be solved.
ReadView mainly contains 4 important contents, which are as follows:
-
creator_trx_id
, the transaction ID that created this Read ViewA transaction ID will be assigned to a transaction only when changes are made to records in the table (when INSERT, DELETE, and UPDATE statements are executed). Otherwise, the transaction ID value in a read-only transaction defaults to 0.
-
trx_ids
, which represents thetransaction id list
of active read and write transactions in the current system when the ReadView is generated. -
up_limit_id
, the smallest transaction ID among active transactions -
low_limit_id
, indicates the id value that should be assigned to the next transaction in the system when generating ReadView. low_limit_id is the maximum transaction ID value of the system. It should be noted here that it is the transaction ID in the system and needs to be distinguished from the active transaction ID.low_limit_id is not the maximum value in trx_ids, transaction ids are allocated incrementally. For example, there are now three transactions with IDs 1, 2, and 3, and then the transaction with ID 3 is submitted. Then when a new read transaction generates ReadView, trx_ids includes 1 and 2, the value of up_limit_id is 1, and the value of low_limit_id is 4.
3.3 ReadView Rules
With this ReadView, when accessing a record, you only need to follow the steps below to determine whether a certain version of the record is visible.
- If the value of the trx_id attribute of the accessed version is the same as the value of creator_trx_id in ReadView, it means that the current transaction is accessing its own modified records, so this version can be accessed by the current transaction.
- If the value of the trx_id attribute of the accessed version is less than the up_limit_id value in ReadView, it indicates that the transaction that generated this version has been committed before the current transaction generates ReadView, so this version can be accessed by the current transaction.
- If the value of the trx_id attribute of the accessed version is greater than or equal to the low_limit_id value in ReadView, it means that the transaction that generated this version was started after the current transaction generated ReadView, so this version cannot be accessed by the current transaction.
- If the trx_id attribute value of the accessed version is between ReadView’s up_limit_id and low_limit_id, then you need to determine whether the trx_id attribute value is in the trx_ids list.
- If it is, it means that the transaction that generated this version was still active when the ReadView was created, and this version cannot be accessed.
- If not, it means that the transaction that generated this version when the ReadView was created has been committed and this version can be accessed.
3.4 MVCC overall operation process
When querying a record, how does the system find it through MVCC:
- First, get the version number of the transaction itself, which is the transaction ID;
- GetReadView;
- Query the data obtained and then compare it with the transaction version number in ReadView;
- If the ReadView rules are not met, you need to obtain a historical snapshot from the Undo Log;
- Finally, data that conforms to the rules is returned.
If the last version in the version chain is also invisible, the record is completely invisible to the transaction and the query result does not include the record.
When the isolation level is Read Committed
(Read Committed), each SELECT query in a transaction will re-obtain a Read View.
Transaction | Description |
---|---|
begin; | |
select * from student where id >2; | Get a Read View |
… | |
select * from student where id >2; | Get a Read View |
commit; |
At this time, the same query statement will obtain the Read View again. If the Read View is different, non-repeatable reads or phantom reads may occur.
When the isolation level is Repeatable Read
, non-repeatable reads are avoided. This is because a transaction will only obtain a Read View once during the first SELECT, and all subsequent SELECTs will be repeated. Use this Read View as shown in the following table:
4. Examples
4.1 READ COMMITTED isolation level
READ COMMITTED
: Generate a ReadView every time before reading data.
There are now two transactions with transaction ID
10
and 20
being executed:
# Transaction 10 BEGIN; UPDATE student SET name="李思" WHERE id=1; UPDATE student SET name="王五" WHERE id=1; #Transaction 20 BEGIN; # Updated some records in other tables ...
At this moment, the version linked list obtained by the record whose id
is 1
in the student table is as follows:
Assume that a transaction using the READ COMMITTED
isolation level is started:
# Use READ COMMITTED isolation level transactions BEGIN; # SELECT1: Transaction 10 and 20 are not submitted SELECT * FROM student WHERE id = 1; # The value of the column name obtained is 'Zhang San'
After that, we submit the transaction with transaction id
as 10
:
# Transaction 10 BEGIN; UPDATE student SET name="李思" WHERE id=1; UPDATE student SET name="王五" WHERE id=1; COMMIT;
Then go to the transaction where the transaction id
is 20
and update the record where the id
is 1
in the table student:
# Transaction 20 BEGIN; # Updated some records in other tables ... UPDATE student SET name="Qian Qi" WHERE id=1; UPDATE student SET name="Song Ba" WHERE id=1;
At this moment, the version chain of the record with id
as 1
in table student looks like this:
Then continue to search for the record with id 1 in the transaction that just used the READ COMMITTED
isolation level, as follows:
# Use READ COMMITTED isolation level transactions BEGIN; # SELECT1: Neither Transaction 10 nor 20 was submitted. SELECT * FROM student WHERE id = 1; # The value of the column name obtained is 'Zhang San' # SELECT2: Transaction 10 submitted, Transaction 20 not submitted SELECT * FROM student WHERE id = 1; # The value of the column name obtained is '王五'
4.2 REPEATABLE READ isolation level
For transactions using the REPEATABLE READ
isolation level, a ReadView will only be generated when the query statement is executed for the first time, and subsequent queries will not generate it repeatedly.
For example, there are two transactions in the system with transaction ID
10
and 20
being executed:
# Transaction 10 BEGIN; UPDATE student SET name="李思" WHERE id=1; UPDATE student SET name="王五" WHERE id=1; #Transaction 20 BEGIN; # Updated some records in other tables ...
At this moment, the version linked list obtained by the record whose id
is 1
in the student table is as follows:
Assume that a transaction using the REPEATABLE READ
isolation level is started:
# Use REPEATABLE READ isolation level transactions BEGIN; # SELECT1: Transaction 10 and 20 are not submitted SELECT * FROM student WHERE id = 1; # The value of the column name obtained is 'Zhang San'
After that, we submit the transaction with transaction id
as 10
, like this:
# Transaction 10 BEGIN; UPDATE student SET name="李思" WHERE id=1; UPDATE student SET name="王五" WHERE id=1; COMMIT;
Then go to the transaction where the transaction id
is 20
and update the record where the id
is 1
in the table student:
# Transaction 20 BEGIN; # Updated some records in other tables ... UPDATE student SET name="Qian Qi" WHERE id=1; UPDATE student SET name="Song Ba" WHERE id=1;
At this moment, the version chain length of the record whose id
is 1
in table student is as follows:
Then continue to search for the record with id
as 1
in the transaction that just used the REPEATABLE READ
isolation level, as follows:
# Use REPEATABLE READ isolation level transactions BEGIN; # SELECT1: Neither Transaction 10 nor 20 was submitted. SELECT * FROM student WHERE id = 1; # The value of the column name obtained is 'Zhang San' # SELECT2: Transaction 10 submitted, Transaction 20 not submitted SELECT * FROM student WHERE id = 1; # The value of the obtained column name is still 'Zhang San'
5. Summary
Here is an introduction to how transactions with the two isolation levels of MVCC
, READ COMMITTD
and REPEATABLE READ
, access the recorded version chain when performing snapshot read operations. process. This allows the read-write
and write-read
operations of different transactions to be executed concurrently, thereby improving system performance.
The core point lies in the principle of ReadView. A big difference between the two isolation levels of READ COMMITTD
and REPEATABLE READ
is the different timing of generating ReadView:
READ COMMITTD
will generate a ReadView before each normal SELECT operation.REPEATABLE READ
Only generates a ReadView before the first normal SELECT operation, and reuses this ReadView for subsequent query operations.