[MySQL Transactions] Multi-version Concurrency Control (MVCC)

Multi-version Concurrency Control (MVCC)

Article directory

  • Multi-version Concurrency Control (MVCC)
    • 1 Overview
    • 2. Snapshot reading and current reading
      • 2.1 Snapshot reading
      • 2.2 Current reading
    • 3. MVCC implementation principle of ReadView
      • 3.1 ReadView Overview
      • 3.2 Design ideas
      • 3.3 ReadView rules
      • 3.4 MVCC overall operation process
    • 4. Give examples
      • 4.1 READ COMMITTED isolation level
      • 4.2 REPEATABLE READ isolation level
    • 5. Summary

1. Overview

MVCC (Multiversion Concurrency Control), multi-version concurrency control. As the name suggests, MVCC implements database concurrency control through multiple version management of data rows. This technology ensures that consistent read operations are performed under InnoDB’s transaction isolation level. In other words, it is to query some rows that are being updated by another transaction, and you can see the values before they were updated, so that you do not have to wait for another transaction to release the lock when doing the query.

2. Snapshot reading and current reading

The implementation of MVCC in MySQL InnoDB is mainly to improve database concurrency performance and use a better way to handle read-write conflicts, so that even when there are read-write conflicts, No locking, non-blocking concurrent reading, and this reading refers to snapshot reading, not current reading. The current read is actually a locking operation and an implementation of pessimistic locking. The essence of MVCC is a way of using optimistic locking thinking.

2.1 Snapshot reading

Snapshot read, also called consistent read, reads snapshot data. Simple SELECTs without locks are snapshot reads, that is, non-blocking reads without locks; for example:

SELECT * FROM player WHERE ...

The premise of snapshot reading is that the isolation level is not the serial level. Snapshot reading at the serial level will degenerate into current reading.

2.2 Current Reading

The current read is the latest version of the record (the latest data, not the historical version of the data). When reading, it must also ensure that other concurrent transactions cannot modify the current record, and the read record will be locked. A locked SELECT, or addition, deletion, or modification of data will result in current reading. for example:

SELECT * FROM student LOCK IN SHARE MODE; # Shared lock
SELECT * FROM student FOR UPDATE; # Exclusive lock
INSERT INTO student values ... # Exclusive lock
DELETE FROM student WHERE ... # Exclusive lock
UPDATE student SET ... # Exclusive lock

3. MVCC implementation principle of ReadView

The implementation of MVCC relies on: Hidden Field, Undo Log, Read View.

3.1 ReadView Overview

In the MVCC mechanism, multiple transactions updating the same row record will generate multiple historical snapshots, and these historical snapshots are saved in the Undo Log. If a transaction wants to query this row record, which version of the row record needs to be read? At this time, ReadView needs to be used, which helps us solve the row visibility problem.

ReadView is the read view generated by transaction A when using the MVCC mechanism to perform snapshot read operations. When a transaction is started, a current snapshot of the database system will be generated. InnoDB constructs an array for each transaction to record and maintain the ID of the current active transaction of the system (“active” refers to, Started but not submitted yet).

3.2 Design ideas

For transactions using the READ UNCOMMITTED isolation level, since records modified by uncommitted transactions can be read, it is enough to directly read the latest version of the record.

For transactions using the SERIALIZABLE isolation level, InnoDB requires locking to access records.

Transactions using the READ COMMITTED and REPEATABLE READ isolation levels must ensure that records modified by the committed transaction are read. If another transaction has modified the record but has not yet submitted it, it cannot directly read the latest version of the record. The core problem is to judge which version in the version chain is visible to the current transaction. This is ReadView The main problem to be solved.

ReadView mainly contains 4 important contents, which are as follows:

  1. creator_trx_id , the transaction ID that created this Read View

    A transaction ID will be assigned to a transaction only when changes are made to records in the table (when INSERT, DELETE, and UPDATE statements are executed). Otherwise, the transaction ID value in a read-only transaction defaults to 0.

  2. trx_ids, which represents the transaction id list of active read and write transactions in the current system when the ReadView is generated.

  3. up_limit_id , the smallest transaction ID among active transactions

  4. low_limit_id, indicates the id value that should be assigned to the next transaction in the system when generating ReadView. low_limit_id is the maximum transaction ID value of the system. It should be noted here that it is the transaction ID in the system and needs to be distinguished from the active transaction ID.

    low_limit_id is not the maximum value in trx_ids, transaction ids are allocated incrementally. For example, there are now three transactions with IDs 1, 2, and 3, and then the transaction with ID 3 is submitted. Then when a new read transaction generates ReadView, trx_ids includes 1 and 2, the value of up_limit_id is 1, and the value of low_limit_id is 4.

3.3 ReadView Rules

With this ReadView, when accessing a record, you only need to follow the steps below to determine whether a certain version of the record is visible.

  1. If the value of the trx_id attribute of the accessed version is the same as the value of creator_trx_id in ReadView, it means that the current transaction is accessing its own modified records, so this version can be accessed by the current transaction.
  2. If the value of the trx_id attribute of the accessed version is less than the up_limit_id value in ReadView, it indicates that the transaction that generated this version has been committed before the current transaction generates ReadView, so this version can be accessed by the current transaction.
  3. If the value of the trx_id attribute of the accessed version is greater than or equal to the low_limit_id value in ReadView, it means that the transaction that generated this version was started after the current transaction generated ReadView, so this version cannot be accessed by the current transaction.
  4. If the trx_id attribute value of the accessed version is between ReadView’s up_limit_id and low_limit_id, then you need to determine whether the trx_id attribute value is in the trx_ids list.
    • If it is, it means that the transaction that generated this version was still active when the ReadView was created, and this version cannot be accessed.
    • If not, it means that the transaction that generated this version when the ReadView was created has been committed and this version can be accessed.

3.4 MVCC overall operation process

When querying a record, how does the system find it through MVCC:

  1. First, get the version number of the transaction itself, which is the transaction ID;
  2. GetReadView;
  3. Query the data obtained and then compare it with the transaction version number in ReadView;
  4. If the ReadView rules are not met, you need to obtain a historical snapshot from the Undo Log;
  5. Finally, data that conforms to the rules is returned.

If the last version in the version chain is also invisible, the record is completely invisible to the transaction and the query result does not include the record.

When the isolation level is Read Committed (Read Committed), each SELECT query in a transaction will re-obtain a Read View.

Transaction Description
begin;
select * from student where id >2; Get a Read View
select * from student where id >2; Get a Read View
commit;

At this time, the same query statement will obtain the Read View again. If the Read View is different, non-repeatable reads or phantom reads may occur.

When the isolation level is Repeatable Read, non-repeatable reads are avoided. This is because a transaction will only obtain a Read View once during the first SELECT, and all subsequent SELECTs will be repeated. Use this Read View as shown in the following table:

4. Examples

4.1 READ COMMITTED isolation level

READ COMMITTED: Generate a ReadView every time before reading data.

There are now two transactions with transaction ID 10 and 20 being executed:

# Transaction 10
BEGIN;
UPDATE student SET name="李思" WHERE id=1;
UPDATE student SET name="王五" WHERE id=1;
#Transaction 20
BEGIN;
# Updated some records in other tables
...

At this moment, the version linked list obtained by the record whose id is 1 in the student table is as follows:

Assume that a transaction using the READ COMMITTED isolation level is started:

# Use READ COMMITTED isolation level transactions
BEGIN;
# SELECT1: Transaction 10 and 20 are not submitted
SELECT * FROM student WHERE id = 1; # The value of the column name obtained is 'Zhang San'

After that, we submit the transaction with transaction id as 10:

# Transaction 10
BEGIN;
UPDATE student SET name="李思" WHERE id=1;
UPDATE student SET name="王五" WHERE id=1;
COMMIT;

Then go to the transaction where the transaction id is 20 and update the record where the id is 1 in the table student:

# Transaction 20
BEGIN;
# Updated some records in other tables
...
UPDATE student SET name="Qian Qi" WHERE id=1;
UPDATE student SET name="Song Ba" WHERE id=1;

At this moment, the version chain of the record with id as 1 in table student looks like this:

Then continue to search for the record with id 1 in the transaction that just used the READ COMMITTED isolation level, as follows:

# Use READ COMMITTED isolation level transactions
BEGIN;
# SELECT1: Neither Transaction 10 nor 20 was submitted.
SELECT * FROM student WHERE id = 1; # The value of the column name obtained is 'Zhang San'
# SELECT2: Transaction 10 submitted, Transaction 20 not submitted
SELECT * FROM student WHERE id = 1; # The value of the column name obtained is '王五'

4.2 REPEATABLE READ isolation level

For transactions using the REPEATABLE READ isolation level, a ReadView will only be generated when the query statement is executed for the first time, and subsequent queries will not generate it repeatedly.

For example, there are two transactions in the system with transaction ID 10 and 20 being executed:

# Transaction 10
BEGIN;
UPDATE student SET name="李思" WHERE id=1;
UPDATE student SET name="王五" WHERE id=1;
#Transaction 20
BEGIN;
# Updated some records in other tables
...

At this moment, the version linked list obtained by the record whose id is 1 in the student table is as follows:

Assume that a transaction using the REPEATABLE READ isolation level is started:

# Use REPEATABLE READ isolation level transactions
BEGIN;
# SELECT1: Transaction 10 and 20 are not submitted
SELECT * FROM student WHERE id = 1; # The value of the column name obtained is 'Zhang San'

After that, we submit the transaction with transaction id as 10, like this:

# Transaction 10
BEGIN;
UPDATE student SET name="李思" WHERE id=1;
UPDATE student SET name="王五" WHERE id=1;
COMMIT;

Then go to the transaction where the transaction id is 20 and update the record where the id is 1 in the table student:

# Transaction 20
BEGIN;
# Updated some records in other tables
...
UPDATE student SET name="Qian Qi" WHERE id=1;
UPDATE student SET name="Song Ba" WHERE id=1;

At this moment, the version chain length of the record whose id is 1 in table student is as follows:

Then continue to search for the record with id as 1 in the transaction that just used the REPEATABLE READ isolation level, as follows:

# Use REPEATABLE READ isolation level transactions
BEGIN;
# SELECT1: Neither Transaction 10 nor 20 was submitted.
SELECT * FROM student WHERE id = 1; # The value of the column name obtained is 'Zhang San'
# SELECT2: Transaction 10 submitted, Transaction 20 not submitted
SELECT * FROM student WHERE id = 1; # The value of the obtained column name is still 'Zhang San'

5. Summary

Here is an introduction to how transactions with the two isolation levels of MVCC, READ COMMITTD and REPEATABLE READ, access the recorded version chain when performing snapshot read operations. process. This allows the read-write and write-read operations of different transactions to be executed concurrently, thereby improving system performance.

The core point lies in the principle of ReadView. A big difference between the two isolation levels of READ COMMITTD and REPEATABLE READ is the different timing of generating ReadView:

  • READ COMMITTD will generate a ReadView before each normal SELECT operation.
  • REPEATABLE READ Only generates a ReadView before the first normal SELECT operation, and reuses this ReadView for subsequent query operations.