Do you know Online DDL?

Table of Contents

What is Online DDL?

What problems does Online DDL solve?

Classification

Copy

Prepare

Execute

Commit

copy algorithm process

Inplace

Prepare

Execute

Commit

inplace algorithm process

Instant

Use third-party tools to implement Online DDL

gh-ost

pt-osc

Some additions

Summary

Online DDL syntax

What is Online DDL?

Online DDL (Online Data Definition Language) refers to performing data definition language (DDL) operations while the database is running, such as creating, modifying, or deleting table structures, indexes, etc., without causing the database to be locked for a long time or become unusable. Traditional DDL operations usually require exclusive locking of the affected table, which may prevent other sessions from reading and writing to the table, thus affecting the normal use of the database.

What problems does Online DDL solve?

Reduce system downtime: Traditional DDL operations usually require locking the affected tables, which may cause the database to be unable to provide external services during the DDL operation, causing system downtime. . Online DDL can perform table structure changes without interrupting normal services, thereby reducing system downtime and improving system availability.
Improving concurrency performance: Through online DDL, the database system can allow DDL operations to be executed concurrently with other transactions, reducing the locking time of the table and improving the performance of the database. Concurrency performance allows users to continue data operations without being affected.
Reduce business risks: Online DDL can avoid business interruptions and failures caused by long-term table locking. This is particularly important for businesses that have high requirements for database continuity and availability, and can reduce business risks.
Improving management efficiency: Online DDL allows database administrators to adjust the table structure without downtime, thereby improving management efficiency and reducing Impact on business.
Resolving lock wait: In traditional DDL operations, the database usually requires an exclusive lock on the affected table, which may cause other sessions to be unable to The table performs read and write operations, resulting in lock waiting. Online DDL can effectively solve the lock waiting problem that may be caused by traditional DDL operations.

In short, by solving the above problems, online DDL improves the availability, concurrency and management efficiency of the database system, enables the database to be structurally adjusted without interrupting services, and adapts to the modern business requirements for database continuity and stability.

Classification

There are three mainstream algorithms currently supported:

COPY – Before MySQL 5.6 and not Online, this algorithm was executed
INPLACE – Appeared in MySQL 5.6
INSTANT – Appeared in MySQL 8.0.12

At this time, there are doubts. So many algorithms are too big.

We know that it takes three stages to put something into the refrigerator, and during DDL operation, no matter what algorithm is executed, it will also go through three stages, the preparation stage [prepare], the execution stage [DDL], and the submission stage. 【commit】. The difference is that different processing is done in the three stages. Next, we will carefully analyze these three stages of different algorithms.

Copy

Copy algorithm refers to creating a new table, copying the data of the original table to the new table, then modifying the structure of the new table, and finally deleting the original table. The advantage of this algorithm is that the read and write operations of the original table will not be affected during the modification process, but it requires more space and time.
Applicable scenarios: Suitable for situations where a large number of structural modifications need to be made, or the size of the table is not particularly large.

The following are the three stages of the copy algorithm

Prepare

Add metadata shared lock to the table and read frm metadata (DDL cannot be executed concurrently at this time, but DML can be executed concurrently)
The shared lock is upgraded to an exclusive lock (at this time, neither DDL nor DML can be executed concurrently)
Create a temporary table through the create like statement in the server layer, and the engine layer also generates the corresponding ibd and frm files (there is no .frm file after 8.0)

Execute

Modify temporary table metadata (add columns)
Copy metadata to the temporary table [The most time-consuming, the data in the table needs to be copied row by row]
Delete original tables and files
Rename temporary tables and files

Commit

Commit transaction and release lock

copy algorithm process

Inplace

Inplace algorithm refers to making structural modifications directly on the original table without creating a new table. The advantage of this algorithm is that it does not require additional storage space, but may incur larger locks, blocking and performance overhead.
Applicable scenarios: Suitable for small-scale structural modifications to the table, or when the table is large but has enough free space.

For the “Inplace” algorithm, its core idea is to directly modify the structure on the original table without creating a new table. However, in an actual database system, even if the “Inplace” algorithm is used, temporary data structures may need to be created to complete modification operations. These temporary data structures are usually not completely independent new tables, but are used as auxiliary tables. Modify the data structure of the operation. Therefore, strictly speaking, even using the “Inplace” algorithm may involve the creation of temporary data structures.

Prepare

Add a metadata shared upgrade lock to the table and upgrade it to an exclusive lock (DML cannot be parallelized at this time)
Judgment using inplace algorithm
Determine whether the statement is “rebulid table” or “no-rebuild”
Apply for row log space to store DML operations generated during the DDL execution phase. (not required for no-rebuild) [in innodb_sort_buffer block]

So how does the inplace algorithm determine whether a table needs to be rebuilt?

The inplace algorithm usually considers the following factors when determining whether to rebuild a new table:

Size of the table: If the size of the table is small, performing “Inplace” modifications may be faster and not have a large impact on the overall performance of the database. Therefore, structural modifications to small tables usually do not require rebuilding.

Available space: The “Inplace” algorithm also takes into account the degree of fragmentation of the table and the amount of free space. If the table is severely fragmented, or there is enough continuous free space in the table, performing “Inplace” modifications may go smoothly without rebuilding the table.

Modification type: Some specific types of structural modifications may not be processed by the “Inplace” algorithm, such as those involving large changes to the physical structure of the table. Changes to the situation. In this case, the “Inplace” algorithm may determine that the modification needs to be completed by rebuilding the table.

Execute

Release the exclusive lock and retain the metadata shared upgrade lock (DML can be parallelized at this time)
Scan all data pages of the original table’s primary key and secondary index, generate a B + Tree, and store it in a temporary file; [Scanning in the engine is the most time-consuming】
Record all DML operations on the original table in the log file row log, and play back part of the row_log

Commit

Upgrade the metadata shared upgrade lock and generate an exclusive lock table; (DDL cannot be parallelized at this time)
Rework the contents in the row log; (not required for no-rebuild)
Rename the original table file, change the temporary file name to the original table file name, and delete the original table file
Submit the transaction and the change is completed

inplace algorithm process

Precautions:

The data generated during DDL will be written to the original table according to normal operations, recording redolog, undolog, and binlog, and synchronized to the slave database for execution, except that it will be additionally recorded in the row log. And the operation of writing row log itself will also record redolog

The row log is redone during the submission phase. The table will be locked at this stage. At this time, the data of the main database (new table space + row log) and the slave database (table space) are Consistently, after the main library DDL operation is completed and submitted, the DDL will be written to the binlog and passed to the slave library for execution. When the slave library executes the DDL, the DDL is still online for the slave library locally, that is, in the slave library. Directly writing data locally from the slave library will not block, and row logs will be generated just like the main library.

But when the main library synchronizes DML, it will be blocked at this time and is offline. DDL is exclusively locked and the same is true in the replication thread, so it will not only block the table, but It is all subsequent operations synchronized from the main library (mainly exclusive when the replication thread is parallel, and only he is executing at the same time). Therefore, DDL operations on large tables will cause synchronization delays.

It is worth thinking about that although CRUD cannot be performed in two of the three stages (preparation, execution, submission), in fact, the execution stage takes the longest time in the entire DDL, for example, 30 minutes For DDL, the preparation + submission phase only takes 1 minute, and the remaining 29 minutes are executed. Therefore, for the business layer, most of the time it can be accessed normally, so Oline DDL is achieved.

So where do we need to insert a large amount of data within this minute?

Then let’s take a look at the algorithm below.

Instant

Instant algorithm is a special algorithm. It only needs to modify the metadata in the data dictionary. There is no need to copy the data or rebuild the entire table. Similarly, there is no need to add exclusive MDL locks and the original table data does not need to be added. Affected. The entire DDL process is completed almost instantly and does not block DML.
Applicable scenarios: Suitable for situations where a very small range of structural modifications are made to the table, or minimal impact on database performance is required.

This new feature was introduced in 8.0.12 (contributed by Tencent DBA team). When performing a DDL operation, the ALGORITHM option does not need to be specified. At this time, MySQL automatically selects the appropriate mode in the order of INSTANT, INPLACE, and COPY. You can also specify ALGORITHM=DEFAULT, which has the same effect. If the ALGORITHM option is specified but is not supported, an error will be reported directly.

The new algorithm proposed in MySQL 8.0.12 currently only supports a few operations such as adding columns (it is not too mature yet, and enterprises generally use versions 5.6 and 5.7). Using the new table structure design of 8.0, the metadata data of the table can be directly modified. , eliminating the rebuild process and greatly shortening the execution time of DDL statements.

Use third-party tools to implement Online DDL

There are some third-party tools that can also implement DDL operations. The most common ones are percona’s pt-online-schema-change tool (abbreviated as pt-osc) and github’s gh-ost tool, both of which support MySQL 5.5 or above.

gh-ost

Reference article: MySQL best practices: detailed explanation of the use of gh-ost tool

pt-osc

Reference article: Guide to using PT-OSC online DDL change tool

Some additions

So does the DDL that supports the Inplace algorithm have to be online?

Conceptually, inplace and online are two different dimensions.
Copy and inplace refer to the execution algorithm inside DDL, which can be understood as: copy is an operation at the server layer, and inplace is an operation at the innodb layer.
In fact, for users, whether they care about online or not is usually related to one question: whether to allow concurrent DML
Using a classic philosophical term: DDL executed by the copy algorithm is not online, and DDL executed by the inplace algorithm is not necessarily online.

Summary

Commonly used DDL execution methods:

Please refer to the algorithm used by DDL statements, whether concurrent DML is allowed, whether tables need to be rebuilt, etc.

Online DDL syntax

ALTER TABLE tbl_name ADD COLUMN col_name col_type, ALGORITHM=INPLACE, LOCK=NONE;

Among them, ALGORITHM has the following options:

INPLACE: Replacement: Perform DDL operations directly on the original table.
COPY: Copy: Use a temporary table to clone a temporary table, execute DDL on the temporary table, and then import the data into Temporary tables, renaming etc. During this period, twice as much disk space is required to support such operations. During execution, DML operations are not allowed on the table.
INSTSNT:
DEFAULT: The default method is chosen by MySQL itself, and the INPLACE method is preferred.

LOCK has the following options:

SHARE: Shared lock, the table executing DDL can be read, but cannot be written.
NONE: There is no restriction, the table executing DDL can be read and written
EXCLUSIVE: Exclusive lock, the table executing DDL cannot be read or written.
DEFAULT: Default value, which is the default value used when the LOCK clause is not specified in the DDL statement. If the value of LOCK is specified as DEFAULT, it is left to the MySQL clause to determine whether to lock or not lock the table. Not recommended. If you are sure that your DDL statement will not lock the table, you can not specify lock or specify its value as default. Otherwise, it is recommended to specify its lock type.

When performing a DDL operation, the ALGORITHM option does not need to be specified. At this time, MySQL automatically selects the appropriate mode in the order of INSTANT, INPLACE, and COPY. You can also specify ALGORITHM=DEFAULT, which has the same effect. If the ALGORITHM option is specified but is not supported, an error will be reported directly.