Understanding MySQL logs redo, undo, binlog

Understanding MySQL log redo, undo, binlog

1. Problems solved by MySQL log files
2. redo log
- 2.1. Composition of redo log
- 2.2. Redo log flushing strategy
- 2.3. What problems does MySQL’s redo log solve?
3. undo log
- 3.1. Function of undo log
- 3.2. Types of undo log
- 3.3. Life cycle of undo log
- 3.4. Several hidden fields related to transaction rollback
4. binlog log
- 4.1. About binlog log
- 4.2. Three formats of binlog
- 4.3. Binlog flushing process
- 4.4. InnoDB two-stage submission
- 4.5. The difference between redo log and binlog
5. Extension: MySQL log type

1. Problems solved by MySQL log files

Transactions have 4 properties (CAID): atomicity, consistency, isolation, and durability.

Transaction isolation is achieved by the lock mechanism.
The atomicity, consistency, and durability of transactions are guaranteed by the redo log and undo log of the transaction.
- redo log is called redo log, which provides rewrite operations and restores page operations modified by submitted transactions. is used to ensure the durability of transactions.
- undo log is called rollback log. Rollback rows are recorded to a specific version. is used to ensure the atomicity and consistency of transactions code>.

Several logs about MySQL:

redo log: It is a log generated by the storage engine layer (InnoDB), which records pages on the physical level Modification operations, such as page number xxx and offset yyy written to ‘ddd’ data, are mainly to ensure the reliability of the data.
undo log: It is the log generated by the storage engine layer (InnoDB), which records the logical operation log, ensuring atomicity of the transaction. For example, when an insert operation is performed on a certain piece of data, undo log will record an opposite delete operation. Mainly used for transaction rollback (undo log records the reverse operation of each modification operation) and consistent non-locking read (undo log rollback row records to a certain A specific version—–MVCC, that is, multi-version concurrency control).
bin log: It is generated by the database layer.

2. redo log

2.1. Composition of redo log

? The InnoDB storage engine manages storage space in units of pages. Before actually accessing the page, you need to cache the page on the disk to the Buffer Pool in the memory before accessing it. All changes must update the data in the buffer pool first, then the dirty pages in the buffer pool will be flushed to disk at a certain frequency (checkPoint mechanism), through the buffer Pool to optimize the gap between CPU and disk, so that the overall performance does not drop too fast.

Regarding the composition of the redo log: The writing of the redo log is not directly written to the disk. The InnoDB engine will first write the redo log buffer when writing the redo log, and then certain frequency (according to the disk flushing strategy)Flash into the real redo log file.

The redo log buffer (redo log buffer) is stored in memory and is easily lost.

When the server starts, it is like the operating system has applied for a large piece of continuous memory space called redo log buffer (redo log buffer), which is divided into several consecutive redo log blocks.
Redo log file (redo log file), saved on disk, is persistent.

redo log stores the table space ID, page number, offset and the value that needs to be updated. The storage space required is very small and disk flushing is fast (reducing the frequency of disk flushing). Its characteristics are:

The redo log is written to disk sequentially. During the execution of a transaction, each time a statement is executed, multiple redo logs may be generated, and these redo logs are written to the disk in the order in which they are generated. That is, uses sequential IO, which is faster than random IO code>.
During transaction execution, the redo log is continuously recorded. Suppose a transaction requires inserting 50,000 pieces of data. During this process, it is continuously recorded sequentially in the redo log. The bin log will not be written to the bin log file until the transaction is committed.

2.2, redo log flush strategy

The process of flashing redo log buffer to redo log file does not actually flash to the disk, but only flashes to **File system cache (page cache)**, and the actual writing will be handed over to the system. You decide yourself (for example, when the page cache is large enough).

Therefore, there is a problem with the storage engine InnoDB. If it is left to the system for synchronization, if the system goes down, the data will be lost. For this reason, InnoDB provides the innodb_flush_log_at_trx_commit parameter, which controls how the logs in the redo log buffer are flushed to the redo log file when a transaction is committed.

innodb_flush_log_at_trx_commit Parameter description:

Parameter value	Parameter value description
0	No disk flush operation is performed every time a transaction is submitted. The system defaults to the master thread synchronizing the redo log every 1 second.
1	Every time a transaction is submitted, synchronization and disk flushing operations will be performed (default value).
2	Every time a transaction is committed, only the redo log buffer content is written to the page cache, but no synchronization is performed. The OS decides by itself when to synchronize to disk files.

Note: In addition,The InnoDB engine has a background thread that writes the contents of the redo log buffer to the file system cache (page cache) every 1 second, and then calls the flush Disk operation. Because during transaction execution, redo log records will be written into the redo log buffer, and these redo log records may be flushed by background threads, so a redo log that has not committed a transaction Record, and maybe flush the disk.

When the value is 1, as long as the transaction is submitted successfully, the redo log record will be on the hard disk and no data will be lost. If MySQL hangs up or crashes during the execution of the transaction, this part of the log will be lost, but the transaction will not be submitted, so there will be no loss if the log is lost. D in ACID can be guaranteed, data will never be lost code>, but efficiency is the worst. It is recommended to use the default value. Although the probability of operating system downtime is theoretically smaller than the probability of database downtime, generally since transactions are used, data security is relatively more important.
When the value is 0, the fsnc operation of the redo log is performed every second in the master thread, so the instance crash will lose transactions within 1 second at most (the master thread is responsible for asynchronously refreshing the data in the buffer pool to the disk to ensure that the data consistency). If the value is 0, its IO efficiency is theoretically higher than the value of 1 and lower than the value of 2. This strategy also has the risk of losing data and cannot be guaranteed D.

2.3. What problems does MySQL’s redo log solve?

Transaction persistence: redo log ensures that after the transaction is committed, even if the database fails or crashes, the changes in the transaction can be recovered and persisted. Through redo log, MySQL can re-execute changes that have not been written to the data file after restarting, ensuring data consistency and durability.
Data recovery: When the database crashes or fails, redo log can be used as an important means to recover data. Through records in the redo log, MySQL can roll back uncommitted transactions and recover transactions that have been committed but not written to the data file to ensure the consistency of the database.
Reduce disk I/O operations: The existence of redo log can reduce disk I/O operations. Compared with writing directly to the data file every time a transaction is committed, MySQL can first write the transaction changes to the redo log and flush the disk, and then asynchronously apply these changes to the data file at the appropriate time. This can reduce the number of disk I/O and improve database performance.

All in all, MySQL’s redo log solves issues such as transaction durability, data recovery, and performance optimization, ensuring that the database can maintain data consistency and recoverability in the event of a failure or crash, while improving the overall performance of the database.

3. undo log

3.1, undo log function

① Rollback data
- The undo log is a logical log, which logically restores the database to its original state, with all modifications logically undone, but the data structure and pages themselves may be very different after the rollback.
  
  This is because in a multi-user concurrent system, there may be hundreds or thousands of concurrent transactions. The main task of a database is to coordinate concurrent access to data records. For example, a transaction is modifying certain records in the current page, and at the same time, other transactions are modifying several records in the same page. Therefore, in order to ensure that the ongoing work of other transactions is not affected, a page cannot be returned. Scroll to the beginning of the transaction.
② MVCC
- The implementation of MVCC in the InnoDB storage engine is completed through undo log. When a user reads a row of records, if the record is already occupied by other transactions, the current transaction can read the previous row version information through undo to achieve non-locking reading.

The generation of undo log will be accompanied by the generation of redo log, because undo log also needs persistence protection.

3.2, Type of undo log

In the InnoDB storage engine, undo log is divided into:

① insert undo log
- insert undo log refers to the undo log generated during the insert operation. Because the record of the insert operation is only visible to the transaction itself and not to other transactions (this is a requirement for transaction isolation), the undo log can be deleted directly after the transaction is submitted without the need for a purge operation.
② pdate undo log
- The update undo log records the undo log generated by delete and update operations. The undo log may need to provide an MVCC mechanism, so it cannot be deleted when the transaction is committed. When submitting, put it into the undo log linked list and wait for the purge thread to perform the final deletion.
  
  The purge thread has two main functions: cleaning up the undo page and clearing the data rows with the Delete_Bit flag in the page. In InnoDB, the delete operation in a transaction is not actually deleting the data row, but a Delete Mark operation that marks the Delete_Bit on the record without deleting the record. The real deletion is completed by the background purge thread.

3.3, life cycle of undo log

To give an example of the life cycle of undo log, suppose there are two values A = a1, B =b1, and then A is modified to a2 and B is modified to b2.

1. start transaction;
2. Record A=a1 to undo log;
3. update A=a2;
4. Record A=a2 to redo log;
5. Record B=b1 to undo log;
6. update B=b2;
7. Record B=b2 to redo log;
8. Flush redo log to disk;
9. commit;

If the system crashes in any step 1-8 and the transaction is not committed, the transaction will not have any impact on the data on the disk.
If it goes down between 8 and 9, you can choose to roll back after recovery, or you can choose to continue to complete the transaction submission, because the redo log has been persisted at this time.
If the system crashes after step 9 and the changed data in the memory map has not been flushed back to the disk, then after the system recovers, the data can be flushed to the disk according to the redo log.

3.4. Several hidden fields related to transaction rollback

For the InnoDB engine, in addition to the data of the record itself, each row record also has several hidden columns.

DB_ROW_ID: If the primary key is not explicitly defined for the table, and there is no unique index defined in the table, InnoDB will automatically add a hidden column of row_id to the table as the primary key.
DB_TRX_ID: Each transaction will be assigned a transaction ID. When a change is made to a record, the transaction ID of this transaction will be written into trx_id.
DB_ROLL_PTR: Rollback pointer, essentially a pointer to undo log.

4. binlog log

4.1, About binlog log

Binary log (bin log) records all statements of database update events such as DDL and DML executed by the database, but does not include statements that do not modify any data (such as query statements select, show, etc.), which are recorded in the form of events and saved in binary file. Main application scenarios of binlog:

① Used for data recovery. If the MySQL database stops unexpectedly, you can use the binary log file to check what operations the user has performed and what modifications have been made to the database server files, and then based on records in the binary log file to recover the database server.
② Used for data replication. Due to the continuity and timeliness of the log, the master passes its binary log to the slaves to achieve master-slave data consistency.

MySQL’s data backup, primary and backup, primary strong>, Master-slave are inseparable from binlog, and need to rely on binlog to synchronize data and ensure data consistency.

If you want to log all statements (for example, to identify problematic queries), you need to use the general query log.

Note: It is best not to place the database file on the same disk as the log file. When the disk where the database file is located fails, the log file can be used to recover data.

-------------------------------------------------- -----------------------
# View binlog default configuration
show variables like '%log_bin%';
-------------------------------------------------- -------------------
# View the current binary file list and size
show binary logs;
-------------------------------------------------- -------------------
# View binlog format
show variables like 'binlog_format';
-------------------------------------------------- -------------------
# Syntax for restoring data
mysqlbinlog [option] filename|mysql -user -ppass
# filename: log file name
# option: optional, --start-date, --stop-date and --start-position, --stop-position
## --start-date, --stop-date: You can specify the starting time point and ending time point of restoring the database.
## --start-position, --stop-position: You can specify the starting position and ending position of the recovered data.
### Note that when using mysqlbinlog to restore, the file with the smaller number must be restored first.
#Usage examples:
/usr/bin/mysqlbinlog --start-position=120 --stop-position=135 --database=zim-data /var/my_binlog/zim-bin.0001 | /usr/bin/mysql -uroot -p123455 -v zim-data
-------------------------------------------------- -------------------
# Modify the my.cnf or my.ini file of MySQL to set the relevant parameters of the binary log (binlog), which will not be discussed here.

4.2, three formats of binlog

There are three formats of binlog (**Before MySQL 5.7.7, the default binlog format is Statement, and in MySQL 5.7.7 and later versions, the default binlog format is Row **):

① Statement: Every SQL statement that modifies data will be recorded in the binlog.

Advantages: No need to record changes in each row, reducing the amount of binlog logs, saving IO, and improving performance.

Disadvantages: In some cases, master-slave data may be inconsistent, such as executing system() or now(), etc.

② Row: Does not record sql statement context-related information, only saves which record was modified.

Advantages: Row level log content will clearly record the details of each row of data modification, and there will be no problem that stored procedures, functions, and trigger calls and triggers cannot be copied correctly under certain circumstances.

Disadvantages: A large amount of logs will be generated, especially when altering table, the logs will skyrocket.

③ Mixed: The combination of Statement and Row. STATEMENT is used for ordinary replication, and ROW is used for replication in special cases.

Binary log (binlog) can complete the database without loss through the full backup of the database and the incremental information saved in the binary log. Recovery, however, if you encounter a scenario with a large amount of data, many databases and data tables (such as the application of sharding databases and sharding tables), it is very challenging to use binary logs for data recovery because the starting and ending positions are different. Easy to manage. In this case, you can configure the master-slave database server, or even a one master multiple slaves architecture, and synchronize the contents of the binary log file to the slave database through the relay log. In the server, problems such as data anomalies caused by database failures can be effectively avoided.

4.3. Binlog flushing process

During transaction execution, the log is first written to the binlog cache. When the transaction is submitted, the binlog cache is written to the binlog file. Because the binlog of a transaction cannot be split, no matter how big the transaction is, it must be written once, so the system will allocate a block of memory to each thread as the binlog cache.

You can control the binlog cache size of a single thread through the binlog_cache_size parameter. If the storage content exceeds this parameter, it will be temporarily transferred to the disk (Swap). The binlog log flushing process is as follows:

The write in the binlog log flushing flow chart is writing the log to the page cache of the file system, and does not persist the data to the disk, so the speed is relatively fast.

The fsync in the binlog log flushing flow chart is the operation of persisting data to disk.

The timing of write and fsync is controlled by the parameter sync_binlog. The sync_binlog parameter controls when the binary log (binlog) is written to disk. The specific instructions are as follows:

sync_binlog=0: When set to 0, MySQL will not synchronize binlog to disk. Instead, it relies on the operating system to flush the binlog to disk. This setup provides the best performance, but in the event of a power failure or operating system crash, the server may lose some transactions. This value is the default.
sync_binlog=1: MySQL will write the binlog to disk synchronously every time a transaction is submitted. This is the safest setting, but may have a performance impact due to frequent disk IO operations.
sync_binlog=N (N>1): After every N transactions are submitted, MySQL synchronizes the binlog to disk. This setup is a compromise between performance and security.

This parameter is very important to ensure data durability and consistency, especially in high-concurrency and write-intensive scenarios. Considering data security, it is recommended to set sync_binlog to 1 in many scenarios.

4.4, InnoDB two-stage submission

Since the binlog is written by the server layer executor operation and the redo log is written by the storage engine layer, there may be inconsistencies between the two logs. In order to solve the logical consistency problem between redo log and binlog, the InnoDB storage engine uses the two-phase commit scheme. The principle of the two-phase commit solution: split the writing of redo log into two steps: prepare and commit. This is the two-phase commit.

After adopting the two-stage submission plan, the abnormal resolution process is encountered:

① An exception occurs when writing to binlog. When MySQL restores data based on the redo log, it finds that the redo log is still in the prepare stage and there is no corresponding binlog, so the transaction will be rolled back.
② When a MySQL server exception occurs during the redo log setting commit phase, the transaction will not be rolled back when the MySQL service is restarted. Although the redo log is in the prepare phase at this time, the corresponding binlog log can be found through the transaction ID, and MySQL will consider it complete. The transaction will be committed to restore the data.

4.5. The difference between redo log and binlog

Log type and storage engine: Binlog is the log of the MySQL Server layer and is common to all storage engines. And redo log is a log unique to the InnoDB storage engine.
Log content: Binlog is a logical log that records all data change information. The redo log is a physical log, which records the specific modifications of each data page.
Function: Binlog is mainly used for master-slave replication and data recovery. Data synchronization is achieved between master and slave nodes by copying and applying binlog. The redo log is mainly used to ensure the atomicity and durability of transactions, that is, it is used to recover data when a database failure occurs and ensure that the modifications of submitted transactions can be Persistence to database.

5. Extension: MySQL log type

MySQL log type:

Slow query log: records all queries whose execution time exceeds long_query_time, so that we can optimize the query.
General query log: records the start time and termination time of all connections, as well as all instructions sent by the connection to the database server, which is very helpful for actual recovery operation scenarios, problem discovery, and even auditing of database operations.
Error log: records problems that occur when starting, running or stopping the MySQL service, so that we can understand the status of the server and maintain the server.
Binary log: records all statements that change data, which can be used for data synchronization between master and slave servers, as well as data loss-free recovery when the server encounters a failure.
Relay log: used in the master-slave server architecture, an intermediate file used by the slave server to store the contents of the master server’s binary log. The slave server synchronizes the operations on the master server by reading the contents of the relay log. (Available starting from MySQL8.0)
Data definition statement log: records metadata operations performed by data definition statements. (Available starting from MySQL8.0)

Except for binary logs, other logs are text files. By default, all logs are created in the MySQL data directory. Not all logging functions need to be used in actual work. The logging functions will reduce the performance of the MySQL database and also occupy a large amount of disk space.

Series of articles:

1: “Understanding the architecture and execution process of MySql”
Two: “Understanding the index from the data structure of the InnoDB index”
Three: “Look at index structure selection from the comparison of Hash index, binary tree, B-Tree and B + Tree”
Four: “MySQL Index Classification and Design Principles”
Five: “MySQL Optimization Ideas”
Six: “Understanding MySQL log redo, undo, binlog”
Seven: “Problems that may occur at different isolation levels under concurrent transactions”
Eight: “Multi-dimensional sorting out MySQL locks”
Nine: “MySQL Multi-version Concurrency Control MVCC”