MySQL: oversized single tables, master-slave replication, and synchronization mode principles

Article directory

  • Does MySQL performance degrade severely once a single table reaches 20 million rows?
    • Preface
    • InnoDB index data structure
    • B+ tree
  • ShardingSphere sub-database and sub-table
  • Sharding-JDBC
    • Sharding-JDBC related concepts
      • Logical table
      • Broadcast table
      • Binding table
    • Sharding strategy in Sharding-JDBC
      • Automatic sharding algorithm
      • Modulo sharding algorithm
      • Hash modulo sharding algorithm
      • Shard capacity range
      • Standard sharding algorithm
        • Row expression sharding algorithm
        • Time range sharding algorithm
  • MySQL master-slave mechanism principle
    • Preface
    • Master-slave replication principle
    • Basic principles
    • Reasons for master-slave delay
      • Random replay
      • High main library concurrency
      • Lock waiting
    • Handling master-slave delay
      • Parallel replication
      • Reduce main library concurrency
      • Read the main library
    • Summary
      • Master-slave replication principle
      • Reasons for master-slave delay
      • Handling master-slave delay
  • The pros and cons of MySQL's three synchronization modes
    • Asynchronous replication
    • Semi-synchronous replication
    • Fully synchronous replication

Does MySQL performance degrade severely once a single table exceeds 20 million rows?

Preface

There is a saying circulating in Chinese Internet technology circles: once a single MySQL table exceeds 20 million rows, performance drops significantly. The rumor reportedly originated at Baidu, roughly as follows: when DBAs there benchmarked MySQL, they found that SQL performance dropped sharply once a single table reached 20 million rows, and the conclusion was drawn from that test. Baidu engineers later moved to other companies in the industry and took this figure with them, and so the claim spread.

Later, Alibaba's "Java Development Manual" stated that splitting into multiple databases and tables is recommended only once a single table exceeds 5 million rows or 2 GB in size. Backed by Alibaba's authority, many people have since treated this as the standard when designing storage for large data volumes.

So is there any theoretical basis for these claims? Performance here is shaped by many factors, such as hardware and MySQL's own parameter settings, so today let's approach the question from the angle of MySQL's index structure.

InnoDB index data structure

A question first: **how many rows of data can a single B+ tree in InnoDB store?**

The short answer: about 20 million. Why that number? Because it can be calculated. To understand the calculation, let's start with InnoDB's index data structure and how it organizes data.

We all know that a computer has a minimum unit when storing data, just as circulating cash has a smallest denomination. On disk, the smallest unit for storing data is the sector, and a sector is 512 bytes.

The smallest unit of a file system (such as XFS/EXT4) is the block, and a block is 4 KB.

The InnoDB storage engine likewise has its own minimum storage unit, the page, and a page is 16 KB.

Even a file that is only 1 byte in size occupies a full 4 KB block on disk.

The size of any InnoDB data file (files with the .ibd suffix) is always an integer multiple of 16384 bytes (16 KB).

The data in the data table is stored in pages, so how many rows of data can be stored in one page?

Assuming the size of a row of data is 1k, then a page can store 16 rows of such data.

If the database only stored data like this, finding a row would be a problem: we would not know which page holds the data we are looking for, and scanning every page is far too slow. So the data is organized as a B+ tree, as shown in the figure below:

First, the data records are sorted by primary key and stored in different pages (for ease of understanding, each page here holds only 3 records; in practice a page holds many more). Besides the pages that store data, there are pages that store key values + pointers, such as the page with page number = 3 in the figure: it stores key values and pointers to data pages, and consists of N such key value + pointer entries, also kept sorted. This form of data organization is called an index-organized table. Now, how do we find a piece of data?

For example, select * from user where id=5;

Here id is the primary key, and we search through this B+ tree:

First find the root page. How do we know where the root page of the user table is? The root page position of every table is fixed in its tablespace file: it is the page with page number = 3 (we will verify this below). After locating the root page, a binary search shows that the row with id = 5 should be in the page pointed to by pointer P5, so we continue into the page with page number = 5 and, again by binary search, find the record with id = 5:

| 5 | zhao2 | 27 |

Now we know how the primary key index B + tree in InnoDB organizes and queries data. Let’s summarize:

1. The smallest storage unit of the InnoDB storage engine is the page. A page can store either data or key values + pointers: in a B+ tree, leaf nodes store data, and non-leaf nodes store key values + pointers.

2. An index-organized table uses binary search over the non-leaf nodes' keys and pointers to determine which page the data is in, then finds the required row inside that data page.
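The two-step lookup just summarized can be sketched in Python. This is a toy model (not InnoDB's actual on-disk format, and the names and rows are made up for illustration): non-leaf pages hold sorted keys plus child-page pointers, leaf pages hold rows sorted by primary key, and each level is searched with binary search.

```python
import bisect

# Toy index-organized table: a B+ tree of height 2.
# Leaf pages: rows sorted by primary key.
leaf_pages = {
    4: [(1, "li1", 21), (2, "wang1", 17), (3, "zhang1", 27)],
    5: [(4, "zhao1", 22), (5, "zhao2", 27), (6, "li2", 31)],
}
# Root (non-leaf) page: sorted keys + pointers; child i covers keys >= keys[i].
root = {"keys": [1, 4], "children": [4, 5]}

def lookup(pk):
    # 1st "IO": binary-search the root page for the right child pointer.
    i = bisect.bisect_right(root["keys"], pk) - 1
    page_no = root["children"][i]
    # 2nd "IO": binary-search the leaf page for the row itself.
    rows = leaf_pages[page_no]
    j = bisect.bisect_left([r[0] for r in rows], pk)
    if j < len(rows) and rows[j][0] == pk:
        return rows[j]
    return None

print(lookup(5))  # -> (5, 'zhao2', 27)
```

Each level descended costs one page read, which is why the tree's height directly determines the number of IOs per lookup.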

B+ tree

So back to the question we started with, how many rows of data can a B+ tree usually store?

Assume first that the height of the B+ tree is 2, i.e. one root node and several leaf nodes. The total number of records this B+ tree can store is:

The number of root node pointers * the number of rows recorded in a single leaf node

As stated above, the number of records in a single leaf node (page) = 16 KB / 1 KB = 16 (assuming each row is 1 KB; in practice, records in many Internet businesses are indeed around 1 KB).

So how many pointers can a non-leaf node hold?

This is easy to calculate. Assume the primary key is of type bigint, which is 8 bytes long; the pointer size is set to 6 bytes in the InnoDB source code, giving 14 bytes per entry. The number of such units a 16 KB page can hold is the number of pointers: (16 × 1024) / 14 ≈ 1170.

Then it can be calculated that a B + tree with a height of 2 can store 1170*16=18720 such data records.

By the same reasoning, a B+ tree of height 3 can store 1170 × 1170 × 16 = 21,902,400 records.

Therefore, the B+ tree height in InnoDB is generally 1-3 layers, which can satisfy tens of millions of data storage. When searching for data, one page search represents one IO, so querying through the primary key index usually only requires 1-3 IO operations to find the data.
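The back-of-the-envelope calculation above can be expressed in a few lines of Python, using the same assumptions as the text (1 KB average row size, bigint key + 6-byte pointer):

```python
PAGE_SIZE = 16 * 1024   # InnoDB page: 16 KB
ROW_SIZE = 1024         # assumed average row size: 1 KB
KEY_PTR = 8 + 6         # bigint key (8 bytes) + page pointer (6 bytes)

rows_per_leaf = PAGE_SIZE // ROW_SIZE       # rows in one leaf page
fanout = PAGE_SIZE // KEY_PTR               # pointers per non-leaf page

height2 = fanout * rows_per_leaf            # capacity of a height-2 tree
height3 = fanout * fanout * rows_per_leaf   # capacity of a height-3 tree
print(rows_per_leaf, fanout, height2, height3)  # 16 1170 18720 21902400
```

Change ROW_SIZE to see how the "20 million" figure shifts with wider or narrower rows; the number is a property of these assumptions, not a hard MySQL limit.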

How to get the height of InnoDB primary key index B + tree?

Above we inferred that the height of a B+ tree is usually 1-3. Let's verify this conclusion from another angle. In an InnoDB tablespace file, page number 3 is by convention the root page of the primary key index, and the page level of the B+ tree is stored at offset 64 of that root page. If the page level is 1, the tree height is 2; if the page level is 2, the tree height is 3. That is, B+ tree height = page level + 1. Below we find this page level in a real environment.

The lineitem table has more than 6 million rows and a B+ tree height of 3. The customer table has only 150,000 rows, yet its B+ tree height is also 3. Despite the large difference in data volume, both trees are 3 levels high, so index lookups on the two tables cost about the same: 3 IOs each. By the same logic, a table with 10 million rows would still have a B+ tree of height 3, and its query efficiency would not differ much either.

The region table has only 5 rows of data, and of course its B+ tree height is 1.

ShardingSphere sub-database and sub-table

Sharding-JDBC

Sharding-JDBC is a commonly used component positioned as an enhanced JDBC driver. Simply put, it performs the routing and sharding work of sub-database and sub-table on the application side, and it is the component we focus on at this stage.

Once the Sharding-JDBC dependency is added to a project, business code that operates the database goes through Sharding-JDBC to reach it. That is, the core actions of sub-database and sub-table, such as SQL parsing, routing, execution, and result merging, are all done by Sharding-JDBC, and it works on the client side.

Sharding-JDBC related concepts

Earlier we demonstrated Sharding-JDBC's sub-database and sub-table features in two ways. At this level, Sharding-JDBC essentially enhances the JDBC driver so that developers can implement sub-database and sub-table through simple configuration.

In Sharding-JDBC there are some table concepts worth knowing: logical tables, real tables, shard keys, data nodes, dynamic tables, broadcast tables, and binding tables.

Logical table

A logical table can be understood like a view in a database: it is a virtual table. It can map to one physical table or be composed of multiple physical tables, and those physical tables can come from different data sources. For MySQL, HBase, and ES to form one logical table, they only need a key with the same meaning: the primary key in MySQL, the value used to generate the rowkey in HBase, and the key in ES.

The sharding rule configuration earlier defined the logical table t_order. When we operate on t_order, it is mapped to the actual physical tables according to the sharding rules, as shown in Figure 7-9; the logical table name is replaced with the real table names during SQL parsing and routing.

spring.shardingsphere.rules.sharding.tables.t_order.actual-data-nodes=ds-$->{0..1}.t_order_$->{0..1}

Broadcast table

A broadcast table, also called a global table, is stored redundantly in every database to avoid cross-database query problems.

For example, basic data such as province lists and dictionaries can be synchronized to every database node, so that after sharding, queries that join against this basic data do not have to cross databases. Such a table is called a broadcast table, as shown in Figure 7-10.

Binding table

Some tables have a logical primary/foreign key relationship, for example the order table order_info, which stores the total quantity and amount of the order, and the order detail table order_detail, which stores the price and quantity of each item. This is a parent-child relationship, and join queries between the two are common. If the parent table's data and the child table's data live in different databases, cross-database joins become troublesome. So can we place a parent row and all of its child rows on the same node?

For example, the row with order_id=1001 is on node1, and all of its detail rows are also placed on node1; the row with order_id=1002 is on node2, along with all of its details. This way, joins still happen within a single database, as shown in Figure 7-11.

Sharding strategy in Sharding-JDBC

Sharding-JDBC has many commonly used sharding strategies built in. These algorithms mainly target two dimensions:

Data source sharding
Data table sharding

Sharding-JDBC’s sharding strategy includes sharding keys and sharding algorithms;

The sharding key is the database field used for sharding, i.e. the key field for horizontally splitting a database (table). Example: if the order table is sharded by the last digit of the order primary key modulo some number, the order primary key is the sharding field. If a SQL statement contains no sharding field, it is routed to all shards, and performance suffers. Besides a single sharding field, ShardingSphere also supports sharding by multiple fields.

The sharding algorithm is the calculation rule used to implement sharding.

Sharding-JDBC provides a variety of built-in sharding algorithms, including four types:

Automatic sharding algorithm
Standard Sharding Algorithm
Composite Sharding Algorithm
Hint sharding algorithm

Automatic sharding algorithm

An automatic sharding algorithm distributes data automatically according to the algorithm expression we configure. Sharding-JDBC provides five automatic sharding algorithms:

Modulo sharding algorithm
Hash modulo sharding algorithm
Range sharding algorithm based on shard capacity
Range sharding algorithm based on shard boundaries
Automatic time period slicing algorithm

Modulo sharding algorithm

The most basic algorithm: it takes the sharding field's value modulo sharding-count to obtain the result.

Hash modulo sharding algorithm

It is the same as the modulo algorithm. The only difference is that the modulus is taken after obtaining the hash value for the shard key.
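Both algorithms reduce to the same routing step. A minimal sketch (the table name, shard count, and the use of crc32 are illustrative; Sharding-JDBC computes its hash internally in Java):

```python
import zlib

SHARDING_COUNT = 4  # illustrative shard count

def mod_shard(order_id: int) -> str:
    # MOD: take the shard key's value modulo the shard count
    return f"t_order_{order_id % SHARDING_COUNT}"

def hash_mod_shard(user_name: str) -> str:
    # HASH_MOD: hash first, so non-numeric shard keys can be sharded too
    # (crc32 stands in here for a stable hash function)
    return f"t_order_{zlib.crc32(user_name.encode()) % SHARDING_COUNT}"

print(mod_shard(1001))   # -> t_order_1
print(mod_shard(1002))   # -> t_order_2
```

Note that plain modulo routing makes later resharding painful, since changing SHARDING_COUNT remaps almost every row.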

Shard capacity range

Shard capacity range simply means sharding according to the numeric range of a certain field.

Standard Sharding Algorithm

The standard sharding strategy (StandardShardingStrategy) supports database and table sharding based on a single sharding key (field) only. Sharding-JDBC provides two algorithm implementations:

Row expression sharding algorithm

Type: INLINE
Using Groovy expressions, it supports sharding for = and IN in SQL statements, with a single sharding key only. For simple sharding algorithms it can be used via simple configuration, avoiding tedious Java development. For example, t_user_$->{u_id % 8} means the t_user table is split into 8 tables by u_id modulo 8, named t_user_0 to t_user_7.
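As a sketch of how that expression might be wired up (property names as in the ShardingSphere 5.x Spring Boot starter; the data source and algorithm names are illustrative, so verify against the documentation for your version):

```properties
# Hypothetical sketch; check property names against your ShardingSphere version
spring.shardingsphere.rules.sharding.tables.t_user.actual-data-nodes=ds.t_user_$->{0..7}
spring.shardingsphere.rules.sharding.tables.t_user.table-strategy.standard.sharding-column=u_id
spring.shardingsphere.rules.sharding.tables.t_user.table-strategy.standard.sharding-algorithm-name=t-user-inline
spring.shardingsphere.rules.sharding.sharding-algorithms.t-user-inline.type=INLINE
spring.shardingsphere.rules.sharding.sharding-algorithms.t-user-inline.props.algorithm-expression=t_user_$->{u_id % 8}
```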

Time range sharding algorithm

It is similar to the automatic time period sharding algorithm of the previous automatic sharding algorithm.

Type: INTERVAL

Configurable properties:

MySQL master-slave mechanism principle

Preface

As traffic grows, a single database can no longer keep up. Hence the master-slave architecture that separates reads and writes: the master library handles writes, and the slave libraries handle reads.

In a production environment, there are many common master-slave architectures. Here we will introduce you to several common architecture patterns.


Master-slave replication principle

After understanding the basic structure and related configuration of the master-slave, we will get to the point.

In a master-slave setup, the usual practice is that the master library handles writes and the slave libraries handle reads. The benefit is that read and write pressure is spread out, so not all requests hit the master library; at the same time, horizontally scaling out slave libraries greatly improves the system's scalability and load capacity.

But here comes the problem: data read from a slave library must be consistent with the master library, so data written to the master needs to be synchronized to the slaves. How do the master and slave libraries stay consistent, and how does the master synchronize data to the slaves in real time?

Basic principles

There are two very important log files in MySQL master-slave replication:

binlog (binary log file)
relay log (relay log file)

During master-slave synchronization, the master library records all operation events in its binlog. The slave library starts an I/O thread to keep communicating with the master, and checks at intervals whether the binlog file has changed.

If the binlog has changed, the master library spawns a binlog dump thread that sends the binlog to the slave library's I/O thread, which copies it into the slave's own relay log. Finally, the SQL thread on the slave library reads the events from the relay log and replays them against the slave's data.
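The pipeline just described can be modeled in a few lines. This is a toy simulation, not the real protocol: the actual IO and SQL threads run concurrently and speak MySQL's replication protocol, but the data flow (binlog, relay log, replay in order) is the same.

```python
from queue import Queue

# Toy model: master binlog -> (dump thread / slave IO thread) -> relay log -> SQL thread
master_binlog = [
    "INSERT INTO user VALUES (1, 'li1')",
    "UPDATE user SET name='wang1' WHERE id=1",
    "INSERT INTO user VALUES (2, 'zhao1')",
]
relay_log = Queue()
slave_replayed = []

# IO thread: copy binlog events from the master into the slave's relay log
for event in master_binlog:
    relay_log.put(event)

# SQL thread: read events from the relay log and replay them, strictly in order
while not relay_log.empty():
    slave_replayed.append(relay_log.get())

assert slave_replayed == master_binlog  # the slave has caught up with the master
```

Master-slave delay, discussed next, is exactly the moment when the relay log has events the SQL thread has not replayed yet.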

Reasons for master-slave delay

From the process above we know how master-slave replication works: when the master library is updated, the slave library synchronizes. So why does master-slave delay still occur?

Random replay

Writing the binlog on the MySQL master library is sequential, and as mentioned before, sequential disk I/O is very fast. Likewise, the slave library's I/O thread appends to the relay log very efficiently. But don't forget the SQL thread that replays the data: replaying involves random disk writes. So at some moment the relay log may contain events that have not yet been replayed into the slave library, and master-slave delay appears.

High main library concurrency

Knowing how the slave library's SQL thread replays events, it is not hard to see how high concurrency on the master library causes master-slave delay: when a flood of write requests hits the master, the binlog grows continuously, the slave's SQL thread cannot keep up, and delay naturally appears.

Lock waiting

The SQL thread is single-threaded: when it is blocked, it waits until the current operation succeeds before continuing. If the slave library is stuck in a lock wait caused by some query, subsequent operations must wait for the current one to finish, and again master-slave delay occurs.

Handling master-slave delay

Now that we know the cause of the master-slave delay, let’s see how to deal with it.

Parallel replication

Since single-threaded SQL replay limits speed, can replay be done with multiple threads? Starting with MySQL 5.6, parallel replication is available: the SQL thread's work is handed to multiple worker threads for replay, which alleviates the master-slave delay problem.
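Parallel replication is configured on the slave library. A sketch of the relevant options (the worker count is illustrative, and the available parallelism types depend on the version, so verify against the manual for your MySQL release):

```ini
# my.cnf on the slave library -- illustrative values
slave_parallel_workers = 4              # number of applier worker threads (0 = classic single SQL thread)
slave_parallel_type    = LOGICAL_CLOCK  # MySQL 5.7+; 5.6 only supports per-database parallelism
```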

Reduce main library concurrency

You might say: the database version I use is older and cannot be upgraded, so what can I do? When the master library's concurrency is high, the only option is to control that concurrency to reduce the delay; a cache such as Redis is often used for this.

Read the main library

This situation should be familiar: for data with strict real-time requirements, you cannot read from the slave library at all; read the master library directly. If such data lags by half a day, your year-end bonus may be on the line.

Summary

Master-slave replication principle

There are two very important log files in master-slave replication, the binlog and the relay log, located on the master library and slave library respectively. The binlog is the basis of master-slave replication: operation events are written to the binlog and transmitted via the I/O thread to the slave library for synchronization.

Reasons for master-slave delay

The replay process of the SQL thread in the slave library is random disk writing, and the SQL thread is single-threaded. Therefore, if the data is not replayed in time, it will cause master-slave delay.
High concurrency in the main library will cause write operations to be continuously written to the binlog, which may be overwhelming for the SQL thread and cause master-slave delay.
If lock waiting is encountered during replay, it is also one of the causes of delay.

Handling master-slave delay

MySQL 5.6 and later can use parallel replication to address the master-slave delay caused by the single SQL thread. On lower versions, the delay can be mitigated by reducing the master library's concurrency. If data has strict real-time requirements, read the master library directly.

The pros and cons of MySQL’s three synchronization modes

Asynchronous replication


Asynchronous replication is MySQL's default synchronization method. After the master grants a slave an account, password, and IP authorization, the slave can synchronize data from the master, relying mainly on the master's binlog.

The slave will start two threads, IO Thread and SQL Thread:

IO Thread is responsible for pulling the binlog log from the master and writing it to the relay relay log.
SQL Thread is responsible for replaying the changes in the relay relay log and updating the data to maintain data consistency with the master.
During this process the slave pulls the binlog via its IO thread; the master does not need to care whether any slave needs to synchronize, it just does its own work. The entire replication process completes asynchronously; this is asynchronous replication.

The advantage of asynchronous replication is good performance; the disadvantage is poor data safety.

At a certain moment, the data difference between the master and the slave may be large. If the master fails and the slave takes over, some data may be lost.

Semi-synchronous replication


After the master writes an update to the binlog, it actively notifies the slaves. A slave receives the events, writes them to its relay log, and responds with an ack. As soon as the master receives at least one ack, it commits the transaction.

Compared with asynchronous replication, semi-synchronous replication requires at least one slave to have written the binlog into its relay log before the commit. Performance drops, but it guarantees that at least one slave (slave library) is consistent with the master (master library), improving data safety.

For scenarios with high data-consistency requirements, semi-synchronous replication can be used, for example when the slave library that will take over after a master failure must be highly consistent with the master.

The advantage of semi-synchronous replication is better data safety; the disadvantage is performance slightly lower than asynchronous replication.
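Semi-synchronous replication is enabled through a plugin on both sides. A sketch of the standard commands (plugin file names are for Linux builds, .dll on Windows; the timeout value is illustrative):

```sql
-- On the master
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
-- If no slave acks within the timeout (ms), the master falls back to async
SET GLOBAL rpl_semi_sync_master_timeout = 1000;

-- On each slave
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
```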

Fully synchronous replication

The difference between fully synchronous replication and semi-synchronous replication is that fully synchronous replication must receive acks from all slaves (slave databases) before committing the transaction.

The master's (master library's) transaction commit depends on all of its slaves (slave libraries), so performance drops significantly. Unless the scenario demands extremely high data consistency across all slaves, this strategy is generally not used.

Fully synchronous replication has the best data consistency, but the performance is also the worst.