The key to efficient data access: how MySQL primary key auto-increment works

Article directory

- The concept of primary key auto-increment
- Data types for primary key auto-increment
- The step size of primary key auto-increment
- Performance optimization of primary key auto-increment
  - Why is the performance optimization of primary key auto-increment needed?
  - Performance optimization solutions for primary key auto-increment
    - 1. Adjust the primary key auto-increment step size
    - 2. Use multiple auto-increment columns
  - How to choose the performance optimization solution for primary key auto-increment?
- Application and mechanism of auto-increment in the database
- Performance issues with AUTO-INC Locking
- Optimizations in the InnoDB storage engine
- Disadvantages of auto-increment

I am Liao Zhiwei: a Java development engineer, high-quality creator in the Java field, CSDN blog expert, 51CTO expert blogger, Alibaba Cloud expert blogger, Tsinghua University Press contracted author, product soft-article creator, technical article reviewer, questionnaire designer, personal community founder, and open source project contributor. I have run fifteen kilometers, climbed Hengshan Mountain on foot, and once lost 20 pounds in three months. I am a ruthless person who likes to lie flat.

I have many years of front-line R&D and team management experience, and have studied the underlying source code of mainstream frameworks (Spring, Spring Boot, Spring MVC, Spring Cloud, MyBatis, Dubbo, ZooKeeper), the underlying architecture of message middleware (RabbitMQ, RocketMQ, Kafka), Redis caching, the MySQL relational database, Elasticsearch full-text search, the MongoDB non-relational database, Apache ShardingSphere database and table sharding with read-write separation, design patterns, domain-driven design (DDD), Kubernetes container orchestration, and more. I have taken high-concurrency projects from 0 to 1, using elastic scaling, load balancing, alerting tasks, and self-starting scripts; the largest stress test I have run used 200 machines. I also have rich experience in project tuning.

After writing thousands of articles on CSDN over the years, I have developed solid writing skills. I have also signed a contract with Tsinghua University Press for four books, to be published next year: the basic, advanced, and architecture volumes of “Java Project Practice – In-depth understanding of common technologies of large Internet companies”, and “Decrypting the Programmer’s Thinking Code – Practice of Communication, Speech and Thinking”. The specific publishing plan will be adjusted according to the actual situation. I hope all readers can support me!

I hope that all readers will support bloggers who write articles with care. Times have changed now, information is exploding, and the alley is dark. Bloggers really need everyone’s help to continue to shine in this ocean, so hurry up. Move your little hands, click to follow, click to like, click to favorite, and even click to comment, these are the best support and encouragement for bloggers!

Blog homepage: I am Liao Zhiwei
Open source project: java_wxid
Bilibili: I am Liao Zhiwei
Personal Community: The boss behind the scenes
Personal WeChat ID: SeniorRD

Without further ado, let us move on to the main topic of this article.

The concept of primary key auto-increment

In a database, a primary key uniquely identifies each row of a table. Specifying the primary key value by hand on every insert is tedious and error-prone, and this is where primary key auto-increment comes in handy.

With primary key auto-increment, the database generates the key of each newly inserted record by advancing a counter, so every new row receives a fresh unique identifier. We no longer have to supply the primary key value ourselves, which greatly simplifies inserting data.

For example, suppose a student table contains fields such as ID, name, and age, with the ID as the primary key. Assigning a unique ID by hand for every new row would be very troublesome. With auto-increment, we only need to declare the id column as AUTO_INCREMENT when defining the table; every insert then receives a new, unique id automatically.

Data types for primary key auto-increment

Auto-increment primary keys normally use an integer data type such as INT or BIGINT. These types are popular for keys because they offer the following advantages:

- Integers take less storage than alternatives such as strings: 4 bytes for INT and 8 bytes for BIGINT.
- Integers are cheap to store and compare, which helps index lookups and joins.
- The value range is wide: a signed INT goes up to 2,147,483,647 and a signed BIGINT up to 9,223,372,036,854,775,807, which covers most data storage needs.

For example, in a product table we can define an auto-increment primary key column to uniquely identify each record:

CREATE TABLE `product` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) NOT NULL COMMENT 'product name',
  `price` decimal(10,2) NOT NULL COMMENT 'commodity price',
  `inventory` int(11) NOT NULL COMMENT 'Item inventory',
  `create_time` datetime NOT NULL COMMENT 'Creation time',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='Commodity Table';

In the SQL statement above, the product table gets an auto-increment primary key column id of type bigint. With the default settings, each newly inserted product receives an id that is 1 greater than the previously generated one, so every product has a unique id.
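As a small illustration of how the generated id is used in practice, the sketch below inserts rows into the product table defined above without naming the id column and then reads back the generated value. The sample product names and prices are made up for the example, and the output depends on what the table already contains.

```sql
-- Insert without naming `id`; MySQL fills it in from the table's counter.
INSERT INTO product (name, price, inventory, create_time)
VALUES ('keyboard', 199.00, 50, NOW());

-- LAST_INSERT_ID() returns the value generated by the most recent
-- successful insert on this connection.
SELECT LAST_INSERT_ID();

-- For a multi-row insert, LAST_INSERT_ID() returns the id generated
-- for the FIRST row of that statement, not the last one.
INSERT INTO product (name, price, inventory, create_time)
VALUES ('mouse', 89.00, 120, NOW()),
       ('monitor', 1299.00, 15, NOW());
SELECT LAST_INSERT_ID();

SELECT id, name FROM product ORDER BY id;
```

Because LAST_INSERT_ID() is tracked per connection, concurrent inserts from other clients do not change the value this session sees.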

Note that in MySQL the AUTO_INCREMENT attribute is not available for arbitrary data types. It applies to integer columns (floating-point columns are also accepted, but this is deprecated as of MySQL 8.0.17) and not to date, time, or string types. A table can have only one AUTO_INCREMENT column, and that column must be indexed.

The step size of primary key auto-increment

A table should normally have a primary key that uniquely identifies each row. Take a student information table with fields such as student ID, name, and age: if the student ID is the primary key, each student's record can be found by that ID. When the student ID is also declared as auto-increment, the database assigns an increasing value on every insert: 1 for the first student, 2 for the second, and so on.

Sometimes, however, we do not want each new key to be exactly 1 greater than the last one. In that case we can change the step size of the auto-increment. For example, take an order table with fields such as order number, order time, and order amount, where the order number is the auto-increment primary key. If we want consecutive order numbers to be spaced apart rather than adjacent, we can set the step to 100, so that each new order number is 100 greater than the previous one.

So how do we set the step size? In MySQL the step is not part of the column definition: writing “AUTO_INCREMENT=2” after the column is a syntax error, and the AUTO_INCREMENT=N table option sets the next value to generate, not the step. The step is controlled by the system variable auto_increment_increment (together with auto_increment_offset, which sets the starting point), and it applies to every auto-increment table that the session or server writes to. For example, to create a student information table with an auto-increment student number and use a step of 2 in the current session, we can write:

CREATE TABLE student (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(20),
    age INT
);

-- Step of 2 for inserts made by this session only
SET @@SESSION.auto_increment_increment = 2;

From then on, each student record inserted by this session receives a student number that is 2 greater than the previous one.

To summarize, the step size controls how much the auto-increment value grows on each insert. A step larger than 1 leaves intervals between key values, and when several servers write to the same logical table, giving each server the same step and a different offset keeps their generated keys from colliding. In MySQL the step is set through auto_increment_increment rather than in the CREATE TABLE statement.
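The effect of the step can be checked with a short experiment. This is a minimal sketch, assuming a scratch table named student_step_demo (a hypothetical name), the default auto_increment_offset of 1, and a user allowed to change the session variable (some MySQL versions require extra privileges for that).

```sql
-- Current step and offset for this session.
SHOW VARIABLES LIKE 'auto_increment%';

CREATE TABLE student_step_demo (
    id   INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(20)
) ENGINE=InnoDB;

-- The step is a session/server setting, not a table property.
SET @@SESSION.auto_increment_increment = 2;

INSERT INTO student_step_demo (name) VALUES ('Ann'), ('Bob'), ('Cid');

-- With offset 1 and step 2 on an empty table, the ids are 1, 3, 5.
SELECT id, name FROM student_step_demo ORDER BY id;

-- Restore the default step for the rest of the session.
SET @@SESSION.auto_increment_increment = 1;
```

Since the variable is not stored with the table, another session using the default step would continue from the current counter in steps of 1.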

Performance optimization of primary key auto-increment

Why is the performance optimization of primary key auto-increment needed?

First, we need to understand what could be called primary key fragmentation. An auto-increment value only ever moves forward: when rows are deleted, their key values are not recycled. The key values therefore stop being contiguous and develop gaps. The gaps in the numbering are harmless in themselves, because an index lookup does not care whether neighboring values exist. What can hurt is the space that deleted rows leave behind in the data pages: after heavy deletion the pages are sparsely filled, so scans may read more pages than the remaining rows require.

For example, suppose a table has 10,000 rows and 5,000 of them are deleted. The remaining keys might look like 1, 2, 3, 6, 7, 8, ... instead of 1, 2, 3, 4, 5, 6, ..., and until the freed space is reused or the table is rebuilt, scanning the table can cost more I/O than the remaining 5,000 rows would otherwise need.
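How such gaps arise can be reproduced in a few statements. The sketch below assumes a scratch InnoDB table named gap_demo (a hypothetical name) and the default step and offset of 1; it shows that neither a delete nor a rollback hands a value back to the counter.

```sql
CREATE TABLE gap_demo (
    id   INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(20)
) ENGINE=InnoDB;

INSERT INTO gap_demo (name) VALUES ('a'), ('b'), ('c');  -- ids 1, 2, 3

DELETE FROM gap_demo WHERE id = 3;
INSERT INTO gap_demo (name) VALUES ('d');                -- id 4, not 3

START TRANSACTION;
INSERT INTO gap_demo (name) VALUES ('e');                -- takes id 5
ROLLBACK;                                                -- id 5 is not given back

INSERT INTO gap_demo (name) VALUES ('f');                -- id 6

SELECT id, name FROM gap_demo ORDER BY id;               -- ids 1, 2, 4, 6
```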

In addition, very frequent inserts can affect performance: every insert has to obtain the next value from the table's auto-increment counter, and access to that counter is serialized. Under high concurrency this can show up as lock waits, higher CPU load, and slower response times.

So how can these problems be addressed? This is where the performance optimization of primary key auto-increment comes in.

Performance optimization solutions for primary key auto-increment

1. Adjust the primary key auto-increment step size

Adjusting the auto-increment settings is a commonly mentioned optimization. The step size is the amount added for each new row. The default is 1, which means each inserted row gets a value 1 greater than the last, so the generated values are as dense as they can be. A larger step does not make the values more contiguous. What it does is reserve interleaved sequences, so that several writers using the same step and different offsets never generate the same key.

A related table-level setting is the starting value: the AUTO_INCREMENT table option determines the next value to be generated, which is useful for reserving a range of low ids or for continuing after a data import. For example, we can create the table like this:

CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) NOT NULL COMMENT 'name',
  `age` int(11) NOT NULL COMMENT 'age',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 AUTO_INCREMENT=1;

For this table, we can set the next value to be generated using the following command:

ALTER TABLE `test` AUTO_INCREMENT = 1000;

In this case, newly generated primary key values continue from 1000 instead of from 1. The step itself is unchanged.

Note that neither the starting value nor the step should be set too large: large jumps waste primary key values, and the column reaches the upper limit of its data type sooner.
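To see what the ALTER TABLE statement above actually changes, the following sketch runs it against a scratch table named start_demo (a hypothetical name), assuming the default step and offset of 1.

```sql
CREATE TABLE start_demo (
    id   INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(20) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

INSERT INTO start_demo (name) VALUES ('first');    -- id 1

-- Moves the counter forward; it does not change the step.
ALTER TABLE start_demo AUTO_INCREMENT = 1000;

INSERT INTO start_demo (name) VALUES ('second');   -- id 1000
INSERT INTO start_demo (name) VALUES ('third');    -- id 1001

-- The table options in the output now include AUTO_INCREMENT=1002.
SHOW CREATE TABLE start_demo;
```

The counter can only be moved forward this way: for InnoDB, a value at or below the current maximum key does not make the table reuse existing ids.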

2. Use multiple auto-increment columns

If adjusting the step size does not solve the problem, it is tempting to spread the insert load over several auto-increment columns. MySQL does not allow this: a table can have only one AUTO_INCREMENT column, and it must be part of a key. A definition with two such columns is rejected with error 1075 (“Incorrect table definition; there can be only one auto column and it must be defined as a key”).

The valid form of the table keeps a single auto-increment column:

CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) NOT NULL COMMENT 'name',
  `age` int(11) NOT NULL COMMENT 'age',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

In this table, id is the only auto-increment column, and as the primary key it alone ensures the uniqueness of each piece of data.

To actually distribute the insert load, the counter has to be split across tables or servers rather than across columns: each shard or server keeps its own auto-increment counter, and auto_increment_increment together with auto_increment_offset keeps the generated values from overlapping.
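One way to spread ID generation over several writers is to give each MySQL server its own interleaved sequence. The sketch below assumes two servers, called A and B here purely for illustration, and a user with the privilege to set global variables. Global settings only affect connections opened afterwards and are lost on restart unless they are also placed in the server configuration.

```sql
-- On server A (hypothetical): generates 1, 3, 5, ...
SET @@GLOBAL.auto_increment_increment = 2;
SET @@GLOBAL.auto_increment_offset    = 1;

-- On server B (hypothetical): generates 2, 4, 6, ...
SET @@GLOBAL.auto_increment_increment = 2;
SET @@GLOBAL.auto_increment_offset    = 2;

-- On either server, confirm what new connections will use.
SELECT @@GLOBAL.auto_increment_increment AS step,
       @@GLOBAL.auto_increment_offset    AS start_offset;
```

With N writers, the step is set to N on every server and the offsets run from 1 to N, so no two servers can produce the same value.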

How to choose the performance optimization solution for primary key auto-increment?

The performance optimization solution for primary key auto-increment can be selected based on the actual situation.

If your table sees a large number of deletions, keep in mind that gaps in the key values cannot be closed by changing the step size. If the space left behind by deleted rows is slowing down scans, rebuilding the table (for example with OPTIMIZE TABLE) is the usual remedy.

If insert operations in your table are very frequent, consider distributing them across several tables or servers, each with its own counter, to reduce waiting time and CPU load. Within a single table only one auto-increment column is possible.

Of course, for different tables, you need to choose according to the actual situation. If the data volume of your table is small, or the continuity of primary key values has no impact on query performance, then there is no need to consider these optimization solutions.

Application and mechanism of auto-increment in the database

Auto-increment is a column attribute of a database table and is most often used on the primary key. Every table that has an auto-increment column also has an auto-increment counter. The counter is initialized when the table is first used after the server starts, and each generated value is derived from the current counter value.

When a new record is inserted without an explicit key, MySQL reads the table's auto-increment counter, advances it by the step (1 by default), and uses the result as the primary key value of the new record. MySQL handles this process internally, without requiring the user to specify the primary key value.

Note that the counter is maintained per table and per server. If the same table is written to on different MySQL instances, for example in a multi-master setup, each instance advances its own counter, and the generated values can collide unless the instances use different offsets. In addition, MySQL returns a duplicate-key error if you insert a record with an explicit primary key value that already exists.
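The interaction between explicit key values and the counter can be seen in a short sketch. It assumes a scratch InnoDB table named dup_demo (a hypothetical name) and the default step and offset of 1.

```sql
CREATE TABLE dup_demo (
    id   INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(20)
) ENGINE=InnoDB;

INSERT INTO dup_demo (name) VALUES ('a');           -- id 1

-- An explicit id larger than the counter moves the counter past it.
INSERT INTO dup_demo (id, name) VALUES (10, 'b');
INSERT INTO dup_demo (name) VALUES ('c');           -- id 11

-- An explicit id that already exists is rejected:
-- ERROR 1062 (23000): Duplicate entry '10' for key ...
INSERT INTO dup_demo (id, name) VALUES (10, 'x');
```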

At the physical level, MySQL keeps the auto-increment counter in a small piece of memory that holds its current value. The range of values it can produce is bounded by the data type of the auto-increment column. Once the column reaches the maximum of its type, MySQL cannot generate a further value and inserts that need one fail. Adjusting the initial value or the step size does not enlarge this range; to get more room, use a wider or unsigned type such as BIGINT UNSIGNED.
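How close a table is to the limit of its key type can be estimated from the data dictionary. The sketch below uses the test table from the earlier example and assumes its id column is a signed INT; on MySQL 8.0 the AUTO_INCREMENT figure in information_schema may be a cached value, so treat it as an estimate.

```sql
-- Upper limits: INT 2147483647, INT UNSIGNED 4294967295,
-- BIGINT 9223372036854775807, BIGINT UNSIGNED 18446744073709551615.
SELECT TABLE_NAME,
       AUTO_INCREMENT,
       ROUND(AUTO_INCREMENT / 2147483647 * 100, 2) AS pct_of_signed_int
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'test';

-- Widening the column raises the limit. The table may be rebuilt,
-- which can take a long time on large tables.
ALTER TABLE test MODIFY id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT;
```

Any foreign key columns that reference id would need the same type change, which is one reason to choose a sufficiently wide type when the table is first designed.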

The traditional mechanism for generating auto-increment values is AUTO-INC Locking. It uses a special table-level lock that is released as soon as the SQL statement inserting the auto-increment values finishes, rather than at the end of the transaction.

Performance issues with AUTO-INC Locking

Because AUTO-INC Locking releases its lock at the end of the statement rather than the end of the transaction, it helps concurrent insertion to some extent, but it still has performance issues. First, concurrent inserts into a table with an auto-increment column perform poorly, because each insert must wait for the previous insert statement to complete (though not for its transaction to commit). Second, a large INSERT ... SELECT holds the lock for the whole statement, so inserts from other transactions into the same table are blocked in the meantime.
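The blocking described above can be observed with three client sessions. This is only a sketch: test_staging is a hypothetical source table with many rows, the wait occurs only when innodb_autoinc_lock_mode is 0 or 1, and the performance_schema.data_locks table is available from MySQL 8.0 onward.

```sql
-- Session A: a bulk insert whose row count is not known in advance.
INSERT INTO test (name, age)
SELECT name, age FROM test_staging;   -- `test_staging` is hypothetical

-- Session B, started while A is still running: this insert waits
-- until A's statement finishes and the AUTO-INC lock is released.
INSERT INTO test (name, age) VALUES ('Ann', 20);

-- Session C: list AUTO-INC table locks that are held or waited for.
SELECT OBJECT_NAME, LOCK_TYPE, LOCK_MODE, LOCK_STATUS
FROM performance_schema.data_locks
WHERE LOCK_MODE = 'AUTO_INC';
```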

Optimizations in the InnoDB storage engine

In the InnoDB storage engine, primary key auto-increment is implemented through an auto-increment column on the table. When a new row is inserted, InnoDB takes the current counter value for that column, advances it (by 1 with the default settings), and assigns the resulting unique value to the auto-increment column of the new row.

To make this fast, InnoDB does not compute the next value from the table data on every insert. It keeps an in-memory “auto-increment counter” per table that holds the current value and is advanced when needed, so generating a value does not require looking up the largest existing key each time. Before MySQL 8.0 this counter lived only in memory and was re-derived from the largest existing key after a restart; MySQL 8.0 persists it across restarts.

Note that InnoDB does not support more than one auto-increment column per table, so contention on the counter cannot be reduced by alternating between several such columns. Contention is reduced instead through the locking modes described next.

Starting with MySQL 5.1.22, the InnoDB storage engine provides a lightweight mutex-based mechanism for generating auto-increment values, which greatly improves the performance of inserts that need them. The same version introduced the parameter innodb_autoinc_lock_mode to control the auto-increment locking mode. Its default value is 1 from MySQL 5.1.22 through 5.7, and 2 as of MySQL 8.0. innodb_autoinc_lock_mode has three options:

0: The implementation used before MySQL 5.1.22, in which every insert goes through the AUTO-INC Locking table lock.
1: For simple inserts, where the number of rows is known in advance, a mutex protects the in-memory counter while the values are reserved. For bulk inserts such as INSERT ... SELECT, the AUTO-INC Locking table lock is still used. In this configuration, leaving rollbacks aside, the values generated by a single statement are consecutive, and statement-based replication still works well. However, while a bulk insert holds the AUTO-INC lock, simple inserts must also wait for the lock to be released.
2: Every insert statement obtains its values through the mutex and never through AUTO-INC Locking. This is the highest-performance mode, but with concurrent inserts the values generated for a single statement may not be consecutive, and statement-based replication becomes unsafe. In this mode, row-based replication should therefore always be used, which gives maximum concurrency while keeping master and slave data consistent.
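Which mode a server is running can be checked from any client. The sketch below reads the current settings; because innodb_autoinc_lock_mode cannot be changed at runtime, the configuration lines in the comments show where it would be set, assuming a standard my.cnf option file.

```sql
-- Current auto-increment lock mode: 0, 1, or 2.
SELECT @@innodb_autoinc_lock_mode;

-- Mode 2 should be paired with row-based replication.
SELECT @@binlog_format;

-- innodb_autoinc_lock_mode is read-only while the server is running.
-- It is set at startup, for example in the option file:
--   [mysqld]
--   innodb_autoinc_lock_mode = 2
--   binlog_format            = ROW
```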

Disadvantages of auto-increment

There are four main disadvantages to using auto-increment.

First, it depends heavily on the database. Different databases have different syntax and implementations, so database migration, support for multiple database versions, and splitting tables and databases all need special handling, which is troublesome. Moreover, when the database fails, ID generation stops and the entire system becomes unavailable, which is a fatal problem.

Second, there is a single point of failure. With a single database, read-write separation, or one master with multiple slaves, only the one master can generate IDs.

Third, there is a data consistency issue. Configuring master-slave replication improves availability, but consistency is hard to guarantee in special circumstances: inconsistencies during a master-slave switch may lead to the same ID being issued twice.

Fourth, it is difficult to scale. When performance does not meet the requirements it is hard to expand, because ID generation is bounded by the read and write performance of a single MySQL server.

If you need to reprint or move this article, you are very welcome to send me a private message~

Blogger’s life insights and goals

You cannot stop on the road of software development: if you stop, you are easily left behind. If you cannot bear the hardship of self-discipline, you will bear the pain of mediocrity. Only continuously growing ability brings lasting self-confidence. I am a very ordinary programmer. Apart from my innate good looks, I am only 180cm tall. Even a person like me has been quietly writing blog posts for many years.

As the old saying goes, before you become awesome, you have to persevere like a fool. I hope that through a large body of work, accumulated time, personal charm, luck, and timing, you can build your own technical influence.

My mood has its ups and downs: sometimes I am excited, sometimes pensive. I hope to become a well-rounded talent with superb skills in technology, business, and management. I want to be the chief designer of the product architecture roadmap, the leader of the team, the mainstay of the technical team, and a hands-on expert in corporate strategy and capital planning.

The realization of this goal requires unremitting efforts and continuous growth, but I must work hard to pursue it. Because I know that only by becoming such a talent can I continue to advance in my career and bring real value to the development of the company. In this ever-changing era, I must always be ready to face challenges, keep learning and exploring new areas in order to keep moving forward. I firmly believe that as long as I keep working hard, I will definitely achieve my goals.