Mysql series-index type

1. Index type type

Index type classified according to the content of leaf nodes

InnoDB uses the B + tree index model. It is divided into primary key index and non-primary key index according to whether the leaf node stores data (according to the content of the leaf node); non-primary key index includes: ordinary index, unique index, combined index. The leaf node of the primary key index stores is the entire row of data. In InnoDB, the primary key index is also called a clustered index. The content of leaf nodes in non-primary key indexes is the value of the primary key. In InnoDB, non-primary key indexes are also called secondary indexes. Queries based on non-primary key indexes require scanning one more index tree. Therefore, we should try to use primary key queries, also known as table return queries, in our applications. Querying using the primary key index will query the data at once.

According to the storage method of the index, the index can be divided into clustered index and non-clustered index. The characteristic of a clustered index is that the leaf nodes contain complete record rows, while the leaf nodes of a non-clustered index only have all fields and primary key IDs. Non-clustered index is also called auxiliary index and ordinary index. Its leaf nodes only contain one primary key value. To find records through non-clustered index, you must first find the primary key, and then use the primary key to find the corresponding record row in the clustered index. This process is called table return.

In InnoDB, each table must have a clustered index, which is built based on the primary key by default. If there is no primary key in the table, InnoDB will select a suitable column as the clustered index. If no suitable column is found, a hidden column DB_ROW_ID will be used as the clustered index.

1. Ordinary index

The most basic index type in MySQL, no restrictions, duplicates allowed, NULL allowed

CREATE INDEX index_code ON tb_student(code);

2. Unique index

The unique index is similar to the ordinary index. The value of the unique index column must be unique and null values are allowed.
In the case of a composite index, the combination of column values must be unique.
Creating a unique index usually uses the UNIQUE keyword, for example, the phone number must be unique, create a unique index

CREATE INDEX index_phone_number ON tb_student(phone_number);

3. Primary key index

The primary key index is actually a type of unique index, but it does not allow duplicate values or NULL values. PRIMARY KEY must be used, which is created when creating the table structure. It cannot be created by CREATE alone. The primary key is generally an auto-incrementing ID.

4. Full text index

Full-text indexes are mainly used to find keywords in text and can only be created on CHAR, VARCHAR or TEXT type columns. In MySQL, only the MyISAM storage engine supports full-text indexing.
Full-text indexes allow inserting duplicate values and NULLs into indexed columns.
However, for large-capacity data tables, generating full-text indexes consumes time and hard disk space.
Create a full-text index using the FULLTEXT keyword.

CREATE FULLTEXT INDEX index_name ON tb_student(name);

Classify storage methods according to index and data storage methods

Clustered index is also called clustered index. It is not actually a separate index type, but a data storage method. The leaf nodes of the clustered index save all column information of a row of records. In other words, the leaf node of the clustered index contains a complete record row.

All structures belong to B + tree, so what is their difference? The biggest difference is the storage method

Clustered index

Clustered index: Store data and indexes together in order. The physical storage order of data and the order of indexes are consistent. If you find the index, you can find the data.

Non-clustered index

Non-clustered index: The leaf node does not store data, but the address corresponding to the data is stored. That is to say, the address must be queried first, and then the corresponding data must be found on the disk, because the data and index are separated. For example, If we want to find a certain knowledge point in a book, we need to first search the corresponding chapter page number in the table of contents, and then find the corresponding page through the page number. This table of contents is equivalent to the index, and that page is the corresponding data

Advantages
Clustered index queries can directly obtain data, which is more efficient than non-clustered indexes that require secondary queries.
The range query of the clustered index is more efficient because it is arranged in an orderly manner.
Disadvantages
The maintenance cost is high, especially inserting data, which will lead to page split. When data is inserted into a full page, the storage engine will page the page, which will occupy more disk space.
Updating data is expensive and requires moving updated rows to new locations.

Two index rules

Leftmost matching principle

For example, three joint indexes are created for the log table, namely
operate_type operation type,
operate_user operator,
operate_statusoperation status

CREATE INDEX index_type_user_status ON eom_log(operate_type,operate_user,operate_status);

Let’s take an example to explain to see if the index is hit.

explain select * from eom_log where operate_type < 10

First, let’s check a field, operate_type. The results are as follows, hitting the index.

Adding another field, the execution result is also a hit index

explain select * from eom_log where operate_type < 10 and operate_user < 10

So what happens if we check the rightmost field? Sorry, I didn’t go through the index.

Agreed, only the middle fields are queried without indexing.

explain select * from eom_log where operate_user < 10

Therefore, in a joint index composed of multiple fields, the where condition must start matching from the first leftmost field. If not, the index cannot be hit.

Then the question comes again, can the index be hit if only the first and third fields are used as conditions? let’s try

explain select * from eom_log where operate_type < 10 and operate_status < 10

The result shows a hit index, so as long as the leftmost field is included in the where condition, it must be a hit

Then the question comes again. If the order of these three fields is reversed, can the index be hit?

explain select * from eom_log where operate_user < 10 and operate_status < 10 and operate_type < 10

The result shows that the index is hit. Some friends here may ask, isn’t it the leftmost match? In fact, this is the credit of the mysql optimizer. The optimizer will determine the order in which to correct this sql statement with the highest efficiency. , he will arrange the where query order so that he can hit the index, and finally generate the real execution plan. So the mysql query optimizer will eventually execute the query in this order.

Covering Index

Covering index means that SQL only needs to use the index to return the data required for the query, without having to find the primary key through the secondary index and then query the data. To put it bluntly, your query reaches the index, and the data to be returned happens to be the constituent fields of the index, so you can get the desired results without returning to the table;
Simple and crude understanding: If an index contains (or covers) data that satisfies the fields and conditions in the query statement, it is called a covering index. For example, if the primary key query is based on the primary key index, or the following situation

Look at the example, there is a student table student, which has a joint index student number and phone number index_code_phone

SELECT code,phone from tb_user where code = 'a' and phone = 'b'

Then the code and phone to be queried are exactly indexed, so the data can be returned directly when querying, without the need for a second query. This is a covering index.
Advantages of covering index

Reduces disk I/O and improves query speed
Reduces memory usage and saves resources
It reduces lock competition and improves concurrency. It can avoid querying the original data back into the table, thereby reducing access and locking of data rows. If you use index coverage, you only need to lock the index records, not the data records. This can reduce the scope and conflicts of locks and improve concurrency.

Disadvantages of covering indexes

Increased index size, taking up more disk space
Increased data update overhead, because every time the data is updated, the index needs to be updated
Therefore, when creating a covering index, we need to weigh the pros and cons, choose the appropriate fields and order, and avoid creating too many or too large indexes.

Index pushdown

Index Condition Pushdown (ICP for short) is a new feature of MySQL version 5.6. It can reduce the number of table return queries and improve query efficiency. Index pushdown refers to pushing filtering conditions down to the index layer when using a joint index, thereby reducing the number of table returns. Index push-down actually refers to handing over some of the things that the upper layer (service layer) is responsible for to the lower layer (engine layer). The purpose of index pushdown is to reduce the number of table returns, that is, to reduce IO operations. For a clustered index, the data and index are together, and there is no such thing as table backing.

Principle of index pushdown

Let’s take a closer look at the MySQL query without using ICP:
The storage engine reads index records;
Locate and read complete row records based on the primary key value in the index;
The storage engine hands the record to the server layer to detect whether the record satisfies the WHERE condition.

When using ICP, query process:
The storage engine reads index records (not complete row records);
Determine whether the condition part can be checked using the columns in the index. If the condition is not met, the next row of index records will be processed;
If the conditions are met, use the primary key in the index to locate and read the complete row record (the so-called table return);
The storage engine hands the record to the layer, which checks whether the record meets the rest of the condition.

For example, suppose we have an order table that contains the order id, user id, product id, price and other fields. We want to query the product id and price of all orders with a user id of 1 and a price greater than 1000. Then we can create A joint index that contains user ID, price and product ID, so that when scanning the index, you can first filter out records that do not satisfy the user ID of 1 or the price is greater than 1000, without the need to query the order table.

Different from the joint index, the joint index can only take effect when the query condition contains the prefix part of the index, while index pushdown can take effect when the query condition contains any part of the index. For example, if a joint index is (a, b, c), then the index can only be used when the query condition contains a or a and b, but if the query condition only contains b or c, the index cannot be used. However, if index pushdown is used, even if the query conditions only contain b or c, these conditions can be pushed down to the index level for filtering, thereby reducing the number and scope of table returns.

Specific practice of index pushdown

The theory is relatively abstract, let’s go to a practice. Use a user table tuser and create a joint index (name, age) in the table

If there is a requirement now: retrieve all users in the table whose first name is Zhang and whose age is 10 years old. Then, the SQL statement is written like this:

select * from tuser where name like '张%' and age=10;

If you understand the leftmost matching principle of the index, then you know that this statement can only be used when searching the index tree. The first record found that meets the condition has an id of 1

So what are the next steps?
Not using ICP
Before MySQL 5.6, the storage engine found the primary key id (1, 4) of the name likelike ‘Zhang%’ through the joint index, scanned the tables one by one, went to the clustered index to find the complete row record, and the server layer then processed the data according to age. =10 to filter.

Let’s take a look at the schematic:

You can see that we need to return the table twice, which wastes another field age in our joint index.
Use ICP
After MySQL 5.6, the storage engine finds based on the (name, age) joint index. Since the joint index contains columns, the storage engine directly filters according to age=10 in the joint index. Scan back the table one by one according to the filtered data.

Let’s take a look at the schematic:

You can see that the form was only returned once.

In addition, we can also look at the execution plan and see Using index condition in the Extra column, which means index pushdown is used.

Conditions for using index pushdown

Can only be used for range, ref, eq_ref, ref_or_null access methods;

Can only be used for InnoDB and MyISAM storage engines and their partition tables;

For storage engines, index pushdown only applies to secondary indexes (also called auxiliary indexes);

The purpose of index pushdown is to reduce the number of table returns, that is, to reduce IO operations. For a clustered index, the data and index are together, and there is no such thing as table backing.

Conditions that refer to subqueries cannot be pushed down;

Conditions that reference stored functions cannot be pushed down because the storage engine cannot call stored functions.

Advantages and disadvantages of index pushdown

Advantages of index pushdown

Reduces the number and scope of table returns and improves query speed
Reduced data transfer volume and network overhead
Disadvantages of index pushdown
Increased complexity and overhead of index scans
It may lead to misjudgment or missed determination because the index layer cannot handle some complex or vague filtering conditions.

Therefore, when using index pushdown, we need to pay attention to some restrictions and conditions, such as:

Index pushdown only applies to joint indexes
Index pushdown only works with some data types and operators
Index pushdown may be affected by optimizer

Related system parameters

Index condition pushdown is enabled by default. You can use the system parameter optimizer_switch to control whether it is enabled.
View the default status:

mysql> select @@optimizer_switch\G;
*************************** 1\. row
: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan =on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on
1 row in set (0.00 sec)

Switch status:

set ="index_condition_pushdown=off";
set="index_condition_pushdown=on";