MySQL's three types of index merge optimization: can OR still use an index?

Foreword

Previously, we discussed several ways MySQL optimizes back-to-table lookups (fetching the full row from the clustered index after a match in a secondary index): index condition pushdown (ICP), multi-range read (MRR), covering indexes, and so on.

In this article, let's talk about another method MySQL provides to optimize back-to-table lookups: index merge.

Before reading this article, you need to understand how the MySQL server layer and the storage engine layer interact, the difference between secondary indexes and clustered indexes, back-to-table lookups, and so on.

If you are not familiar with these topics, you can go back to the previous articles:

MySQL's optimization tool: index condition pushdown improves performance by 273% on tens of millions of rows

MySQL's optimization tool: how do Multi-Range Read and covering indexes optimize back-to-table lookups?

A commonly repeated rule about index failure in MySQL is: using OR causes the index to fail.

Does that hold in every scenario? With this question in mind, let's read on.

The examples use the seat table from the previous article, with two secondary indexes: one on seat_code and one on student_id.

CREATE TABLE `seat` (
  `seat_id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'seat ID',
  `seat_code` char(10) DEFAULT NULL COMMENT 'seat code',
  `student_id` bigint(20) DEFAULT NULL COMMENT 'Student ID associated with the seat',
  PRIMARY KEY (`seat_id`),
  KEY `idx_student_id` (`student_id`),
  KEY `idx_seat_code` (`seat_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

index merge

Under normal circumstances, the optimizer picks only one index per table, the one it considers cheapest, when generating an execution plan.

But in some cases, several indexes can be used together through index merge.

There are three ways to optimize index merging:

index merge intersection (intersection index merge)
index merge union (union index merge)
index merge sort union (sort-union index merge)
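Each of these strategies has a flag in the optimizer_switch system variable. As a quick sketch, the current state of the flags can be inspected like this (the full flag list differs between MySQL versions):

```sql
-- Show the optimizer switches for the current session.
-- Look for index_merge, index_merge_intersection,
-- index_merge_union and index_merge_sort_union in the result.
SELECT @@optimizer_switch;
```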

What are the differences between the three methods? Please read below in order:

index merge intersection

index merge intersection is the index merge that takes an intersection. Intersection usually corresponds to AND in the query conditions.

What is an intersection?

For example, given the two sets (1, 2, 3) and (2, 3, 4), the intersection is (2, 3): the values present in both.

Take this SQL as an example:

select * from seat
where seat_code = 'caicaiseat' and student_id = 1

Without index merge, the optimizer may choose either the seat_code index or the student_id index.

If it uses the seat_code index, it first finds the entries that satisfy seat_code = 'caicaiseat' in that index, and then goes back to the clustered index to fetch the complete records.

Turn off the intersection index merge optimization:

SET optimizer_switch='index_merge_intersection=off';

View execution plan
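A minimal sketch of how that plan can be viewed, assuming the seat table defined above and that index_merge_intersection has just been switched off. The plan you actually get depends on your data and MySQL version.

```sql
-- With intersection merge off, the optimizer is expected to pick a single
-- index (idx_seat_code or idx_student_id) and filter the other condition
-- after fetching the rows.
EXPLAIN
SELECT * FROM seat
WHERE seat_code = 'caicaiseat' AND student_id = 1;
```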

In this scenario, there may be many records that satisfy seat_code = 'caicaiseat' but do not satisfy student_id = 1.

These records still have to be looked up in the table and are only filtered out afterwards, so the lookups spent on them are wasted.

A back-to-table lookup is not just one extra query; it may also involve random I/O. When the number of matching records is large, the cost of these lookups is very high.

With intersection index merge:

First use the seat_code index to find the entries that satisfy seat_code = 'caicaiseat'
Then use the student_id index to find the entries that satisfy student_id = 1
Then intersect the two result sets on the primary key seat_id, and go back to the table only for the remaining records, which reduces the number of back-to-table lookups.

(In the figure there is no back-to-table lookup, because the query happens to be satisfied by a covering index.)

Note that intersection index merge requires each index scan to return primary key values in order. If the primary key values were out of order, the intersection could not be computed by a simple ordered merge and the back-to-table lookups would produce random I/O, so the loss would outweigh the gain.

In a secondary index, primary key values are sorted only among entries whose index column values are equal, so intersection index merge is used mostly with equality comparisons (=).
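One way to picture this ordering, as an illustration only: an InnoDB secondary index entry carries the primary key, and entries are ordered by the index column first and by the primary key second. The query below lists (seat_code, seat_id) pairs in that same order; the rows returned depend entirely on your data.

```sql
-- Mirrors the entry order of idx_seat_code: (seat_code, seat_id).
-- Within one seat_code value the seat_id values ascend;
-- across a range of seat_code values they do not.
SELECT seat_code, seat_id
FROM seat
ORDER BY seat_code, seat_id;
```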

Enable intersection index merging and view the execution plan

The type column is index_merge and both indexes are used. The Extra column shows that intersection index merge is used, and that a covering index is used as well, so no back-to-table lookup is needed.

Since the seat table only has the primary key seat_id, the seat code seat_code, and the student ID student_id, all the queried columns are available in the secondary indexes, so there is no need to go back to the table.
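The plan described above can be reproduced roughly as follows. This is a sketch: the optimizer is cost-based, so on a small or differently distributed table it may still choose a single index.

```sql
-- Re-enable intersection index merge for the current session.
SET optimizer_switch='index_merge_intersection=on';

-- When intersection merge is chosen, type should be index_merge and
-- Extra should mention "Using intersect(...)" naming both indexes.
EXPLAIN
SELECT * FROM seat
WHERE seat_code = 'caicaiseat' AND student_id = 1;
```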

Some readers may ask: why not combine seat_code and student_id into a composite index (seat_code, student_id)?

A composite index (seat_code, student_id) would also work for this query, but a query that filters only on student_id could not use it efficiently, because student_id is not the leftmost column of that index.
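The trade-off can be sketched as follows. The index name idx_seat_code_student_id is an assumption made for this example and does not appear in the original table definition.

```sql
-- Hypothetical composite index; the name is chosen for illustration.
ALTER TABLE seat ADD INDEX idx_seat_code_student_id (seat_code, student_id);

-- The leftmost column seat_code is constrained, so the composite
-- index can be used to locate matching entries directly.
EXPLAIN SELECT * FROM seat
WHERE seat_code = 'caicaiseat' AND student_id = 1;

-- Only student_id is constrained: an ordinary seek into the composite
-- index is not possible (leftmost-prefix rule), so an efficient lookup
-- still relies on the separate idx_student_id index.
EXPLAIN SELECT * FROM seat
WHERE student_id = 1;
```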

index merge union

index merge union is the index merge that takes a union. Union usually corresponds to OR in the query conditions.

What is a union?

For example, given the two sets (1, 2, 3) and (2, 3, 4), the union is (1, 2, 3, 4): every value present in either set.

Take this SQL as an example

select * from seat
where seat_code = 'caicaiseat' or student_id = 1;

Without index merge union, the whole table (the clustered index) is scanned and each record is checked against the conditions in turn.

index_merge_union=off turns off union index merge.

index_merge_sort_union=off turns off sort-union index merge (the next strategy to be explained, which adds sorting on top of union index merge).
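A sketch of how both switches can be turned off in one statement and the resulting plan checked, assuming the seat table above. The exact plan depends on your data.

```sql
-- Disable both union strategies for the current session.
SET optimizer_switch='index_merge_union=off,index_merge_sort_union=off';

-- With both off, the expected plan is a full table scan (type = ALL),
-- even though both indexes appear in possible_keys.
EXPLAIN
SELECT * FROM seat
WHERE seat_code = 'caicaiseat' OR student_id = 1;
```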

With index merge union:

First use the seat_code index to find the entries that satisfy seat_code = 'caicaiseat'
Then use the student_id index to find the entries that satisfy student_id = 1
Then take the union of their primary key values seat_id, and go back to the table only for those records, which avoids scanning the whole table.

Turn on union optimization and check the execution plan: index merge union has been used
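As a sketch, the union plan can be checked like this. Whether the optimizer actually picks index merge still depends on its cost estimates for your data.

```sql
-- Re-enable union index merge for the current session.
SET optimizer_switch='index_merge_union=on';

-- When union merge is chosen, type should be index_merge and
-- Extra should mention "Using union(...)" naming both indexes.
EXPLAIN
SELECT * FROM seat
WHERE seat_code = 'caicaiseat' OR student_id = 1;
```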

So from now on, don't just recite stock interview answers and claim that OR can never use an index~

The prerequisite for using index merge union is similar to index merge intersection and also requires the primary key values to be in order.

index merge sort union

In some scenarios, OR leads the optimizer to conclude that back-to-table lookups cost more than a full table scan, so the indexes are not used.

The prerequisite for using index merge union (primary keys are in order) is too strict, and the index cannot be used in many scenarios.

index merge sort union (sort-union index merge) handles the case where the primary key values are out of order: it sorts them first and then takes the union.

For example, in this SQL

select * from seat where seat_code like 'a%' or student_id = 1

The condition on seat_code is no longer an equality comparison, so the primary key values of the entries that satisfy seat_code like 'a%' are not necessarily in order.

The primary key values of the entries that satisfy student_id = 1 in the student_id index are in order.

To take the union of the entries matched in the seat_code index and the entries matched in the student_id index, the primary keys of the entries matched in the seat_code index are sorted first, and the union is taken after sorting.

After turning on sort union, check the execution plan: use index merge sort union
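A sketch of checking the sort-union plan, using OR between the two conditions because sort union is a union strategy. The plan depends on how many rows match seat_code like 'a%' in your data; if too many match, the optimizer may still prefer a full scan.

```sql
-- Re-enable sort-union index merge for the current session.
SET optimizer_switch='index_merge_sort_union=on';

-- When sort-union merge is chosen, type should be index_merge and
-- Extra should mention "Using sort_union(...)" naming both indexes.
EXPLAIN
SELECT * FROM seat
WHERE seat_code LIKE 'a%' OR student_id = 1;
```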

The situation is similar to index merge union, except that the primary keys are allowed to be out of order: they are sorted first, and then the union is taken.

Summary

Index merge optimization is enabled by default and comes in three forms: intersection, union, and sort union.
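If any of the switches were turned off while experimenting, they can be turned back on for the session. This is a sketch that sets the four index merge flags explicitly and leaves other optimizer switches untouched:

```sql
-- Turn all index merge strategies back on for the current session.
SET optimizer_switch='index_merge=on,index_merge_intersection=on,index_merge_union=on,index_merge_sort_union=on';

-- Confirm the resulting flag values.
SELECT @@optimizer_switch;
```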

Index merge intersection applies when conditions are joined by AND, several indexes can be used, and each index scan returns primary keys in order. Matching entries are found in each index, their primary keys are intersected, and only the remaining records are looked up in the table, which avoids unnecessary back-to-table lookups.

Index merge union applies when conditions are joined by OR, several indexes can be used, and each index scan returns primary keys in order. Matching entries are found in each index, the union of their primary keys is taken, and those records are then looked up in the table, which avoids a full table scan.

Both intersection and union require primary keys to be in order, because out-of-order primary keys would have to be sorted before the intersection or union; otherwise the lookups would produce random I/O.

Because OR can easily lead the optimizer to judge back-to-table lookups too expensive and fall back to a full table scan, and because the ordered-primary-key requirement is strict, index merge sort union is used when the primary keys are out of order: it sorts them first and then takes the union.

Finally (if this article helped, a like, a favorite, and a share would be appreciated~)

This article is part of the column "From point to line, from line to surface, build the MySQL knowledge system". Interested readers can keep following it.

The notes and examples for this article are included in gitee-StudyJava and github-StudyJava. Interested readers can follow along and leave a star~

If you have any questions, feel free to discuss them in the comment area. If you think Cai Cai's writing is good, you can like, follow, and bookmark it to show your support~

Follow Cai Cai for more practical content. Public account: Cai Cai's back-end private kitchen

This article is published by OpenWrite, a blog posting platform!