Practical Application of Explain Command in Optimizing Query

In MySQL, the EXPLAIN command is a very important query optimization tool, which can help us analyze the execution plan of SQL query statements and how to optimize them. When using the EXPLAIN command, we can get a series of important parameters, which represent the details of each stage of query execution, and understanding the meaning of these parameters is crucial for SQL query optimization. In this article, I will explain in detail the meaning of each parameter in the EXPLAIN command.

First, let’s look at a simple example:

EXPLAIN SELECT * FROM `users` WHERE `id` = 1;

This SQL query queries the users table for rows where id equals 1. Here is the EXPLAIN result of this SQL statement:

 + ---- + ------------- + ------- + ------------ + ------ - + --------------- + --------- + --------- + ------- + ---- -- + ---------- + ------------- +
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
 + ---- + ------------- + ------- + ------------ + ------- + - -------------- + --------- + --------- + ------- + ------ + ---------- + ------------- +
| 1 | SIMPLE | users | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | Using index |
 + ---- + ------------- + ------- + ------------ + ------- + - -------------- + --------- + --------- + ------- + ------ + ---------- + ------------- +

Next, we will analyze the meaning of each field one by one.

id

id is a unique identifier used to distinguish each SELECT statement. In a complex query, which may contain multiple SELECT statements, each SELECT statement will have a different id. In the output of EXPLAIN, queries are considered related if the id is the same.

select_type

select_type indicates the query type. The following are the possible values of select_type and their meanings:

  • SIMPLE: a simple SELECT query without subqueries or UNION queries;
  • PRIMARY: the outermost SELECT query;
  • SUBQUERY: the first SELECT query in the subquery, which returns a value in the final result, which is used as the condition of the main query;
  • DEPENDENT SUBQUERY: The subquery that depends on the external query, the SELECT query in the subquery will be executed repeatedly;
  • DERIVED: derived table, the query contains subqueries as part of the FROM clause;
  • UNION: the second and subsequent SELECT queries in UNION;
  • DEPENDENT UNION: A UNION query that depends on an external query;
  • UNION RESULT: the result set of UNION;
  • DEPENDENT UNION RESULT: depends on the UNION result set of the outer query.

table

table indicates the table involved in the query

partitions

partitions indicates the partitions involved in the query.

type

type indicates the access type used by MySQL to execute the query. The following are the possible values of type and their meanings:

  • system: a table containing only one row, a system table (such as the mysql.user table in MySQL);
  • const: query only one row, equivalent query based on primary key or unique index (eg id = 1);
  • eq_ref: Use a unique index or primary key to query a row of data from a single table;
  • ref: Use a non-unique index to query multiple rows of data from a single table;
  • fulltext: full text search;
  • ref_or_null: similar to ref, but also includes NULL values;
  • index_merge: Use multiple indexes to merge results, such as using OR to connect multiple indexes;
  • unique_subquery: use IN or EXISTS for subquery;
  • index_subquery: Use IN or EXISTS for subquery, but the subquery uses an index;
  • range: query the rows in the range, use an index to search;
  • index: full table scan, but only traverse the index tree;
  • ALL: full table scan.

When optimizing queries, we usually want to avoid access types such as ALL, index or fulltext, and instead want the query to use more efficient Index access method, such as eq_ref, ref or range.

possible_keys

possible_keys indicates the list of indexes that MySQL may use.

key

key represents the index actually used by MySQL.

key_len

key_len indicates the number of bytes used by the index.

ref

ref indicates the index column or constant used by the query.

rows

rows indicates the number of rows that MySQL estimates to scan.

filtered

filtered indicates the proportion of rows in the result set to all matching rows. If filtered is small, it means that the result set of the query is small.

Extra

The Extra field contains extra information for executing the query, usually including the following information:

  • Using where: Indicates that MySQL will use the WHERE clause at the storage engine level to filter the result set;
  • Using index: indicates that MySQL uses a covering index to query data without accessing tables;
  • Using temporary: indicates that MySQL needs to use a temporary table during the query process;
  • Using filesort: Indicates that MySQL needs to sort the result set;
  • Using join buffer: Indicates that MySQL needs to use the connection buffer;
  • Impossible where: Indicates that the WHERE clause always returns false;
  • Select tables optimized away: indicates that MySQL can delete unreferenced tables during the query process;
  • No tables used: Indicates that the query does not need to access any tables.

The above is the detailed introduction of the EXPLAIN command

How to optimize the query?

Through the output of the EXPLAIN command, we can determine where the bottleneck of the query is and then optimize it. Usually, we can start from the following aspects:

  1. Choose the right index

In the EXPLAIN output, you can see the possible_keys and key fields, which represent the possible and actual indexes respectively. If the key field is NULL, it means that the query does not use any index, which is the focus of optimization. In order to improve query efficiency, we should use indexes as much as possible instead of full table scans.

When selecting an index, we need to select an appropriate index according to the type and frequency of query conditions. In general, you can choose an index that exactly matches the conditions used in the WHERE clause. If there are multiple conditions in the query, you can select an index for the intersection (AND) or union (OR) of multiple conditions. Alternatively, a joint index can be used to cover multiple query conditions.

When choosing an index, we also need to pay attention to some performance issues. For example, we should choose indexes based on columns with high data density, avoid using string type indexes, avoid using too many joint indexes, etc.

  1. Reduce data access

In the EXPLAIN output, you can see the type field, which indicates the access type used by MySQL to execute the query. If the type field is ALL or index, it means that the query needs to perform a full table scan, which is the focus of optimization. In order to improve query efficiency, we need to avoid full table scans as much as possible.

One way to reduce data access is to use covering indexes. A covering index means that the query only needs to read data from the index without going back to the data table to find other data. Using a covering index can avoid MySQL from performing a full table scan, thereby greatly improving query efficiency.

In order to use a covering index, we need to choose an appropriate index and include all columns required by the query in the index. If the columns in the index cannot meet the needs of the query, then MySQL needs to go back to the data table to find other data, resulting in performance degradation.

  1. Reduce sorting and grouping

In the EXPLAIN output, you can see the Extra field, which indicates the extra operations that MySQL needs to do. If Using filesort or Using temporary appears in the Extra field, it means that the query needs to be sorted or grouped, which is the focus of optimization. In order to improve query efficiency, we need to reduce sorting and grouping operations as much as possible.

One way to reduce sorting and grouping is to use indexes. By choosing an appropriate index, we can avoid MySQL to perform sorting and grouping operations, thereby improving query efficiency. In addition, we can also use ORDER BY and GROUP BY clauses to clarify the order of sorting and grouping, avoiding additional operations by MySQL.

  1. Avoid implicit type conversions

In the EXPLAIN output, you can see the type field and the key field. If Using where appears in these fields, it means that the query needs to use the WHERE clause for filtering. When performing WHERE filtering, MySQL may perform implicit type conversion on query conditions, resulting in performance degradation.

In order to avoid implicit type conversion, we should use the same value as the data type in the query condition. For example, if the data type of a column is integer, then we should use integer values for queries, not string values or floating point values.

  1. Reduce the number of queries

In the EXPLAIN output, you can see the rows field and the Extra field. If Using index appears in these fields, it means that the query can directly return the result through the index without going back to the data table for query. In this case, the number of queries will be reduced, thereby improving query efficiency.

In order to reduce the number of queries, we should use indexes as much as possible, and avoid using complex query statements such as subqueries and joint queries in queries. In addition, we can also use caching technology to reduce the number of queries, such as using memory caching tools such as Memcached.

Summary

The EXPLAIN command can help us understand the query execution process of MySQL, so as to optimize it. By choosing appropriate indexes, reducing data access, reducing sorting and grouping, avoiding implicit type conversion, and reducing the number of queries, we can improve query efficiency and optimize database performance.

When using the EXPLAIN command, the fields we need to pay attention to include id, select_type, table, type , possible_keys, key, key_len, ref, rows, Extra etc. By analyzing these fields, we can understand the query execution plan, data access method, index usage, data filtering, sorting and grouping operations, query times and other information, so as to optimize.

It should be noted that optimizing queries is not a one-time job, but an ongoing process. In practical applications, we need to continuously analyze and optimize queries to improve the performance and reliability of the database. At the same time, we also need to pay attention to the standardization and rationalization of data to avoid problems such as data redundancy and inconsistency, and fundamentally improve the performance and reliability of the database.

Friends who have not followed can pay attention to a wave, there will be more exciting content later…