Using database optimization and performance analysis tools
Article directory

- Using database optimization and performance analysis tools
  - 1. Steps for database server optimization
  - 2. Query system performance parameters
  - 3. Locating slow SQL: the slow query log
  - 4. Viewing SQL execution costs: show profile
  - 5. Query analysis tool: explain
    - 5.1 id
    - 5.2 type
  - 6. Further uses of explain
    - 6.1 explain supports four output formats
    - 6.2 show warnings
  - 7. Analyzing the optimizer execution plan: trace
  - 8. MySQL monitoring and analysis views: sys_schema
    - 8.1 sys schema view summary
    - 8.2 sys schema usage scenarios
      - Index status
      - Table related
      - Statement related
      - IO related
      - InnoDB related
1. Steps for database server optimization
- Observe whether the server load fluctuates periodically, for example during recurring events such as Taobao's Double 11. If so, consider adding a cache or adjusting the cache invalidation policy.
- Enable the slow query log and use SHOW PROFILE and EXPLAIN to determine whether the SQL spends its time waiting or executing.
  - If SHOW PROFILE shows a long waiting time, tune the server parameters.
  - If SHOW PROFILE shows a long execution time, analyze the statement with EXPLAIN and consider optimizing from the following three aspects:
    - Index design
    - Queries that JOIN too many tables
    - Table design
- Consider whether the SQL query has reached its bottleneck. If not, recheck the SQL or repeat the analysis above; if it has, consider the following two steps:
  - Read/write splitting (master-slave architecture)
  - Sharding (vertical database splitting, vertical table splitting, horizontal table splitting)
Database tuning dimensions:
- Indexes fail to take effect or are not fully used – index creation
- Too many JOINs in related queries (design defects or unavoidable requirements) – SQL optimization
- Server tuning and parameter settings (buffers, thread counts, etc.) – adjust my.cnf
- Too much data – database and table sharding
2. Query system performance parameters
In MySQL, you can use the `show status` statement to query the database server's performance parameters and execution frequencies.

```sql
show [global|session] status like 'parameter';
```
- Connections: Number of connections to the MySQL server
- Uptime: The server’s online time, in seconds
- Slow_queries: Number of slow queries
- Innodb_rows_read/Innodb_rows_inserted/Innodb_rows_updated/Innodb_rows_deleted: Number of rows read, inserted, updated, or deleted by InnoDB
- Com_select/Com_insert/Com_update/Com_delete: Number of times the corresponding statement has been executed
- Last_query_cost: The cost of the last query, expressed as the number of data pages that needed to be read
- After executing a SQL statement, you can obtain its cost by checking the value of the `last_query_cost` variable in the current session. This is a common indicator for evaluating query execution efficiency; the cost corresponds to the number of pages the SQL statement needs to read.
- Usage scenario: very useful for comparing costs, especially when there are several query approaches to choose from.
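As a minimal illustration (the `orders` table and its query are hypothetical), a query's cost can be read back right after it runs:

```sql
-- Run any query first (orders is a hypothetical table)
SELECT * FROM orders WHERE id = 10;
-- Then read back its cost: the number of pages the query had to read
SHOW STATUS LIKE 'last_query_cost';
```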
SQL querying is a dynamic process. From the perspective of page loading, we can draw two conclusions:
- Location determines efficiency. If a page already exists in memory, reading it is much more efficient than reading it from disk.
- Batching determines efficiency. Reading pages in batches with sequential reads greatly improves the average efficiency of reading a single page.

So we should first consider where the data is stored: frequently used data should be kept in the buffer pool as much as possible. Second, we can make full use of disk throughput by reading data in batches, which also improves the effective efficiency of reading a single page.
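Following the buffer pool point above, a quick way to check how much memory is currently allocated to it (a server-level setting; the value varies by installation):

```sql
show variables like 'innodb_buffer_pool_size'; -- size of the InnoDB buffer pool, in bytes
```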
3. Locating slow SQL: the slow query log

The slow query log records statements whose response time exceeds a threshold: any SQL whose running time exceeds the value of `long_query_time` is written to the slow query log. The default value is 10 (seconds).
By default, MySQL does not enable the slow query log. Unless it is needed for tuning, enabling it is generally not recommended, since it has a certain performance impact.
```sql
show [global|session] variables like 'parameter';
```
Permanent setting method: modify the my.cnf file, add or change the parameters long_query_time, slow_query_log, and slow_query_log_file under [mysqld], then restart MySQL.

```ini
[mysqld]
slow_query_log=ON                                      # slow query log switch
slow_query_log_file=/var/lib/mysql/slow_query_log.log  # slow query log location
long_query_time=3                                      # slow query threshold, in seconds
log_output=FILE                                        # log destination; log_output='TABLE' writes entries to the mysql.slow_log table instead
```
If you do not specify a storage path, it will be stored in the MySQL data folder by default; if you do not specify a file name, it will default to hostname-slow.log.
Slow log analysis tool: mysqldumpslow. You can use it to view slow SQL entries sorted by query time, average query time, and so on.
```shell
mysqldumpslow --help
```
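As a sketch of common usage (the log path follows the my.cnf example above and is an assumption; yours may differ), you can sort entries by query time and show only the top few:

```shell
# Sort by query time (-s t) and show the top 10 entries (-t 10)
mysqldumpslow -s t -t 10 /var/lib/mysql/slow_query_log.log
```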
To delete the slow query log, simply remove the file manually, then execute the following command to regenerate it.

```shell
mysqladmin -uroot -p flush-logs slow
```
4. View SQL execution costs: show profile
Show Profile is a tool provided by MySQL for analyzing what the SQL statements in the current session did and the resources their execution consumed. It can be used for SQL tuning measurements. It is off by default and saves the results of the last 15 runs.

When using it, it is recommended to log in to mysql from a command window and run the commands there; running them through GUI tools introduces a lot of interference.
```sql
show variables like 'profiling';        -- view the switch
set profiling = 'ON';                   -- set the switch, ON|OFF
show profiles;                          -- view the last 15 runs
show profile;                           -- view the per-stage execution time of the last SQL statement;
                                        -- the information_schema.profiling table holds the same 15 runs
show profile for query 82;              -- view the query with query_id 82
show profile cpu,block io for query 1;  -- view cpu and block io at the same time
```
executing: the SQL execution time. If this stage is too long, use the explain tool to analyze and optimize the SQL; otherwise look at the other stages and consider optimizing the database parameter configuration.
Common parameters for show profile:
- all: Display all overhead information
- block io: Display block IO overhead
- context switches: Display context switching overhead
- cpu: Display CPU overhead information
- ipc: Display send and receive overhead information
- memory: Display memory overhead information
- page faults: Display page fault overhead information
- source: Display overhead related to Source_function, Source_file, Source_line
- swaps: Display swap count overhead information
Things to note in daily development:
- converting HEAP to MyISAM: the query result is too large to fit in memory, so the data is being moved to disk
- Creating tmp table: a temporary table is created; data is copied into it, and the table is dropped after use
- Copying to tmp table on disk: an in-memory temporary table is being copied to disk, be careful!
- locked

If any of these four states appears in the show profile results, the SQL statement needs to be optimized.
Note: the show profile command is deprecated and will be removed. The same information can be read from the information_schema.profiling table (session level), which likewise keeps only the last 15 SQL executions.
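As a minimal sketch, the per-stage timings of a given query can be read from that table directly (the query_id value is whatever show profiles reported; 1 here is just an example):

```sql
SELECT query_id, seq, state, duration
FROM information_schema.profiling
WHERE query_id = 1
ORDER BY seq;
```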
5. Query analysis tool: explain
MySQL provides the `explain` statement to view the specific execution plan of a SQL statement. Note: some of the statistics are estimates and are not precise.

```sql
explain select 1;
```
| Column name | Description |
|---|---|
| id | The unique id of each select |
| select_type | The query type of the corresponding select keyword |
| table | Table name; each table produces one row, and temporary tables may appear |
| partitions | Matching partition information |
| type | Access method for a single table |
| possible_keys | Indexes that might be used |
| key | Index actually used |
| key_len | Length, in bytes, of the index actually used; mainly meaningful for composite indexes |
| ref | For equality lookups on an index column, the object matched against the index column |
| rows | Estimated number of records to read; the smaller, the better |
| filtered | Percentage of a table's records remaining after filtering by the search conditions. For the driving table in a join, this value determines how many times the driven table is accessed (rows * filtered) |
| Extra | Additional information describing more precisely how MySQL executes the statement; mainly watch out for Using filesort and Using temporary |
5.1 id
The unique id of each select. When multiple tables are involved there will be multiple rows, for example:

```sql
select * from s1 join s2; -- two plan rows, one per table, but only one select, so both ids are 1
```

If the statement contains multiple selects but the number of distinct ids in the plan does not match, the optimizer has rewritten the SQL.

- Rows with the same id form a group and are executed from top to bottom.
- Across groups, a larger id means higher priority and earlier execution.
- Key point: each id represents an independent query; the fewer queries a SQL statement needs, the better.
5.2 type
A row in the execution plan shows the access method MySQL uses when executing the query against a certain table, also known as the "access type". The type column reports this access method and is a fairly important indicator. For example, if the type column shows `ref`, MySQL will use the `ref` access method to execute the query against the `s1` table.

The complete list of access methods: `system`, `const`, `eq_ref`, `ref`, `fulltext`, `ref_or_null`, `index_merge`, `unique_subquery`, `index_subquery`, `range`, `index`, `ALL`.
system: when the table has only one record and the table's storage engine keeps exact statistics, such as MyISAM and Memory.

```sql
select count(*) from f1; -- f1 uses the MyISAM engine, which stores the record count in the table, so the type is system
```
const: when a primary key or unique secondary index column is compared for equality with a constant, the single-table access method is const.

```sql
select * from f1 where a = 1; -- a is the primary key or a unique index
```
eq_ref: in a join query, if the driven table is accessed via equality matching on its primary key or a unique secondary index column (if the primary key or unique secondary index is composite, all index columns must be compared for equality), the access method for the driven table is eq_ref.

```sql
select * from f1 join f2 on f1.id = f2.id; -- f1 is scanned in full; for each row of the driving table f1, the matching row in the driven table f2 is found via its primary key or unique index
```
ref: when a table is queried via equality matching between an ordinary secondary index column and a constant, the access method may be ref.

```sql
select * from f1 where b = 'abc'; -- b is an ordinary index
```
ref_or_null: when an equality match is performed on an ordinary secondary index and the index column's value may also be NULL, the access method may be ref_or_null.

```sql
select * from f1 where b = 'abc' or b is null; -- ref, plus the possibility of NULL
```
index_merge: when accessing a single table, in certain scenarios queries can be executed using the Union, Intersection, or Sort-Union index merge methods.

```sql
select * from f1 where a = 'a' or b = 'b'; -- a and b are both indexed, so both indexes are used
```
unique_subquery: for some query statements containing an IN subquery, if the optimizer decides to convert the IN subquery into an EXISTS subquery, and the subquery can use the primary key for equality matching, then the type column of the subquery's execution plan is unique_subquery.

```sql
select * from f1 where b in (select id from f2 where f1.a = f2.a) or c = 'abc'; -- id is the primary key of f2, similar to eq_ref
```
range: when an index is used to fetch records within certain range intervals, the range access method may be used.

```sql
select * from f1 where b in ('a', 'b', 'c');
select * from f1 where b > 'a' and b < 'c';
```
index: when a covering index can be used but all index records need to be scanned, the access method may be index.

```sql
select b from f1 where c = 'x'; -- f1 has a composite index on (a, b, c), so the query does not need to go back to the table
```
In summary, the resulting values from best to worst are:

system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

The goal of SQL performance optimization: reach at least the `range` level; the `ref` level is required; the `const` level is best (Alibaba Development Manual).
6. Further uses of explain

6.1 explain supports four output formats

Traditional, JSON, TREE, and a visual interface. The JSON format prints more cost information than the traditional format; the visual interface requires MySQL Workbench to view the execution plan.

```sql
explain format=json select * from f1;
```
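The TREE format (available from MySQL 8.0.16) can be requested the same way; `f1` is the hypothetical table used in the earlier examples:

```sql
explain format=tree select * from f1;
```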
6.2 show warnings
After running explain, you can run show warnings to view extended information about the query plan of the corresponding query. The output columns are `Level`, `Code`, and `Message`.

If the `Code` value is 1003, the SQL has been rewritten; the `Message` column of show warnings shows the optimized statement that will actually be executed.
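A minimal sketch (the tables s1 and s2 follow the earlier examples and are assumptions): explain a query the optimizer is likely to rewrite, then inspect the rewritten form:

```sql
explain select * from s1 where id in (select id from s2);
-- the Message column shows the statement after optimizer rewriting (e.g. as a semi-join)
show warnings\G
```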
7. Analyzing the optimizer execution plan: trace

OPTIMIZER_TRACE is a tracing feature introduced in MySQL 5.6. It can track the decisions the optimizer makes (such as table access methods, cost calculations, and transformations) and records the results in the INFORMATION_SCHEMA.OPTIMIZER_TRACE table.

The feature is off by default. To use it, enable the trace, set the format to JSON, and raise the maximum memory the trace can use, so the output is not truncated by the small default during parsing. The following is just a demonstration, not a detailed analysis.

```sql
set optimizer_trace="enabled=on",end_markers_in_json=on;
set optimizer_trace_max_mem_size=1000000;
```
```
mysql> select * from INFORMATION_SCHEMA.OPTIMIZER_TRACE\G
*************************** 1. row ***************************
/* Part 1: the SQL statement */
QUERY: select * from niuke.user_profile
/* Part 2: the trace for the statement in the QUERY field */
TRACE: {
  "steps": [
    {
      /* preparation */
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `niuke`.`user_profile`.`id` AS `id`,`niuke`.`user_profile`.`device_id` AS `device_id`,`niuke`.`user_profile`.`gender` AS `gender`,`niuke`.`user_profile`.`age` AS `age`,`niuke`.`user_profile`.`university` AS `university`,`niuke`.`user_profile`.`gpa` AS `gpa`,`niuke`.`user_profile`.`active_days_within_30` AS `active_days_within_30`,`niuke`.`user_profile`.`question_cnt` AS `question_cnt`,`niuke`.`user_profile`.`answer_cnt` AS `answer_cnt` from `niuke`.`user_profile`"
          }
        ] /* steps */
      } /* join_preparation */
    },
    {
      /* optimization */
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "table_dependencies": [
              {
                "table": "`niuke`.`user_profile`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
              }
            ] /* table_dependencies */
          },
          {
            "rows_estimation": [
              {
                "table": "`niuke`.`user_profile`",
                "table_scan": {
                  "rows": 7,
                  "cost": 0.25
                } /* table_scan */
              }
            ] /* rows_estimation */
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ] /* plan_prefix */,
                "table": "`niuke`.`user_profile`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan": 7,
                      "access_type": "scan",
                      "resulting_rows": 7,
                      "cost": 0.95,
                      "chosen": true
                    }
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "condition_filtering_pct": 100,
                "rows_for_plan": 7,
                "cost_for_plan": 0.95,
                "chosen": true
              }
            ] /* considered_execution_plans */
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": null,
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                {
                  "table": "`niuke`.`user_profile`",
                  "attached": null
                }
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
          },
          {
            "finalizing_table_conditions": [
            ] /* finalizing_table_conditions */
          },
          {
            "refine_plan": [
              {
                "table": "`niuke`.`user_profile`"
              }
            ] /* refine_plan */
          }
        ] /* steps */
      } /* join_optimization */
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
        ] /* steps */
      } /* join_execution */
    }
  ] /* steps */
}
/* Part 3: number of bytes truncated when the trace exceeds the memory limit */
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0
/* Part 4: whether the user executing the trace statement lacks the privilege to view the traced object;
   if privileges are missing, the value is 1 and the TRACE field is empty, which usually happens
   when a stored procedure is called */
INSUFFICIENT_PRIVILEGES: 0
1 row in set (0.00 sec)
```
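When you are done, it is good practice to switch the trace off again so it does not keep consuming resources:

```sql
set optimizer_trace="enabled=off";
```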
8. MySQL monitoring and analysis views: sys_schema
For MySQL performance monitoring and problem diagnosis, we generally get the data we want from `performance_schema`. MySQL 5.7.7 added `sys_schema`, which summarizes the data in `performance_schema` and `information_schema` into easier-to-understand views. The purpose is to reduce the complexity of querying performance_schema and let DBAs locate problems quickly. Knowing which monitoring tables and views exist will make our development, operations, and maintenance work twice as effective with half the effort.
8.1 sys schema view summary
- Host related: views starting with host_summary, mainly summarizing IO delay information
- InnoDB related: views starting with innodb, summarizing InnoDB buffer information and transactions waiting on InnoDB locks
- I/O related: views starting with io, summarizing I/O waits and I/O usage
- Memory usage: views starting with memory, showing memory usage by host, thread, event, etc.
- Connection and session information: processlist and session related views, summarizing session information
- Table related: views starting with schema_table, showing table statistics
- Index information: index usage statistics, including redundant and unused indexes
- Statement related: views starting with statement, covering statements that perform full table scans, use temporary tables, sort, etc.
- User related: views starting with user, counting file I/O and statement executions per user
- Wait event related: views starting with wait, showing the latency of wait events
8.2 sys schema usage scenarios
Index status
```sql
-- 1. Query redundant indexes
select * from sys.schema_redundant_indexes;
-- 2. Query unused indexes
select * from sys.schema_unused_indexes;
-- 3. Query index usage
select * from sys.schema_index_statistics where table_schema = 'dbname';
```
Table related
```sql
-- 1. Query how often each table is accessed
select table_schema, table_name, sum(io_read_requests + io_write_requests) as io
from sys.schema_table_statistics
group by table_schema, table_name
order by io desc;
-- 2. Query the tables occupying the most buffer pool
select object_schema, object_name, allocated, data
from sys.innodb_buffer_stats_by_table
order by allocated desc limit 10;
-- 3. Query full table scans per table
select * from sys.statements_with_full_table_scans where db = 'dbname';
```
Statement related
```sql
-- 1. Monitor SQL execution frequency
select db, exec_count, query from sys.statement_analysis order by exec_count desc;
-- 2. Monitor SQL that uses sorting
select db, exec_count, first_seen, last_seen, query from sys.statements_with_sorting limit 1;
-- 3. Monitor SQL that uses temporary tables or on-disk temporary tables
select db, exec_count, tmp_tables, tmp_disk_tables, query
from sys.statement_analysis
where tmp_tables > 0 or tmp_disk_tables > 0
order by (tmp_tables + tmp_disk_tables) desc;
```
IO related
```sql
-- 1. View the files consuming the most disk IO
select file, avg_read, avg_write, avg_read + avg_write as avg_io
from sys.io_global_by_file_by_bytes
order by avg_io desc limit 10;
```
InnoDB related
```sql
-- 1. Row lock waits and blocking
select * from sys.innodb_lock_waits;
```
Risk warning: when querying through the sys library, MySQL consumes a lot of resources to collect the relevant information; in severe cases, business requests may be blocked and cause failures. In production, it is recommended not to query sys, performance_schema, or information_schema frequently for monitoring and inspection.