[MySQL Advanced] SQL Optimization

5. SQL optimization

5.1 Insert data in bulk

Environment preparation:

CREATE TABLE `tb_user_2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `username` varchar(45) NOT NULL,
  `password` varchar(96) NOT NULL,
  `name` varchar(45) NOT NULL,
  `birthday` datetime DEFAULT NULL,
  `sex` char(1) DEFAULT NULL,
  `email` varchar(45) DEFAULT NULL,
  `phone` varchar(45) DEFAULT NULL,
  `qq` varchar(32) DEFAULT NULL,
  `status` varchar(32) NOT NULL COMMENT 'User status',
  `create_time` datetime NOT NULL,
  `update_time` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `unique_user_username` (`username`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

When using the load command to import data, appropriate settings can improve the efficiency of the import.

img

load data local infile 'E:/sql1.log' into table tb_user_2 fields terminated by ',' lines terminated by '\
';

load data local infile 'E:/sql2.log' into table tb_user_2 fields terminated by ',' lines terminated by '\
';

For InnoDB type tables, there are several ways to improve the efficiency of import:

1) Primary key sequential insertion

Because InnoDB type tables are saved in the order of primary keys, arranging the imported data in the order of primary keys can effectively improve the efficiency of imported data. If the InnoDB table does not have a primary key, the system will automatically create an internal column as the primary key by default. Therefore, if you can create a primary key for the table, you can use this to improve the efficiency of importing data.

Script file introduction:
    sql1.log ----> primary key in order
    sql2.log ----> primary key out of order

Insert the data in ID order:

img

Insert ID unordered data:

img

Attention appears: [LOAD DATA syntax to upload data]

For specific solutions, please see this note [Problems in Mysql]

2) Turn off uniqueness verification

Execute SET UNIQUE_CHECKS=0 before importing data to turn off uniqueness verification. Execute SET UNIQUE_CHECKS=1 after the import is completed to restore uniqueness verification, which can improve the efficiency of import.

img

3) Submit the transaction manually

If the application uses automatic submission, it is recommended to execute SET AUTOCOMMIT=0 before importing to turn off automatic submission. After the import is completed, execute SET AUTOCOMMIT=1 to turn on automatic submission, which can also improve the efficiency of importing.

img

5.2 Optimize insert statement

When performing data insert operations, you can consider the following optimization solutions.

  • If you need to insert many rows of data into a table at the same time, you should try to use insert statements in multiple value tables. This method will greatly reduce the consumption of connections, closing, etc. between the client and the database. This makes the efficiency faster than a single insert statement executed separately. Example, the original method is: insert into tb_test values(1,’Tom’); insert into tb_test values(2,’Cat’); insert into tb_test values(3,’Jerry’); The optimized solution is: insert into tb_test values(1,’Tom’),(2,’Cat’),(3,’Jerry’);
  • Data insertion occurs within a transaction. start transaction; insert into tb_test values(1,Tom’); insert into tb_test values(2,Cat’); insert into tb_test values(3,Jerry’); commit;
  • Data is inserted in order insert into tb_test values(4,’Tim’); insert into tb_test values(1,’Tom’); insert into tb_test values(3,’Jerry’); insert into tb_test values(5,’Rose’ ); insert into tb_test values(2,’Cat’); After optimization insert into tb_test values(1,’Tom’); insert into tb_test values(2,’Cat’); insert into tb_test values(3,’Jerry’ ); insert into tb_test values(4,’Tim’); insert into tb_test values(5,’Rose’);

5.3 Optimize order by statement

5.3.1 Environment preparation

CREATE TABLE `emp2` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(100) NOT NULL,
  `age` int(3) NOT NULL,
  `salary` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

insert into `emp2` (`id`, `name`, `age`, `salary`) values('1','Tom','25','2300');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('2','Jerry','30','3500');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('3','Luci','25','2800');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('4','Jay','36','3500');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('5','Tom2','21','2200');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('6','Jerry2','31','3300');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('7','Luci2','26','2700');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('8','Jay2','33','3500');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('9','Tom3','23','2400');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('10','Jerry3','32','3100');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('11','Luci3','26','2900');
insert into `emp2` (`id`, `name`, `age`, `salary`) values('12','Jay3','37','4500');

create index idx_emp_age_salary on emp2(age,salary);

5.3.2 Two sorting methods

1). The first is to sort the returned data, which is commonly known as filesort sorting. All sorting that does not directly return sorting results through indexes is called FileSort sorting.

img

2). The second method directly returns ordered data through ordered index sequential scanning. In this case, using index does not require additional sorting and the operation efficiency is high.

img

Sort by multiple fields

img

After understanding MySQL’s sorting method, the optimization goal becomes clear: minimize additional sorting and return ordered data directly through the index. The where condition and Order by use the same index, and the order of Order By is the same as the index order, and the fields of Order by are all in ascending or descending order. Otherwise, additional operations will definitely be required, so FileSort will appear.

5.3.3 Optimization of Filesort

By creating appropriate indexes, the occurrence of Filesort can be reduced. However, in some cases, conditional restrictions cannot make Filesort disappear, so it is necessary to speed up the sorting operation of Filesort. For Filesort, MySQL has two sorting algorithms:

1) Double scan algorithm: Before MySQL4.1, this method was used for sorting. First, the sorting field and row pointer information are retrieved according to the conditions, and then sorted in the sorting area sort buffer. If the sort buffer is not enough, the sorting results are stored in the temporary table temporary table. After sorting is completed, records are read back from the table based on the row pointer. This operation may cause a large number of random I/O operations.

2) One-time scan algorithm: take out all the fields that meet the conditions at one time, then sort in the sort buffer of the sort area and directly output the result set. The memory overhead during sorting is relatively large, but the sorting efficiency is higher than the two-scan algorithm.

MySQL determines whether it is a sorting algorithm by comparing the size of the system variable max_length_for_sort_data with the total size of the fields retrieved by the Query statement. If max_length_for_sort_data is larger, then the second optimized algorithm is used; otherwise, the first one is used.

You can appropriately increase the sort_buffer_size and max_length_for_sort_data system variables to increase the size of the sorting area and improve sorting efficiency.

img

5.4 Optimize group by statement

Since GROUP BY actually also performs sorting operations, and compared with ORDER BY, GROUP BY mainly only adds grouping operations after sorting. Of course, if some other aggregate functions are used during grouping, then some calculations of the aggregate functions are also required. Therefore, in the implementation process of GROUP BY, indexes can also be used like ORDER BY.

If the query contains group by but the user wants to avoid the consumption of sorted results, they can perform order by null to disable sorting. as follows :

drop index idx_emp_age_salary on emp2;

explain select age,count(*) from emp2 group by age;

img

Optimized

explain select age,count(*) from emp group by age order by null;

img

As can be seen from the above example, the first SQL statement requires “filesort”, while the second SQL statement does not require “filesort” due to order by null. As mentioned above, Filesort is often very time-consuming.

Create index:

create index idx_emp_age_salary on emp(age,salary);

img

5.5 Optimizing nested queries

After Mysql4.1 version, SQL subqueries are supported. This technique allows you to use a SELECT statement to create a single column of query results, and then use this result as a filter condition in another query. Using subqueries can complete many SQL operations that logically require multiple steps to complete at one time. It can also avoid transaction or table locks, and it is also easy to write. However, in some cases, subqueries can be replaced by more efficient joins (JOIN).

Example, find all user information with roles:

 explain select * from t_user where id in (select user_id from user_role);

The execution plan is:

img

Optimized :

explain select * from t_user u , user_role ur where u.id = ur.user_id;

img

The reason why join query is more efficient is that MySQL does not need to create a temporary table in memory to complete this logically two-step query.

5.6 Optimize OR conditions

For query clauses containing OR, if you want to use an index, each condition column between OR must use an index, and compound indexes cannot be used; if there is no index, you should consider adding an index.

Get all indexes in the emp table:

img

Example:

explain select * from emp where id = 1 or age = 30;

img

It is recommended to use union instead of or :

img

Let’s compare the important indicators and find that the main difference is type and ref.

type displays the access type, which is a more important indicator. The result values from best to worst are:

system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL

The type value of the UNION statement is ref, and the type value of the OR statement is range. You can see that this is a very obvious gap.

The ref value of the UNION statement is const, and the type value of the OR statement is null. Const represents a constant value reference, which is very fast.

The difference between these two items shows that UNION is better than OR.

5.7 Optimize paging queries

In general, when performing paging queries, the performance can be better improved by creating a covering index. A common and very troublesome problem is limit 2000000,10. At this time, MySQL needs to sort the first 2000010 records, and only returns 2000000 – 2000010 records. The other records are discarded, and the cost of query sorting is very high.

img

5.7.1 Optimization idea one

Complete the sorting and paging operation on the index, and finally associate the other column contents required by the original table query according to the primary key.

img

5.7.2 Optimization idea two

This solution is suitable for tables with auto-incrementing primary keys, and can convert Limit queries into queries at a certain location.

img

5.8 Using SQL prompts

SQL prompts are an important means of optimizing the database. Simply put, they add some artificial prompts to SQL statements to achieve the purpose of optimizing operations.

5.8.1 USE INDEX

After the table name in the query statement, add use index to provide a list of indexes that you want MySQL to refer to, so that MySQL will no longer consider other available indexes.

create index idx_seller_name on tb_seller(name);

img

5.8.2 IGNORE INDEX

If the user simply wants MySQL to ignore one or more indexes, they can use ignore index as a hint.

 explain select * from tb_seller ignore index(idx_seller_name) where name = 'Xiaomi Technology';

img

5.8.3 FORCE INDEX

To force MySQL to use a specific index, use force index as a hint in the query.

create index idx_seller_address on tb_seller(address);

img

er ignore index(idx_seller_name) where name = Xiaomi Technology’;

 

[External link pictures are being transferred...(img-TBZyVwnL-1699837195905)]

5.8.3 FORCE INDEX

To force MySQL to use a specific index, use force index as a hint in the query.

```javascript
create index idx_seller_address on tb_seller(address);

[External link pictures are being transferred…(img-oa5aKzAZ-1699837195905)]

Next, I will divide a study plan for each student!

Study plan

So the question comes again, as a newbie, what should I learn first and what should I learn next?
Since you have asked so straightforwardly, I will tell you what you should start learning from scratch:

Phase 1: Junior Network Security Engineer

Next, I will arrange a one-month basic network security plan for you. After you finish the course, you can basically work in a network security-related job, such as penetration testing, Web penetration, security services, security analysis, etc. ;Among them, if you learn the class guarantee module well, you can also work as a class guarantee engineer.

Comprehensive salary range 6k~15k

1. Network security theoretical knowledge (2 days)
① Understand the relevant background and prospects of the industry and determine the development direction.
②Learn laws and regulations related to network security.
③The concept of network security operations.
④Introduction to MLPS, regulations, processes and specifications for MLPS. (Very important)

2. Penetration testing basics (1 week)
①Penetration testing process, classification and standards
②Information collection technology: active/passive information collection, Nmap tool, Google Hacking
③Vulnerability scanning, vulnerability exploitation, principles, utilization methods, tools (MSF), bypassing IDS and anti-virus reconnaissance
④Host attack and defense drills: MS17-010, MS08-067, MS10-046, MS12-20, etc.

3. Operating system basics (1 week)
①Common functions and commands of Windows system
②Common functions and commands of Kali Linux system
③Operating system security (system intrusion investigation/system reinforcement basis)

4. Computer network basics (1 week)
①Computer network basics, protocols and architecture
②Network communication principles, OSI model, data forwarding process
③Common protocol analysis (HTTP, TCP/IP, ARP, etc.)
④Network attack technology and network security defense technology
⑤Web vulnerability principles and defense: active/passive attacks, DDOS attacks, CVE vulnerability recurrence

5. Basic database operations (2 days)
①Database basics
②SQL language basics
③Database security reinforcement

6. Web penetration (1 week)
①Introduction to HTML, CSS and JavaScript
②OWASP Top10
③Web vulnerability scanning tool
④Web penetration tools: Nmap, BurpSuite, SQLMap, others (Chopper, Miss Scan, etc.)

So, it has taken about a month so far. You have successfully become a “script kiddie”. So do you still want to continue exploring?

Stage 2: Intermediate or senior network security engineer (depending on your ability)

Comprehensive salary range 15k~30k

7. Script programming learning (4 weeks)
in the field of cybersecurity. The ability to program is the essential difference between a “script kiddie” and a true network security engineer. In the actual penetration testing process, in the face of complex and changeable network environments, when commonly used tools cannot meet actual needs, it is often necessary to expand existing tools, or write tools and automated scripts that meet our requirements. At this time, Requires certain programming skills. In the CTF competition, where every second counts, if you want to effectively use homemade script tools to achieve various purposes, you need to have programming skills.

For students who are starting from scratch, I suggest you choose one of the scripting languages Python/PHP/Go/Java and learn to program common libraries.
Set up a development environment and choose an IDE. Wamp and XAMPP are recommended for PHP environments, and Sublime is highly recommended for IDEs;

Learn Python programming. The learning content includes: grammar, regularity, files, networks, multi-threading and other common libraries. We recommend “Python Core Programming”. There is no need to read it all.

Use Python to write exploits for vulnerabilities, and then write a simple web crawler

Learn basic PHP syntax and write a simple blog system

Be familiar with the MVC architecture and try to learn a PHP framework or Python framework (optional)

Understand Bootstrap layout or CSS.

Phase 3: Top Network Security Engineer

If you are interested in getting started with network security, you can click here if you need it Big benefits of network security: Getting started & advanced full set of 282G learning resource package to share for free!

Sharing learning materials

Of course, giving only plans but not learning materials is tantamount to being a hooligan. Here is a [282G] learning material package for network security engineers from entry to proficiency. You can click on the two below Get the QR code link.