How to optimize queries and tables in MySQL and MariaDB – improve query efficiency

Foreword

MySQL and MariaDB are popular choices for database management systems. Both use the SQL query language to enter and query data.

Although SQL queries are simple commands to learn, not all queries and database functions are equally efficient. As the amount of information you store grows, if your database supports a website, this becomes more and more important as the popularity of the website increases.

In this guide, we will discuss some simple ways to improve the speed of MySQL and MariaDB queries. We assume you have already installed MySQL or MariaDB using our guide, appropriate for your operating system.

Table design overview

One of the most basic ways to improve query speed starts with the table structure design itself. This means that before you even start using the software, you need to start thinking about the best way to organize your data.

Here are some questions you should ask yourself:

What is the main purpose of tables?

Predicting how the data in a table will be used often determines the best way to structure the data.

If certain data needs to be updated frequently, it’s usually best to put them in separate tables. Failure to do this results in the query cache (an internal cache maintained in the software) being dumped and rebuilt again and again as it discovers new information. If this happens in a separate table, other columns can continue to take advantage of the cache.

Update operations are generally faster on smaller tables, while in-depth analysis of complex data is often a task best left to large tables because of the overhead of join operations.

What data type is required?

Sometimes, if you can put some limits on the data size up front, it can save you a lot of time in the long run.

For example, if the value of a field is string and the number of valid entries is limited, you can use the enum type instead of the varchar type. This data type is compact and therefore fast to query.

For example, if there are only a few different types of users, you can set admin, moderator, poweruser, user in the column that handles enums.

Which columns do you want to query?

Knowing ahead of time which fields will be queried repeatedly can greatly improve speed.

Indexing the columns you want to use for searching goes a long way. You can add an index when creating a table using the following syntax:

CREATE TABLE example_table (id INTEGER NOT NULL AUTO_INCREMENT, name VARCHAR(50), address VARCHAR(150)
, username VARCHAR(16), PRIMARY KEY (id), INDEX (username) );

This is useful if we know that the user will be searching for information based on their username. This will create a table with these properties:

explain example_table;
 + ---------- + -------------- + ------ + ------ + ---------- ------- + ------- +
| Field | Type | Null | Key | Default | Extra |
 + ---------- + -------------- + ------ + ------ + ---------- ------- + ------- +
| id | int(11) | NO | PRI | NULL | auto_increment | |
| name | varchar(50) | YES | | NULL | |
| address | varchar(150) | YES | | NULL | |
| username | varchar(16) | YES | MUL | NULL | |
 + ---------- + -------------- + ------ + ------ + ---------- ------- + ------- +
4 rows in set (0.00 sec)

As you can see, our table has two indexes. The first is the primary key, in this case the id field. The second one is the index we added for the username field. This will improve queries that utilize this field.

While from a conceptual perspective it is useful to consider which fields should be indexed during creation, it is also simple to add indexes to existing tables. You can add one like this:

CREATE INDEX index_name ON table_name(column_name);

Another way to accomplish the same thing is this:

ALTER TABLE table_name ADD INDEX ( column_name );

Use Explain to find points to index in a query

If your program queries in a predictable manner, you should analyze your queries to ensure that they use indexes whenever possible. This is easily done using the explain function.

We’ll import a MySQL sample database to see how some of this works:

wget https://launchpad.net/test-db/employees-db-1/1.0.6/ + download/employees_db-full-1.0.6.tar.bz2
tar xjvf employees_db-full-1.0.6.tar.bz2
cd employees_db
mysql -u root -p -t < employees.sql

Now we can log in to MySQL in order to run some queries:

mysql -u root -p
use employees;

First, we need to specify that MySQL should not use its cache so we can accurately determine how long these tasks take to complete:

SET GLOBAL query_cache_size = 0;
SHOW VARIABLES LIKE "query_cache_size";

 + ------------------ + ------- +
| Variable_name | Value |
 + ------------------ + ------- +
| query_cache_size | 0 |
 + ------------------ + ------- +
1 row in set (0.00 sec)

Now we can run a simple query on a large dataset:

SELECT COUNT(*) FROM salaries WHERE salary BETWEEN 60000 AND 70000;
 + ---------- +
| count(*) |
 + ---------- +
| 588322 |
 + ---------- +
1 row in set (0.60 sec)

To see how MySQL executes a query, you can add the explain keyword directly before the query:

EXPLAIN SELECT COUNT(*) FROM salaries WHERE salary BETWEEN 60000 AND 70000;
 + ---- + ------------- + ---------- + ------ + ---------- ------ + ------ + --------- + ------ + --------- + --------- ---- +
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
 + ---- + ------------- + ---------- + ------ + ---------- --- + ------ + --------- + ------ + --------- + ---------- - +
| 1 | SIMPLE | salaries | ALL | NULL | NULL | NULL | NULL | 2844738 | Using where |
 + ---- + ------------- + ---------- + ------ + ---------- --- + ------ + --------- + ------ + --------- + ---------- - +
1 row in set (0.00 sec)

If you look at the key field, you’ll see that its value is NULL. This means that no index is used by this query.

Let’s add one and run the query again to see if it speeds things up:

ALTER TABLE salaries ADD INDEX ( salary );
SELECT COUNT(*) FROM salaries WHERE salary BETWEEN 60000 AND 70000;
 + ---------- +
| count(*) |
 + ---------- +
| 588322 |
 + ---------- +
1 row in set (0.14 sec)

As you can see, this significantly improves our query performance.

Another general rule for using exponents is to pay attention to table joins. You should create the index and specify the same data type on any columns that will be used to join the tables.

For example, if you have a table named cheeses and a table named “ingredients”, you may want to join the ingredient_id field of the two tables, which can be of type INT.

We can then create indexes for both fields and our join will speed up.

Optimize queries for speed

When trying to speed up a query, the other half of the equation is optimizing the query itself. Some operations are more computationally intensive than others. There are usually multiple ways to get the same result, some of which avoid expensive operations.

Depending on the query results you use, you may only need a limited number of results. For example, if you just need to know if anyone at the company makes less than $40,000 a year, you could use:

SELECT * FROM SALARIES WHERE salary < 40000 LIMIT 1;
 + -------- + -------- + ---------------- + ---------------- +
| emp_no | salary | from_date | to_date |
 + -------- + -------- + ---------------- + ---------------- +
| 10022 | 39935 | 2000-09-02 | 2001-09-02 |
 + -------- + -------- + ---------------- + ---------------- +
1 row in set (0.00 sec)

This query executes very fast because it basically short-circuits on the first positive result.

If your query uses an “or” comparison and the two components test different fields, your query may be longer than necessary.

For example, if you want to search for employees whose first or last name begins with “Bre,” you must search two separate columns.

SELECT * FROM employees WHERE last_name like 'Bre%' OR first_name like 'Bre%';

This operation might be faster if you search for the first name in one query and the last name in another query, then merge the output. We can use the union operator:

SELECT * FROM employees WHERE last_name like 'Bre%' UNION SELECT * FROM employees WHERE first_name like 'Bre%';

In some cases, MySQL automatically uses the union operation. The example above is actually a case where MySQL does this automatically. You can see if this is the case by checking the type of sort again using explain.

Summary

There are many ways to tailor use cases in MySQL and MariaDB tables and databases. This article contains just a few tips that may be useful to get you started.

These database management systems provide a lot of help documents to teach you how to optimize and adjust different scenarios. The specifics depend heavily on the types of features you wish to optimize, otherwise they will be fully optimized out of the box. Once you determine your needs and master the operations performed, you can learn to adjust the settings of these queries.