SQL Server query optimization

Contents

  • Key points for writing query statements
  • Reasons for slow query speed
  • Methods to optimize queries

Key points for writing query statements

  • To optimize a query, avoid full table scans; first consider creating indexes on the columns used in the where and order by clauses.
  • Avoid using the != or <> operator in the where clause; otherwise the engine will abandon the index and perform a full table scan.
  • Avoid null-value tests on fields in the where clause; otherwise the engine will abandon the index and perform a full table scan, such as:
select name from a where code is null

You can set a default value of 0 on code to ensure the column contains no nulls, and then query like this:

select name from a where code = 0
  • A leading-wildcard (left fuzzy) like query results in a full table scan, so avoid a leading percent sign, as in the following statement:
 select name from a where code like '%a'

To improve efficiency, consider full-text search.

  • Avoid using or to join conditions in the where clause; otherwise the engine will abandon the index and perform a full table scan, such as:
select name from a where code = 1 or code = 2

You can query like this:

select name from a where code = 1
union all
select name from a where code = 2
  • in and not in should also be used with caution; otherwise they lead to a full table scan, such as:
select name from a where code in (1,2,3)

For continuous values, use between instead of in. For example:

select name from a where code between 1 and 3
  • Avoid performing expression operations on fields in the where clause; this causes the engine to abandon the index and perform a full table scan, such as:
select name from a where code/2=100
--Should be changed to:
select name from a where code=100/2
  • Avoid applying functions to fields in the where clause; this causes the engine to abandon the index and perform a full table scan, such as:
select code from a where substring(name,1,3) = 'abc' -- code for names starting with 'abc'
select code from a where datediff(day, createdate, '2023-10-17') = 0 -- code created on '2023-10-17'
 --Should be changed to:
select code from a where name like 'abc%'
select code from a where createdate >= '2023-10-17' and createdate < '2023-10-18'
  • Do not apply functions, arithmetic, or other expression operations to the left side of “=” in the where clause; otherwise the system may not use the index correctly.
  • When using indexed fields as conditions, if the index is a composite index, the first (leftmost) field of the index must appear in the condition; otherwise the index will not be used. Where possible, keep the order of the condition fields consistent with the index order.
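As an illustration of this leftmost-prefix rule, a sketch assuming a hypothetical composite index on table a over (code, name):

```sql
-- Hypothetical composite index; index, table, and column names are illustrative.
create index ix_a_code_name on a (code, name);

-- These can seek on the index: the leading column code is in the condition.
select name from a where code = 1 and name = 'abc';
select name from a where code = 1;

-- This cannot seek on the index: the leading column code is missing.
select name from a where name = 'abc';
```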
  • Do not write meaningless queries, such as generating an empty table structure:
 select code,name into #a from a where 1=0
  --This type of code will not return any result set, but will consume system resources. It should be changed to this:
   create table #a(…)
  • When the inner table has a large amount of data, using exists instead of in is a good choice:
select name from a where code in(select code from b)
--Replace with the following statement:
select name from a where exists (select 1 from b where b.code=a.code)
  • Not every index is effective for a query. SQL Server optimizes queries based on the data in the table; when an indexed column contains a large amount of duplicate data, the query may not use the index. For example, if a table has a sex field whose values are roughly half male and half female, an index on sex will not improve query efficiency at all.
  • More indexes are not always better. Although indexes improve the efficiency of the corresponding select, they reduce the efficiency of insert and update, because the indexes may have to be maintained or rebuilt during those operations; how to build indexes therefore needs careful, case-by-case consideration. It is best to have no more than 6 indexes on a table; if there are more, consider whether indexes on rarely used columns are really necessary.
  • You should avoid updating clustered index data columns as much as possible, because the order of clustered index data columns is the physical storage order of table records. Once the column value changes, the order of the entire table records will be adjusted, which will consume considerable resources. If the application system needs to frequently update clustered index data columns, then you need to consider whether the index should be built as a clustered index.
  • Use numeric fields where possible. If a field contains only numeric information, do not design it as a character field; this hurts the performance of queries and joins and increases storage overhead, because the engine compares strings character by character, while a numeric type needs only a single comparison.
  • Use varchar/nvarchar instead of char/nchar where possible: variable-length fields take less storage space, and for queries, searching within a smaller field is clearly more efficient.
  • Do not use select * from a anywhere; replace “*” with a specific list of fields, and do not return any fields that are not used.
  • Try to use table variables instead of temporary tables. If the table variable contains a large amount of data, be aware that the indexes are very limited (only primary key indexes).
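A minimal sketch of the table-variable approach; the declaration shows that the only index you get is the one created by a primary key (or unique) constraint. Names are illustrative:

```sql
declare @a table (
    code int primary key,   -- the implicit primary key index is the only index available
    name nvarchar(50)
);

insert into @a (code, name) values (1, N'abc'), (2, N'def');

select name from @a where code = 1;
```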
  • Avoid frequently creating and deleting temporary tables to reduce the consumption of system table resources.
  • Temporary tables are not forbidden; using them appropriately can make certain routines more efficient, for example when you need to repeatedly reference a large table or a data set in a frequently used table. For one-time operations, however, an export table is better.
  • When populating a new temporary table, if a large amount of data is inserted at once, use select into instead of create table to avoid generating a large volume of log records and to improve speed; if the amount of data is small, then to ease pressure on the system tables, create the table first and then insert.
  • If temporary tables are used, all temporary tables must be explicitly deleted at the end of the stored procedure, first truncate table, and then drop table. This can avoid long-term locking of system tables.
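Putting the temporary-table points together, a hedged sketch of the lifecycle inside a stored procedure (table and column names are illustrative):

```sql
-- Large one-shot load: select into creates and fills #a in one step,
-- avoiding the heavier logging of a separate create table + insert.
select code, name
into #a
from a
where code between 1 and 100000;

-- ... use #a here ...

-- Explicit cleanup at the end of the procedure:
-- truncate first, then drop, to avoid prolonged locking of system tables.
truncate table #a;
drop table #a;
```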
  • Try to avoid using cursors because cursors are less efficient. If the data operated by the cursor exceeds 10,000 rows, you should consider rewriting it.
  • Before using the cursor-based method or the temporary table method, you should first look for a set-based solution to the problem, which is usually more efficient.
  • Like temporary tables, cursors are not forbidden. Using a FAST_FORWARD cursor on a small data set is often better than other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that compute “totals” over a result set are usually faster than using a cursor. If development time permits, try both the cursor-based and the set-based method and see which works better.
  • Set SET NOCOUNT ON at the beginning and SET NOCOUNT OFF at the end of all stored procedures and triggers. This avoids sending a DONE_IN_PROC message to the client after each statement of a stored procedure or trigger.
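A sketch of that convention, using a hypothetical procedure name:

```sql
create procedure dbo.usp_get_names  -- hypothetical name
as
begin
    set nocount on;   -- suppress the DONE_IN_PROC message after each statement

    select name from a where code = 1;

    set nocount off;
end
```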
  • Try to avoid returning large amounts of data to the client. If the amount of data is too large, you should consider whether the corresponding requirements are reasonable.
  • Try to avoid large transaction operations and improve system concurrency.

Reasons for slow query speed

  • The amount of data queried is too large (use multiple queries or other methods to reduce the amount of data)
  • Lock or deadlock (this is also the most common problem of slow query and is a programming design flaw)
  • Use sp_lock and sp_who to view locks and active users; the underlying cause is read/write contention for resources.
  • Unnecessary rows and columns returned
  • The query statement is not good and there is no optimization.
  • Insufficient storage

Methods to optimize queries

  • Based on the query conditions, create and optimize indexes, optimize access methods, and limit the size of the result set. Note that the fill factor should be appropriate (the default value of 0 is usually best). Indexes should be as small as possible; it is better to build an index on a column with few bytes (see the guidance on index creation). Do not build a single index on a field with very few distinct values, such as a gender field.
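For reference, the fill factor can be set explicitly when creating an index; omitting it uses the server default of 0 (pages packed full), which the point above recommends. Index names are illustrative:

```sql
-- Default fill factor (0): leaf pages are filled completely.
create index ix_a_code on a (code);

-- Explicit fill factor, worth considering only when frequent
-- inserts/updates cause page splits.
create index ix_a_name on a (name) with (fillfactor = 80);
```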
  • Expand the server’s memory. Windows 2000 and SQL Server 2000 can support 4-8 GB of memory. Configure virtual memory based on the services running concurrently on the computer: when running Microsoft SQL Server 2000, consider setting the virtual memory size to 1.5 times the installed physical memory. If you have also installed the full-text search feature and plan to run the Microsoft Search service for full-text indexing and queries, consider setting the virtual memory size to at least 3 times the installed physical memory. Configure the SQL Server max server memory option to 1.5 times physical memory (half the virtual memory setting).
  • Note the difference between UNION and UNION ALL: UNION removes duplicate rows, which requires extra work, so UNION ALL is better when duplicates are impossible or acceptable.
  • Use DISTINCT with caution and avoid it when unnecessary; like UNION, it slows the query even when the result contains no duplicate records.
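The UNION / UNION ALL difference in a sketch (tables are illustrative):

```sql
-- UNION removes duplicate rows, which forces extra sort/hash work.
select code from a
union
select code from b;

-- UNION ALL simply concatenates the result sets: faster when
-- duplicates cannot occur or do not matter.
select code from a
union all
select code from b;
```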