[Solved] MySQL: Wrong data appears when executing sql query (caused by MySQL hidden mechanism-type conversion)

Table of Contents

Problem symptoms:

problem analysis:

in conclusion:

Solution:

expand:


Problem phenomenon:

A very strange bug was discovered during project development today:

As shown in the figure, I executed a sql query in the database with “dept_id = 1712651046956421123” as the condition, but the query result showed: dept_id is 1712651046956421122.

At that time, I was checking the service log of the project and troubleshooting the problem before I noticed this strange phenomenon. I thought it was a bug. After checking the information, I finally found out the specific reason. It was indeed a big hole, and I hereby record it.

Problem Analysis:

At first I thought it was a mysql8+ bug, but after thinking about it, it might not be that simple, so I started my experimental test:

First, I used the navicat tool to directly go to the table for visual filtering. I found that the data could not be queried based on “dept_id = 1712651046956421123”. > Then you can query the data:

It can be seen that there is no problem with the filtering function of the navicat tool, so I tried to use these two dept_id values (i.e. 1712651046956421123 and 1712651046956421122) to execute sql query, and found: I can still query the same result:


This made me very confused. I had no clue for a while, so I had to check the information online. Finally, in an explanation on the MySQL official website, I knew the specific reason. The focus was on the place marked in the red box:

Interested friends can take a look, I’ll post the link here:

MySQL :: MySQL 5.7 Reference Manual :: 12.3 Type Conversion in Expression Evaluation

According to the example mentioned in the explanation, I went directly to the database to execute the SQL. I will not post the results one by one here. I will directly post the SQL code and results. Interested friends can verify it by themselves:

select 1712651046956421123 = 1712651046956421123;-- Result: 1
select 1712651046956421122 = 1712651046956421122;-- Result: 1
select 1712651046956421123 = 1712651046956421122;-- Result: 0

select '1712651046956421123' = '1712651046956421123';-- Result: 1
select '1712651046956421123' = 1712651046956421123;-- Result: 1
select '1712651046956421123' = 1712651046956421122;-- Result: 1

select '1712651046956421122' = '1712651046956421122';-- Result: 1
select '1712651046956421122' = 1712651046956421122;-- Result: 1
select '1712651046956421122' = 1712651046956421123;-- Result: 1

I believe some friends have already figured out the reason. I will post a picture below, and I think everyone will understand it completely:

Create table sql as follows:

CREATE TABLE `xxxxxxxxxxxxxx_dept` (
    `name` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL DEFAULT '' COMMENT 'name',
    `dept_id` varchar(1000) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT 'department id',
    PRIMARY KEY (`dept_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1712757077034344450 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=DYNAMIC COMMENT='Department Information Table';

In fact, it is because the field type of dept_id is set to varchar type, which is a string type, but when I execute sql, I pass in an integer value instead of a string type value, which triggers the hiding mechanism of mysql. Data type conversion may trigger unexpected problems. You can know from the explanation of the article:

The translated meaning is probably:

When conversions from string to float and from integer to float occur, they do not necessarily happen in the same way. Integers can be converted to floating point by the CPU, whereas strings are converted bit by bit in operations involving floating point multiplication. In addition, results may be affected by factors such as computer architecture or compiler version or optimization level. One way to avoid such problems is to use CAST() so that the value is not implicitly converted to a float.

As can be seen, the specific reasons for the problems mentioned at the beginning of the article are actually:

When assigning a string type field to an integer type, type conversion will occur. The integer is converted to a floating point number, and the floating point number has precision. When the floating point number exceeds a certain precision value, before the precision value If the data are consistent, they will be considered equal values.

By observing 1712651046956421123 and 1712651046956421122 with the naked eye, you can find that these two values are actually only different in the last digit, and the previous data are exactly the same, so it is consistent with my inference .

Next we will continue to test and verify my inference:

select '1234567890' = 1234567891;-- Result: 0
select '123456789012345' = 123456789012341;-- Result: 0
select '123456789012345678' = 123456789012345671;-- Result: 0
select '1234567890123456789' = 1234567890123456781;-- Result: 1
select '12345678901234567890' = 12345678901234567891;-- Result: 1
select '1234567890123456789012345678901234567890' = 1234567890123456789012345678901234567891111;-- Result: 1
select '12345678901234567899999' = 12345678901234567811111;-- Result: 1
select '1234567890123456789012345678901234567890' = 1234567890123456789012345678901234567891111111111111111;-- Result: 0
select 1234567890123456789 = 1234567890123456781;-- Result: 1

Personal testing found that the critical value of precision is 18 digits. That is to say, if the value length on the left and right sides of = is the same, and the length of the value is >18 digits, if the data types are different, the type will be triggered. conversion, it will directly judge = the first 18 digits on the left and right sides; if they are completely consistent, no matter how different the subsequent data is, they will be considered equal.

Conclusion:

The specific reasons for the problems mentioned at the beginning are as follows:

1. When the fields in the where condition are of different types than the assigned data (mainly strings, integers, and floating point numbers, friends who are interested in other data types can test it themselves) When , the type conversion mechanism will be triggered[String->Integer->Floating-point number].

2. When the type conversion mechanism is triggered, if the following conditions are met, it will be judged as = The data on the left and right sides are equal:

2.1. The data length on both sides is the same;

2.2. The data length on both sides is >18 digits;

2.3. The first 18 digits of the data on both sides are exactly the same;

Solution:

In order to avoid this problem of incorrect result sets caused by the hidden mechanism of MySQL type conversion triggered by queries similar to the example at the beginning of the article, here are some solutions I can think of, restoration and supplements:

1. As mentioned in the article, the MySQL official website has actually given a solution, which is to use the CAST function to convert the data type, so that The mechanism that does not trigger type conversion naturally avoids problems.

select '1234567890123456789' = 1234567890123456781;-- Result: 1
select CAST('1234567890123456789' as UNSIGNED) = 1234567890123456781;-- Result: 0

However, this method is not recommended here because it is intrusive with SQL code, the maintenance cost of code modification is relatively high, and it treats the symptoms but not the root cause.

2. Modify the table fields. When defining the table field type, you should think clearly about what type of data this field will store, and then ensure that the types are corresponding in the code. (For example, the varchar type field corresponds to the String type in JAVA). In this way, when executing SQL, the data type can be guaranteed to be consistent, and the mechanism to prevent type conversion from being triggered can naturally avoid problems.

This approach is highly recommended as problems can be avoided if everyone follows the rules.

Expand:

Of course, some people may ask why MySQL doesn’t just remove this hidden mechanism of type conversion?

You must know that everything has two sides. The hidden mechanism of type conversion actually has many benefits. Naturally, it is not removed because the advantages outweigh the disadvantages. I will not give detailed examples here because it is beyond the scope of this article. A new article may be published in the future. The article explains in detail.