Directory
-
-
- introduce
- 1. Prepare test data
- 2. Change one line to multiple lines according to the delimiter
- 3. Change one column to multiple columns and simply divide the columns into columns
- 4. One column becomes multiple columns, data pivot
- 5. Change multiple columns into one column
-
Introduction
In interviews and practical problems, we often encounter various row-column conversions, such as data changing from one row to multiple rows according to delimiters, data from one column to multiple columns, and multiple columns to one column. Here we will introduce how to handle these scenarios in Mysql.
1. Prepare test data
create table dwd.fact_suject_data ( student_id int null comment 'number', subject_level varchar(255) null comment 'subject level', subject_level_json varchar(255) null comment 'Subject level json data' ); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (12,'China E,English D,Math E','{"China": "E","English": "D","Math": " E"}'); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (2,'China B,English B','{"China": "B","English": "B"}'); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (3,'English A,Math C','{"English": "A","Math": "C"}'); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (4,'China C,Math A','{"China": "C","Math": "A"}'); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (5,'China D,English A,Math C','{"China": "D","English": "A","Math": " C"}'); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (6,'China C,English A,Math D','{"China": "C","English": "A","Math": " D"}'); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (7,'China A,English E,Math B','{"China": "A","English": "E","Math": " B"}'); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (8,'China D,English E,Math E','{"China": "D","English": "E","Math": " E"}'); insert into dwd.fact_suject_data(student_id, subject_level,subject_level_json) values (9,'China C,English E,Math C','{"China": "C","English": "E","Math": " C"}');
2. One line becomes multiple lines according to the delimiter
Requirement: I want two columns: student number and subject level. The subject level only displays a single subject level.
Method 1: Use the table that comes with mysql to split strings. Note that help_topic_id starts from 0
select t1.student_id, -- Split the string according to subject_level according to help_topic_id substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1) subject_level from dwd.fact_suject_data t1 left join mysql.help_topic t2 on t2.help_topic_id <= length(t1.subject_level) - length(replace(t1.subject_level, ',', '')) ;
Method 2: Use recursion to construct an auto-increasing id column yourself (starting from 1 or 0), which is used to split strings
with recursive cte(n) as (select 1 union all select n+1 from cte where n < 100) -- Recurse out the required auto-incrementing id table select t1.student_id, -- Split the string according to n on subject_level substring_index(substring_index(t1.subject_level, ',', t2.n), ',', -1) subject_level from dwd.fact_suject_data t1 left join cte t2 on t2.n <= (length(t1.subject_level) - length(replace(t1.subject_level, ',', '')) + 1) ;
The scripts of the two methods are based on the total number of separators, and sequentially construct numbers that increase to the total number, which are used as substring_index() parameters for cutting. In addition to the above two methods, if there are other tables with such increments and unique The id can also be borrowed.
3. One column becomes multiple columns, simple column arrangement
Requirements: I want three columns: student number, subject and grade;
Based on the results of step 2, simply use substring_index() to split;
select t1.student_id, -- Split the string according to subject_level according to help_topic_id substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1) subject_level, substring_index(substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1),' ',1) subject, substring_index(substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1),' ',-1) level from dwd.fact_suject_data t1 left join mysql.help_topic t2 on t2.help_topic_id <= length(t1.subject_level) - length(replace(t1.subject_level, ',', '')) ;
Requirement: I want the student number and the data corresponding to each subject, and I need to use the subject_level_json field;
subject_level_json is a common json data type, so we can use functions related to json. Below are the commonly used ones. If you want to know more, you can click on the Mysql official website to view and expand;
select json_object('key1', 1, 'key2','abc') json_data, json_extract('{"key1": 1, "key2": "abc"}','$.key1') key1, json_extract('{"key1": 1, "key2": "abc"}','$.key2') key2;
Next, quote directly. If the data type is wrong, you can cast() to convert it. If the value has multiple double quotes, you can replace it with replace().
select student_id, subject_level_json, replace(cast(json_extract(cast(subject_level_json as json),'$.China') as char),'"','') china, replace(cast(json_extract(cast(subject_level_json as json),'$.English') as char),'"','') english, replace(cast(json_extract(cast(subject_level_json as json),'$.Math') as char),'"','') math from fact_suject_dat;
4. One column becomes multiple columns, data pivot
Requirement: I want the student number and the data corresponding to each subject, but the subject_level_json field cannot be used;
After sorting according to step 2, further use case when to convert directly
select student_id, max(case when subject = 'China' then level else null end) china, max(case when subject = 'English' then level else null end) english, max(case when subject = 'Math' then level else null end) math from (select t1.student_id, -- Split the string according to subject_level according to help_topic_id substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1) subject_level, substring_index(substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1),' ',1) subject, substring_index(substring_index(substring_index(t1.subject_level, ',', t2.help_topic_id + 1), ',', - 1),' ',-1) level from dwd.fact_suject_data t1 left join mysql.help_topic t2 on t2.help_topic_id <= length(t1.subject_level) - length(replace(t1.subject_level, ',', ''))) t group by student_id ;
You can also further convert the subject_level field by combining mid() or substring() and case when. If you don’t know about string functions, click here;
select student_id, max(case when subject_level like '%China%' then mid(subject_level ,(instr(subject_level,'China') + length('China') + 1), 1) else null end) china, max(case when subject_level like '%English%' then mid(subject_level,(instr(subject_level,'English') + length('English') + 1), 1) else null end) english, max(case when subject_level like '%Math%' then mid(subject_level ,(instr(subject_level,'Math') + length('Math') + 1), 1) else null end) math from fact_suject_data group by student_i;
5. Change multiple columns into one column
Requirement: I want to generate three columns of data: student number, subject and grade based on the results of step 4;
Just use union all for splicing. If it is Oracle, you can use thePIOVT function.
select student_id, concat('China ',china) subject_level from (select student_id, max(case when subject_level like '%China%' then mid(subject_level ,(instr(subject_level,'China') + length('China') + 1), 1) else null end) china, max(case when subject_level like '%English%' then mid(subject_level,(instr(subject_level,'English') + length('English') + 1), 1) else null end) english, max(case when subject_level like '%Math%' then mid(subject_level ,(instr(subject_level,'Math') + length('Math') + 1), 1) else null end) math from fact_suject_data group by student_id) t where china is not null union all select student_id, concat('English ',english) subject_level from (select student_id, max(case when subject_level like '%China%' then mid(subject_level ,(instr(subject_level,'China') + length('China') + 1), 1) else null end) china, max(case when subject_level like '%English%' then mid(subject_level,(instr(subject_level,'English') + length('English') + 1), 1) else null end) english, max(case when subject_level like '%Math%' then mid(subject_level ,(instr(subject_level,'Math') + length('Math') + 1), 1) else null end) math from fact_suject_data group by student_id) t where english is not null union all select student_id, concat('Math ',math) subject_level from (select student_id, max(case when subject_level like '%China%' then mid(subject_level ,(instr(subject_level,'China') + length('China') + 1), 1) else null end) china, max(case when subject_level like '%English%' then mid(subject_level,(instr(subject_level,'English') + length('English') + 1), 1) else null end) english, max(case when subject_level like '%Math%' then mid(subject_level ,(instr(subject_level,'Math') + length('Math') + 1), 1) else null end) math from fact_suject_data group by student_id) t where math is not null;
Need: I want…