Table of Contents
Create table statement
Data types
Table classification
Classification
The difference between internal tables and external tables
Example
Basic operations of internal tables
Knowledge points
Example
Basic operations of external tables
Example
View/modify table
Knowledge points
Example
Default delimiter
Example
Quick mapping table
Knowledge points
Example
Data import and export
File data loading and importing
Upload files directly
Windows page upload
Linux local put upload
load: loading files
load: move HDFS files
load: upload Linux files
insert: insert data
insert overwrite data
File data export
Download files directly
Web page download
get command download file
insert: export data
insert: export to HDFS
insert: export to Linux
hive shell commands
hive -e statement export
hive -f script export
Create table statement
create [external] table [if not exists] table_name (
    field_name field_type,
    field_name field_type,
    ...
)
[partitioned by (partition_field_name partition_field_type)]  -- fixed format for partitioned tables
[clustered by (bucket_field_name) [sorted by (sort_field_name asc|desc)] into N buckets]  -- fixed format for bucketed tables; sorting is optional
[row format delimited fields terminated by 'field_delimiter']  -- fixed format for a custom field delimiter
[stored as textfile]  -- textfile is the default storage format
[location 'hdfs://domain_name:8020/user/hive/warehouse/db_name.db/table_name']  -- the default is usually sufficient
Notes:
1. Be sure to add a semicolon at the end of the statement.
2. The keywords must keep the order shown above (top to bottom, left to right), otherwise an error is reported.
Storage path of the default library: /user/hive/warehouse
Default HDFS storage path of a custom library: /user/hive/warehouse/db_name.db
Default HDFS storage path of a custom table: /user/hive/warehouse/db_name.db/table_name
Default HDFS storage path of a business data file: /user/hive/warehouse/db_name.db/table_name/data_file
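As a conceptual aside on the clustered by clause above: Hive assigns each row to a bucket by hashing the bucket column and taking the result modulo the number of buckets. A minimal Python sketch of that idea (the hash function here is a stand-in; Hive's internal hash differs, so the exact assignments are illustrative only):

```python
# Conceptual sketch of `clustered by (col) into N buckets`: each row lands in
# bucket hash(bucket_column) % N. The hash below is a simple deterministic
# stand-in so the example is reproducible, not Hive's real hash function.
def bucket_of(value: str, num_buckets: int) -> int:
    h = sum(ord(c) for c in value)
    return h % num_buckets

rows = ["zhangsan", "lisi", "wangwu"]
for name in rows:
    print(name, "-> bucket", bucket_of(name, 3))
```

Each bucket then corresponds to one output file under the table directory, which is what makes sampling and bucketed joins efficient.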
Data types
Basic data types:
Integer: int; Decimal: float, double; String: string, varchar(length); Date: date, timestamp
Complex data types:
Collection: array; Mapping: map; Structure: struct; Union: uniontype
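To make the complex types concrete: when Hive stores a table as delimited text, it uses (by default) \x01 between fields, \x02 between collection items, and \x03 between map keys and values. A Python sketch of parsing one such line (the column layout here is hypothetical):

```python
# Sketch: parse one line of Hive-style delimited text into Python values.
# Assumes Hive's default delimiters: \x01 (fields), \x02 (collection items),
# \x03 (map key/value separator). The column layout is hypothetical.
line = "1\x01Zhang San\x01hive\x02spark\x01city\x03Beijing"

fields = line.split("\x01")
user_id = int(fields[0])                        # int column
name = fields[1]                                # string column
skills = fields[2].split("\x02")                # array<string> column
pairs = (kv.split("\x03") for kv in fields[3].split("\x02"))
info = {k: v for k, v in pairs}                 # map<string,string> column

print(user_id, name, skills, info)
```

This is why a custom delimiter chosen with row format delimited must not collide with characters that appear inside the data.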
Table classification
Classification
The following kinds of tables can be created in Hive:
Internal table: also called a managed table
    Partitioned table (or partitioned-and-bucketed table)
    Bucketed table
External table: also called an unmanaged table
    Partitioned table (or partitioned-and-bucketed table)
    Bucketed table
View a table's formatted information: desc formatted table_name;
-- Internal table type: MANAGED_TABLE
-- External table type: EXTERNAL_TABLE
The difference between internal tables and external tables
Internal table: a table created without the external keyword; also called a managed table.
Dropping an internal table deletes both the metadata and the stored data itself.
External table: a table created with the external keyword; also called an unmanaged or associated table.
Dropping an external table deletes only the metadata; the stored data itself is retained.
Example
-- Create an internal table (also called a managed table)
create table inner_user1(
    id int,
    name string
);
-- Create an external table (also called an unmanaged table)
create external table outer_user1(
    id int,
    name string
);
-- View the table structure
desc inner_user1;
desc outer_user1;
-- View detailed table information
desc formatted inner_user1; -- internal table type: MANAGED_TABLE
desc formatted outer_user1; -- external table type: EXTERNAL_TABLE

-- Demonstrate deleting internal/external tables
-- Delete an internal table
-- drop on an internal table deletes all data related to the table
insert into inner_user1 values (1,'Zhang San');
drop table inner_user1;
-- drop on an external table deletes only the metadata; the business data in HDFS is retained
insert into outer_user1 values (1,'Zhang San');
drop table outer_user1;
-- After creating a table again, location can be used to associate the original business data in HDFS
create external table outer_user2(
    id int,
    name string
) location '/user/hive/warehouse/hive2.db/outer_user1';
-- Verify the data
select * from outer_user2 limit 5; -- 1 Zhang San
Basic operations of internal tables
Knowledge points
Create an internal table: create table [if not exists] table_name (field_name field_type, field_name field_type, ...) [row format delimited fields terminated by 'field_delimiter'];
Copy an internal table:
Method 1 (like): create table new_table like old_table;
Method 2 (as): create table new_table as select_statement;
Delete an internal table: drop table table_name;
Note: dropping an internal table deletes the table's metadata in MySQL and also deletes the business data stored in HDFS.
View a table's formatted information: desc formatted table_name;
Note: truncate can be used to clear an internal table's data: truncate table table_name;
Example
-- 1. Creation and deletion of internal tables
-- 1.1 Demonstrate creating internal tables
-- Table creation method 1
create table inner_user1(
    id int,
    name string
);
-- Insert data
insert into inner_user1 values (1,'Zhang San');
-- Table creation method 2: quickly copy the table structure
create table inner_user2 like inner_user1;
-- Insert data
insert into inner_user2 values (1,'Zhang San');
-- Table creation method 3: quickly copy the table structure and data
create table inner_user3 as select * from inner_user1;

-- 1.2 Demonstrate viewing internal table structure details
-- Internal table type: MANAGED_TABLE
desc formatted inner_user1;
desc formatted inner_user2;
desc formatted inner_user3;

-- 1.3 Demonstrate deleting internal tables
-- Delete an internal table
drop table inner_user3; -- metadata and HDFS business data are both deleted
-- Clear internal table data
truncate table inner_user2;
-- Neither delete nor update can be used
delete from inner_user2; -- error
update inner_user2 set name='李思'; -- error
Basic operations of external tables
Create an external table: create external table [if not exists] table_name (field_name field_type, field_name field_type, ...) [row format delimited fields terminated by 'field_delimiter'];
Copy a table: Method 1 (like) can copy the table structure. Note: the as method cannot be used.
Delete an external table: drop table table_name;
Note: dropping an external table deletes the metadata in MySQL, but the business data stored in HDFS is retained.
View a table's formatted information: desc formatted table_name; -- external table type: EXTERNAL_TABLE
Note: truncate cannot be used to clear an external table's data.
Example
-- 2. Creation and deletion of external tables
-- 2.1 Demonstrate creating external tables
-- Table creation method 1
create external table outer_user1(
    id int,
    name string
);
-- Insert data
insert into outer_user1 values (1,'Zhang San');
-- Table creation method 2: quickly copy the table structure
create table outer_user2 like outer_user1;
-- Insert data
insert into outer_user2 values (1,'Zhang San');
-- Table creation method 3: quickly copying the table and its data is not allowed
create external table outer_user3 as select * from outer_user1; -- error

-- 2.2 Demonstrate viewing external table structure details
-- External table type: EXTERNAL_TABLE
desc formatted outer_user1;
desc formatted outer_user2;

-- 2.3 Demonstrate deleting external tables
-- Delete an external table
drop table outer_user2; -- metadata is deleted, HDFS business data is retained
-- truncate cannot be used to clear external table data
truncate table outer_user1; -- error
-- Neither delete nor update can be used
delete from outer_user1; -- error
update outer_user1 set name='李思'; -- error
View/Modify Table
Knowledge points
View all tables: show tables;
View a table's creation statement: show create table table_name;
View table structure information: desc table_name;
View a table's formatted information: desc formatted table_name; Note: formatted displays detailed information
Modify a table name: alter table old_table_name rename to new_table_name;
Add fields: alter table table_name add columns (field_name field_type);
Replace fields: alter table table_name replace columns (field_name field_type, ...);
Modify a field name and field type at the same time: alter table table_name change old_field_name new_field_name new_field_type;
Note: a string type cannot be changed directly to a numeric type.
Modify the table path: alter table table_name set location 'storage_path_in_hdfs'; Note: the default path is recommended
location: if not specified when creating the table, the default path /user/hive/warehouse/db_name.db/table_name is used; a path can of course also be specified directly at table creation time.
Modify table properties: alter table table_name set tblproperties ('property_name' = 'property_value'); Note: often used to switch between internal and external tables
Switching between internal and external table types: external table property: 'EXTERNAL'='TRUE'; internal table property: 'EXTERNAL'='FALSE'
Example
-- Table view/modify operations
-- Verify that the internal and external tables from earlier exist and contain data; if they were dropped, create them again first
select * from inner_user1 limit 1;
select * from outer_user1 limit 1;

-- 1. Table viewing operations
-- View all tables
show tables;
-- View table creation statements
show create table inner_user1;
show create table outer_user1;
-- View the basic table structure
desc inner_user1;
desc outer_user1;
-- View formatted table details
desc formatted inner_user1;
desc formatted outer_user1;

-- 2. Table modification operations
-- Modify the table name
-- Note: for external tables only the table name in the metadata is modified; the table directory name in HDFS does not change
alter table inner_user1 rename to inner_user;
alter table outer_user1 rename to out_user;
-- Modify fields in the table
-- Add fields
alter table inner_user add columns (age int);
alter table out_user add columns (age int);
-- Replace fields
alter table inner_user replace columns (id int, name string);
alter table out_user replace columns (id int, name string);
-- Modify fields
-- A string type cannot be modified to an integer type
alter table inner_user change name u_name varchar(100);
alter table out_user change name u_name varchar(100);
alter table out_user change u_name u_name int; -- error
-- Modify the table path (not actually recommended)
-- Note: after modifying the path, if the path does not exist it is not created immediately; the directory is generated automatically when data is inserted later
alter table inner_user set location '/inner_user';
alter table out_user set location '/out_user';
-- Modify table properties
-- Check the types first
desc formatted inner_user; -- MANAGED_TABLE
desc formatted out_user; -- EXTERNAL_TABLE
-- Change the internal table into an external table
alter table inner_user set tblproperties ('EXTERNAL'='TRUE');
-- Change the external table into an internal table
alter table out_user set tblproperties ('EXTERNAL'='FALSE');
-- Check the types again
desc formatted inner_user; -- EXTERNAL_TABLE
desc formatted out_user; -- MANAGED_TABLE
Default delimiter
When creating a table, if no delimiter is specified, the table will only recognize the default delimiter. It is hard to type on a keyboard, and is generally displayed as: \001, SOH, ^A, or □.
Example
-- 1. Create a table
create table stu(
    id int,
    name string
);
-- 2. Insert data in insert mode; the fields are automatically joined with the default delimiter
insert into stu values (1,'zhangsan');
-- 3. Verify the data
select * from stu limit 1;
-- The file can of course also be viewed in HDFS. The default delimiter is \001; it may also be displayed as SOH, ^A, or □ in other tools.
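To see what that default delimiter actually is, a short Python sketch: Hive's '\001' is the single control character U+0001 (SOH), which terminals render as ^A and many editors render as a box. The row content below is illustrative.

```python
# Sketch: Hive's default field delimiter '\001' is the control character
# U+0001 (SOH). It is invisible when printed plainly; repr() exposes it.
delim = "\001"            # the same character Hive writes between fields
row = delim.join(["1", "zhangsan"])

print(repr(row))          # '1\x01zhangsan' -- the delimiter only shows in the repr
print(row.split(delim))   # back to the two column values
```

This is why opening a default-delimited Hive data file in a text editor appears to show the columns glued together.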
Quick mapping table
Knowledge
Specify the delimiter when creating the table: create [external] table table name (field name field type) row format delimited fields terminated by symbol;
Load data: load data [local] inpath ‘structured data file’ into table table name;
Example
-- Create a table
create table pdt(
    dt string,
    id string,
    key string,
    url string
) row format delimited fields terminated by '\t';
-- Load data
load data inpath '/source/search_log.txt' into table pdt;
-- Verify the data
select * from hive2.pdt limit 2; -- there is data
Data Import and Export
File data loading and importing
Upload files directly
Windows page upload
Requirement: the emp1.txt file exists on a Windows/Mac system. Save this file to HDFS and use HiveSQL to create a table and associate the data with it.
-- 1. First create the table in Hive based on the data, then upload the file from Windows/Mac to the table's HDFS path
create table emp1(
    id int,
    name string,
    sal double,
    dept string
) row format delimited fields terminated by ',';
-- Upload the file through the HDFS web page on Windows
-- Visit the page at node1:9870 and upload emp1.txt to the path /user/hive/warehouse/hive02.db/emp1
-- Query the data
select * from emp1;
Linux local put upload
Requirement: the emp2.txt file exists on a Linux system. Save this file to HDFS and use HiveSQL to create a table and associate the data with it.
-- 2. First create the table in Hive based on the data, then upload the file from Linux to the table's HDFS path
create table emp2(
    id int,
    name string,
    sal double,
    dept string
) row format delimited fields terminated by ',';
-- Upload the file with the hdfs command on Linux
-- [root@node1 ~]# hdfs dfs -put emp2.txt /user/hive/warehouse/hive02.db/emp2
-- View the data
select * from emp2;
load: loading files
Move a file from an HDFS path to the table's storage path: load data inpath 'HDFS_file_path' [overwrite] into table table_name;
Upload a local Linux file to the table's storage path: load data local inpath 'Linux_file_path' [overwrite] into table table_name;
load: move HDFS files
-- Requirement 1: use load to move a file in HDFS into the table path
-- 1. Create a table based on the search_log.txt data in HDFS
create table search_log(
    dt string,
    id string,
    key string,
    url string
) row format delimited fields terminated by '\t';
-- 2. Upload the search_log.txt file from Windows to another path in HDFS, for example: /source
-- 3. Use load to move the file from /source in HDFS to the HDFS storage path of the search_log table
load data inpath '/source/search_log.txt' overwrite into table search_log;
-- 4. Query the data
select * from search_log;
load: upload Linux files
-- Requirement 2: directly upload the latest search_log.txt file from Linux to the HDFS path of the search_log table
-- First transfer the search_log.txt file to Linux, for example into /output
-- Use the load command to upload the file
load data local inpath '/output/search_log.txt' overwrite into table search_log;
-- Query the data
select * from search_log;
insert: insert data
'Append' data queried from another table into the current table: insert into [table] table_name select_statement;
'Overwrite' the current table with data queried from another table: insert overwrite table table_name select_statement;
-- Requirement 1: create a search_log_copy table, then query data from search_log and insert it into the new table
create table search_log_copy(
    dt string,
    uid string,
    word string,
    url string
) row format delimited fields terminated by '\t';
-- Query all data from the search_log table and insert it directly into the search_log_copy table
insert into table search_log_copy select * from search_log;
-- View the data
select * from search_log_copy;
insert overwrite data
-- Requirement 2: assuming the data in the search_log table has changed, update the search_log_copy table with the latest data
insert overwrite table search_log_copy select * from search_log;
-- View the data
select * from search_log_copy;
File data export
Download files directly
Web page download
Requirement: the search_log.txt file is in the HDFS path /user/hive/warehouse/hive02.db/search_log and needs to be downloaded to a Windows system.
get command to download file
Requirement: the search_log.txt file is in the HDFS path /user/hive/warehouse/hive02.db/search_log and needs to be downloaded to a Linux system.
[root@node1 binzi]# hdfs dfs -get /user/hive/warehouse/hive02.db/search_log/search_log.txt /binzi
insert: export data
Export query results to another HDFS path: insert overwrite directory 'hdfs_path_to_store_the_data' select_statement;
Export query results to the local Linux file system: insert overwrite local directory 'linux_path_to_store_the_data' select_statement;
Notes:
1. overwrite overwrites by default, so try to specify an empty directory as the data path.
2. When exporting data, if no delimiter is specified, the default delimiter (SOH, \001, ^A, ...) is used.
Add a specified delimiter to exported data: row format delimited fields terminated by 'delimiter'
insert: export to HDFS
-- Demonstrate exporting data to a file with insert overwrite
-- Syntax: insert overwrite [local] directory 'file_storage_path' [specify delimiter] select_statement;
-- Export data to HDFS
-- Note: if the target is the root directory /, an -ext-10000 directory is created automatically to store the generated 000000_0 file,
-- but any other directory is automatically cleared of all contents before the 000000_0 file is generated, so the export directory should be a new, empty directory whenever possible
-- Default delimiter
insert overwrite directory '/source' select * from search_log;
-- Specified delimiter
insert overwrite directory '/output' row format delimited fields terminated by ',' select * from search_log;
insert export linux
--2.2 export to linux -- [root@node1 ~]# mkdir /output -- Export to the /output directory of Linux and automatically generate the 000000_0 file to store the query results. --Default delimiter insert overwrite local directory '/output' select * from search_log1; -- Specify delimiter insert overwrite local directory '/output' row format delimited fields terminated by ',' select * from search_log1;
hive shell commands
Execute a sql statement with the hive command: hive -e "sql statement" > file_path_to_store_the_result_data
Execute a sql script with the hive command: hive -f sql_script_file > file_path_to_store_the_result_data
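The two command shapes above can also be driven from a script. A minimal Python sketch, assuming a hive binary on the PATH and illustrative database/output paths; with run=False the command is only built, so no Hive installation is needed to inspect it:

```python
# Sketch: build (and optionally run) `hive -e "sql"` or `hive -f script.sql`,
# redirecting stdout to out_path, mirroring the shell redirection above.
# The hive binary, database name, and paths are assumptions.
import subprocess

def hive_export(sql_or_script, out_path, script=False, run=False):
    cmd = ["hive", "-f" if script else "-e", sql_or_script]
    if run:
        with open(out_path, "w") as f:
            subprocess.run(cmd, stdout=f, check=True)
    return cmd

# Inspect the generated commands without running them:
print(hive_export("select * from hive02.search_log;", "/home/hs1.txt"))
print(hive_export("export.sql", "/home/hs2.txt", script=True))
```

Note that stdout redirection is handled by the caller here, exactly as the `>` in the shell examples does.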
hive -e statement export
# The following commands are executed from the Linux shell command line
# 3.1 Use hive -e 'sql statement' to export data
[root@node1 ~]# hive -e 'select * from hive02.search_log;' > /home/hs1.txt
[root@node1 ~]# cat hs1.txt
hive -f script export
# 3.2 Use hive -f script_file to export data
[root@node1 ~]# echo 'select * from hive02.search_log;' > /home/export.sql
[root@node1 ~]# hive -f export.sql > /home/hs2.txt
[root@node1 ~]# cat hs2.txt