Updated June 16, 2023
Introduction to Hive Table
In the hive, the tables consist of columns and rows and store the related data in the table format within the same database. The table is storing the records or data in tabular format. The tables are broadly classified into two parts, i.e., external table and internal table.
The default storage location of the Table varies from the hive version. From HDP 3.0, we are using hive version 3.0 and more. The default Table location was changed from HDP 3.0 version / Hive version 3.0. The location for the external hive Table is “/warehouse/tablespace/external/hive/” and the location for the manage Table is “/warehouse/tablespace/managed/hive”.
In the older version of the hive, the default storage location of the hive Table is “/apps/hive/warehouse/”.
Syntax
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [ database name ] table name
[( column name data type [ COMMENT column comment], ...)]
[ COMMENT table comment]
[ ROW FORMAT row format]
[ STORED AS file format]
How to create a Table in Hive?
As per the requirement, we can create the tables. We can broadly classify our table requirement in two different ways;
- Hive internal table
- Hive external table
1. Internal Table
The internal table is also called a managed table and is owned by a “hive” only. Whenever we create the table without specifying the keyword “external” then the tables will create in the default location.
If we drop the internal or manage the table, the table DDL, metadata information, and table data will be lost. The table data is available on HDFS it will also lose. We should be very careful while dropping any internal or managing the table.
DDL Code for Internal Table
create table emp.customer
(
idint,
first_name string,
last_name string,
gender string,
company_name string,
job_title string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
lines terminated by '\n'
location "/emp/table1"
tblproperties ("skip.header.line.count"="1");
Output:
2. External Table
The best practice is to create an external table. Many organizations are following the same practice to create tables. It does not manage the data of the external table, and the table is not created in the warehouse directory. We can store the external table data anywhere on the HDFS level.
The external tables have the facility to recover the data, i.e., if we delete/drop the external table. Still no impact on the external table data present on the HDFS. It will only drop the metadata associated with the table.
If we drop the internal or manage the table, the table DDL, metadata information, and table data will be lost. The table data is available on HDFS it will also lose. We should be very careful while dropping any internal or manage the table.
DDL Code for External Table
create external table emp.sales
(
idint,
first_name string,
last_name string,
gender string,
email_id string,
city string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
lines terminated by '\n'
location "/emp/sales"
tblproperties ("skip.header.line.count"="1");
Output:
How to modify/alter the Table?
Here we have the facility to alter or modify the existing attributes of the Table. With the help of the “alter” functionality, we can change the column name, add the column, drop the column, change the column name, and replace the column.
We can alter the below Table attributes.
1. Alter/ rename the tablename
Syntax:
ALTER TABLE [current table name] RENAME TO [new table name]
Query to Alter Table Name :
ALTER TABLE customer RENAME TO cust;
Output:
Before alter
After alter
2. Alter/ add column in the table
Syntax:
ALTER TABLE [current table name] ADD COLUMNS (column spec[, col_spec ...])
Query to add Column :
ALTER TABLE cust ADD COLUMNS (dept STRING COMMENT 'Department');
Output:
Sample view of the table
We are adding a new column in the table “department = dept”
3. Alter/change the column name
Syntax:
ALTER TABLE [current table name] CHANGE [column name][new name][new type]
Query to change column name :
ALTER TABLE cust CHANGE first_name name string;
Output:
Sample view of the customer table.
Now we are changing the column name “first_name” to “name”
How to drop the Table?
Here we can drop or delete the table from the system. When you will drop/delete the table from the hive database, the table entry will delete it from the hive metastore. If it is an internal table, the table, and data will completely delete. If it is an external table, then the table entry will delete it from metastore, but the data is available on HDFS Level.
Drop Internal or External Table
Syntax:
DROP TABLE [IF EXISTS] table name;
Drop Query:
drop table cust;
Output:
Before drop query run
After dropping the query, run on the “cust” table.
Conclusion
We have seen the uncut concept of “Hive Table” with the proper example, explanation, syntax, and SQL Query with different outputs. The table is useful for storing the structure data. The table data is helpful for various analysis purposes like BI, reporting, helpful/easy in data slicing and dicing, etc. The internal table is managed, and the hive does not manage the external table. We can choose the table type we need to create per the requirement.
Recommended Articles
We hope that this EDUCBA information on “Hive Table” was beneficial to you. You can view EDUCBA’s recommended articles for more information.