...
- Open a browser and enter the URL of the Hue interface.
- Create a staging directory.
- Open the file browser in Hue.
- Select a directory in the file browser, e.g. /user/impala/uploads
- Click the New button and then select Directory.
- Enter the name of the new directory, e.g. staging, and then click Create.
- Select the directory in the file browser.
- Click the Actions button and then select Change Permissions.
- Update the permissions to make the new directory available to the UNIX user(s) used to start the ECs.
- Create a database and a table to be used by Data Hub.
- Select Impala from Query Editors.
- Enter a CREATE DATABASE statement in the editor and then click the Execute button.
Example - Creating a database:
CREATE DATABASE test;
- Click the Refresh button.
- Enter a CREATE TABLE statement in the editor and then click the Execute button.
The CREATE TABLE statement may contain the following data types:
STRING
INT
FLOAT
DOUBLE
BOOLEAN
BIGINT
REAL
SMALLINT
TINYINT
TIMESTAMP
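As a rough illustration of the data types listed above, a table definition might look like the following sketch. The table name usage_records and all column names are hypothetical and chosen only for this example; the database test is the one from the earlier example, and the TBLPROPERTIES clause simply mirrors the table example in this section.

```sql
-- Hypothetical table and column names; only the data types come from the list above.
CREATE TABLE IF NOT EXISTS test.usage_records (
  subscriber_id STRING,
  event_count   INT,
  rate          FLOAT,
  duration_sec  DOUBLE,
  is_roaming    BOOLEAN,
  bytes_total   BIGINT,
  cost          REAL,
  retry_count   SMALLINT,
  priority      TINYINT,
  recorded_at   TIMESTAMP
)
PARTITIONED BY (yearmonthday INT)   -- a single partition column of INT type
STORED AS PARQUET
TBLPROPERTIES ('transactional'='false');
```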
Note! A PARTITIONED BY clause is optional. However, it is highly recommended, since it improves the performance of queries that restrict results by the partition column. A partition column of INT type also makes it possible to use the Data Hub task agent to automatically remove old data from the table. For further information about the Data Hub task agent, see
Data Hub is limited to handling one partition column.
A STORED AS PARQUET clause is required. If you omit this clause, Data Hub will fail to update the table.
Example - Creating a table in Impala:
CREATE TABLE IF NOT EXISTS mytable (id STRING, start BIGINT, stop BIGINT) PARTITIONED BY (yearmonthday INT) STORED AS PARQUET TBLPROPERTIES ('transactional'='false');
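Once the table exists, the following statements could be run in the Impala editor to check the storage format and to see how the partition column is used. This is only a sketch: it assumes that yearmonthday holds dates encoded as yyyymmdd integers, the literal values are made up, and the last statement assumes an Impala version that accepts comparison operators in a partition specification. It shows one way to drop old partitions by hand and is not a description of how the Data Hub task agent works.

```sql
-- Confirm that the table is stored as Parquet and inspect its partition column.
DESCRIBE FORMATTED mytable;

-- List the partitions that currently exist (empty until data has been loaded).
SHOW PARTITIONS mytable;

-- A query that restricts on the partition column only needs to read
-- the matching partition. The date value is a made-up example.
SELECT COUNT(*) FROM mytable WHERE yearmonthday = 20240115;

-- Manual removal of old data by partition; the cut-off value is a made-up example.
ALTER TABLE mytable DROP IF EXISTS PARTITION (yearmonthday < 20230101);
```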
When you run the Data Hub agent, temporary tables will be created in the same schema. These tables will be visible in Hue but hidden in
the system.
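To see which tables currently exist in the schema, including any temporary ones, a statement such as the following could be run in the Impala editor. The database name test is taken from the earlier example and may differ in your installation.

```sql
-- List all tables in the database used by Data Hub.
SHOW TABLES IN test;
```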