Data Hub provides the ability to store and query large amounts of data processed by the system.

...

Data Hub requires access to Cloudera Impala, which provides high-performance, low-latency SQL queries on data stored in a Hadoop Distributed File System (HDFS).

The Data Hub Forwarding Agent bulk loads data in CSV files to HDFS and then inserts it into a Parquet table in the Impala database specified by a Data Hub Profile. The table data is then available for query via Data Hub Query.
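The two-step load described above (CSV files landed in HDFS, then rows inserted into a Parquet table) follows a common Impala pattern. The following is a minimal sketch of that general pattern only, not the statements the agent actually issues. The function name, the table names, and the HDFS path are hypothetical, and the sketch assumes a text-format staging table already exists alongside the Parquet table.

```python
def build_load_statements(staging_table, parquet_table, hdfs_csv_path):
    """Return Impala SQL for a generic CSV-to-Parquet load.

    All arguments are hypothetical placeholders; the statements that
    Data Hub itself runs are not documented here.
    """
    return [
        # Move the CSV files already present in HDFS into the
        # text-format staging table.
        f"LOAD DATA INPATH '{hdfs_csv_path}' INTO TABLE {staging_table}",
        # Rewrite the staged rows into the Parquet-backed table.
        f"INSERT INTO {parquet_table} SELECT * FROM {staging_table}",
    ]


if __name__ == "__main__":
    for statement in build_load_statements(
        "datahub_staging", "datahub_events", "/tmp/datahub/incoming"
    ):
        print(statement + ";")
```

Staging the CSV data first and converting it with a single INSERT ... SELECT keeps the Parquet table free of many tiny files, which is the usual reason for this two-step approach in Impala.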

In a production environment, it is recommended that collected files be between 1 and 100 MB in size. Although it is possible to collect and process smaller batches, the overhead of handling a large number of files has a significant impact on performance.
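The size recommendation can be checked before files are handed to the agent. The following is a minimal sketch, assuming that 1 MB means 1024 * 1024 bytes (the text does not say which definition applies) and that the files sit in a local directory. The function names and return labels are hypothetical and are not part of Data Hub.

```python
import os

# Assumption: 1 MB = 1024 * 1024 bytes.
MIN_BYTES = 1 * 1024 * 1024
MAX_BYTES = 100 * 1024 * 1024


def classify_file_size(size_bytes):
    """Classify a size against the recommended 1 to 100 MB range."""
    if size_bytes < MIN_BYTES:
        return "too_small"
    if size_bytes > MAX_BYTES:
        return "too_large"
    return "ok"


def files_outside_range(directory):
    """Return (name, size, label) for regular files outside the range."""
    flagged = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # Skip subdirectories and other non-file entries.
        size = os.path.getsize(path)
        label = classify_file_size(size)
        if label != "ok":
            flagged.append((name, size, label))
    return flagged
```

A collection job could call `files_outside_range` on its output directory and merge the files flagged as `too_small` before forwarding them.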

...

Cloudera Impala (https://www.cloudera.com)

This section contains the following:
