Parquet Profile Configuration Advanced

The Advanced tab includes additional properties for tuning the performance of Parquet encoding. These properties are used only by the Parquet Encoder; none of them apply when decoding.

The Parquet profile's Advanced tab

Setting

Description

Compression Codec

The compression algorithm used to compress pages when encoding. Valid choices are Uncompressed, Snappy, Gzip, Lzo, Brotli, Lz4, and Zstd. Default is Uncompressed.

Block Size

The size of a row group buffered in memory while writing. Block Size limits memory usage when writing: larger values improve I/O when reading but consume more memory when writing. The default is 134217728 bytes (128 MiB).

Page Size

A page is the smallest unit that must be read fully to access a single record. When reading, each page can be decompressed independently. If this value is too small, compression becomes less effective. The default is 1048576 bytes (1 MiB).

Dictionary Page Size

When dictionary encoding is used, there is one dictionary page per column per row group. Dictionary Page Size works like Page Size, but applies to dictionary pages. The default is 1048576 bytes (1 MiB).

Enable Dictionary

Select this checkbox to use dictionary encoding in the generated Parquet document. When enabled, the encoder builds a dictionary of the values encountered in each column.

Validating

Select this checkbox to enable schema validation.

Writer Version

The Parquet format version to use when writing. Valid choices are v1 and v2. Version 1.0 (v1) ensures compatibility with older readers. The default is v1.
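
The settings in the table can be modeled as a small Python sketch to make the defaults and their relationships easier to see. This is only an illustration: the class name AdvancedSettings, its field names, and the helper to_mib are hypothetical and not part of the product. The codec names, writer versions, and size defaults come from the table, and the sizes are assumed to be in bytes. The table does not state defaults for the two checkboxes, so the sketch requires them to be set explicitly.

```python
from dataclasses import dataclass

# Choices copied from the table above; all names below are hypothetical.
VALID_CODECS = ("Uncompressed", "Snappy", "Gzip", "Lzo", "Brotli", "Lz4", "Zstd")
VALID_WRITER_VERSIONS = ("v1", "v2")


@dataclass(frozen=True)
class AdvancedSettings:
    # No default is documented for the checkboxes, so they are required here.
    enable_dictionary: bool
    validating: bool
    compression_codec: str = "Uncompressed"
    block_size: int = 134217728           # assumed bytes (row group size)
    page_size: int = 1048576              # assumed bytes
    dictionary_page_size: int = 1048576   # assumed bytes
    writer_version: str = "v1"

    def __post_init__(self):
        # Reject values that the tab would not offer as choices.
        if self.compression_codec not in VALID_CODECS:
            raise ValueError(f"unknown codec: {self.compression_codec!r}")
        if self.writer_version not in VALID_WRITER_VERSIONS:
            raise ValueError(f"unknown writer version: {self.writer_version!r}")
        for name in ("block_size", "page_size", "dictionary_page_size"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")

    def block_to_page_ratio(self) -> int:
        """Rough ratio of row group size to page size, not an exact page count."""
        return self.block_size // self.page_size


def to_mib(size_in_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return size_in_bytes / (1024 * 1024)


defaults = AdvancedSettings(enable_dictionary=True, validating=False)
print(to_mib(defaults.block_size))     # 128.0
print(to_mib(defaults.page_size))      # 1.0
print(defaults.block_to_page_ratio())  # 128
```

With the default sizes, a row group holds roughly 128 pages' worth of uncompressed data. Raising Block Size or lowering Page Size increases that ratio; the actual number of pages written depends on the data and the number of columns.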