In the Advanced tab, you can configure additional properties that tune the performance of the Parquet output. Depending on selected ....
Setting | Description |
---|---|
Compression Codec | The compression algorithm used to compress pages when encoding. Valid choices are Uncompressed, Snappy, Gzip, Lzo, Brotli, Lz4, and Zstd. Default is Uncompressed. |
Block Size | The size, in bytes, of a row group buffered in memory. The block size limits memory usage when writing. Larger values improve I/O when reading but consume more memory when writing. Default is 134217728 (128 MiB). |
Page Size | The size, in bytes, of a page. A page is the smallest unit that must be read fully to access a single record. When reading, each page can be decompressed independently. If this value is too small, compression deteriorates. Default is 1048576 (1 MiB). |
Dictionary Page Size | When dictionary encoding is used, there is one dictionary page per column per row group. The dictionary page size works like the page size, but applies to dictionary pages. Default is 1048576 (1 MiB). |
Enable Dictionary | Select this check box to use dictionary encoding in the generated Parquet document. When enabled, the writer builds a dictionary of the values encountered in each column. |
Validating | Select this check box to enable schema validation. |
Writer Version | The Parquet format version to use when writing. v1 ensures compatibility with older readers. Default is v1; the alternative is v2. |
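
To make the byte values in the table easier to reason about, the following minimal Python sketch models the settings that have a stated default and derives a few numbers from them. It is illustrative only and is not part of the product: the names `AdvancedSettings`, `to_mib`, `max_pages_per_row_group`, and `check` are hypothetical, and the rule that the page size should not exceed the block size is an assumption of this sketch rather than a documented constraint.

```python
from dataclasses import dataclass
from typing import List

# Codec and version names exactly as listed in the table above.
VALID_CODECS = ("Uncompressed", "Snappy", "Gzip", "Lzo", "Brotli", "Lz4", "Zstd")
VALID_WRITER_VERSIONS = ("v1", "v2")


@dataclass(frozen=True)
class AdvancedSettings:
    """Hypothetical container for the documented defaults (sizes in bytes)."""
    compression_codec: str = "Uncompressed"
    block_size: int = 134217728           # row group size buffered in memory
    page_size: int = 1048576              # data page size
    dictionary_page_size: int = 1048576   # dictionary page size
    writer_version: str = "v1"


def to_mib(size_in_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return size_in_bytes / (1024 * 1024)


def max_pages_per_row_group(settings: AdvancedSettings) -> int:
    """Rough upper bound on full-size data pages in one row group.

    Ignores page headers, dictionary pages, and compression.
    """
    return settings.block_size // settings.page_size


def check(settings: AdvancedSettings) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if settings.compression_codec not in VALID_CODECS:
        problems.append("unknown compression codec: " + settings.compression_codec)
    if settings.writer_version not in VALID_WRITER_VERSIONS:
        problems.append("unknown writer version: " + settings.writer_version)
    # Assumption of this sketch: a page larger than a row group makes no sense.
    if settings.page_size > settings.block_size:
        problems.append("page size exceeds block size")
    return problems


if __name__ == "__main__":
    defaults = AdvancedSettings()
    print(to_mib(defaults.block_size))        # 128.0
    print(max_pages_per_row_group(defaults))  # 128
    print(check(defaults))                    # []
```

With the defaults, one row group holds at most roughly 128 full-size pages. Halving the page size doubles that bound, which is one way to see why very small pages add overhead and hurt compression.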