[ 
https://issues.apache.org/jira/browse/IMPALA-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17949771#comment-17949771
 ] 

ASF subversion and git services commented on IMPALA-13923:
----------------------------------------------------------

Commit b9419ee32c98e95b5f1ea378624562673ead35be in impala's branch 
refs/heads/master from Surya Hebbar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b9419ee32 ]

IMPALA-13923: Support more compression levels for ZSTD and ZLIB

This patch adds support for more compression levels for ZLIB, ZSTD
and BZIP2.

The following additional compression levels are now supported.

For ZSTD,
  ZSTD_minCLevel(-ZSTD_TARGETLENGTH_MAX) to ZSTD_maxCLevel(20)

For ZLIB i.e. ZLIB, GZIP and DEFLATE,
  Z_BEST_SPEED(1) to Z_BEST_COMPRESSION(9)

For BZIP2,
  BlockSize100k * (1) to BlockSize100k * (9)

Note:
Currently, BZIP2 is only used by TmpFileMgr. It is not supported
by Parquet (i.e. for writing tables).

These are now supported with the "compression_codec" query option.

This has been implemented by refactoring compression levels as an
optional parameter in CodecInfo.
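
As an illustration only (not the code from this patch), the optional
level could be modelled as below. The sketch assumes the query option
value has the form CODEC or CODEC:LEVEL; CodecChoice and
ParseCodecOption are hypothetical names, not Impala identifiers.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for CodecInfo: the level is absent unless the
// user supplied one.
struct CodecChoice {
  std::string codec;
  std::optional<int> level;  // std::nullopt means "use the codec default"
};

// Parses "CODEC" or "CODEC:LEVEL" (assumed syntax). Returns std::nullopt
// when the input is malformed. Range checking is left to the caller.
std::optional<CodecChoice> ParseCodecOption(std::string_view value) {
  const std::size_t colon = value.find(':');
  if (colon == std::string_view::npos) {
    if (value.empty()) return std::nullopt;
    return CodecChoice{std::string(value), std::nullopt};
  }
  const std::string_view name = value.substr(0, colon);
  const std::string_view level_str = value.substr(colon + 1);
  if (name.empty() || level_str.empty()) return std::nullopt;

  // std::from_chars accepts a leading '-', so negative ZSTD levels parse.
  int level = 0;
  const char* end = level_str.data() + level_str.size();
  const auto [ptr, ec] = std::from_chars(level_str.data(), end, level);
  if (ec != std::errc() || ptr != end) return std::nullopt;
  return CodecChoice{std::string(name), level};
}
```

Keeping the level as std::optional lets callers distinguish "no level
given" from any concrete level, including negative ones.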

For ZSTD, negative compression levels are now supported (IMPALA-10630).

Usage of compression level has been refactored with std::optional in
- exec/parquet/hdfs-parquet-table-writer
- runtime/tmp-file-mgr
- service/query-options
- util/codec
- util/compress

To validate compression levels externally, the following method has
been added
- Status Codec::ValidateCompressionLevel
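
A minimal sketch of what such a range check might look like follows.
It is not the actual Codec::ValidateCompressionLevel: the enum, the
string return type and the function names are hypothetical, and the
bounds are hard-coded from the ranges listed above instead of being
read from the compression libraries. The ZSTD lower bound assumes
ZSTD_TARGETLENGTH_MAX is 131072, as in recent zstd releases.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical codec families for this sketch.
enum class CodecKind { ZSTD, ZLIB, BZIP2 };

struct LevelRange {
  int min;
  int max;
};

// Bounds taken from the commit message; hard-coded here as an assumption.
constexpr LevelRange RangeFor(CodecKind kind) {
  switch (kind) {
    case CodecKind::ZSTD:  return {-131072, 20};
    case CodecKind::ZLIB:  return {1, 9};
    case CodecKind::BZIP2: return {1, 9};
  }
  return {0, 0};
}

// Returns an empty string when the level is acceptable, otherwise an
// error message. A missing level always passes (codec default is used).
std::string ValidateLevel(CodecKind kind, std::optional<int> level) {
  if (!level.has_value()) return "";
  const LevelRange range = RangeFor(kind);
  if (*level < range.min || *level > range.max) {
    return "compression level " + std::to_string(*level) +
           " is outside [" + std::to_string(range.min) + ", " +
           std::to_string(range.max) + "]";
  }
  return "";
}
```

For example, ValidateLevel(CodecKind::ZLIB, 10) yields a non-empty
error, while ValidateLevel(CodecKind::ZSTD, -5) passes.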

Added new tests for:
  * Additional compression levels for ZLIB, ZSTD and BZIP2
  * Query option - "compression_codec" for the newly added formats
    and compression levels

The following tests were executed to verify codecs and compression levels.
- DecompressorTest.ZSTD*
- DecompressorTest.Gzip
- DecompressorTest.Bzip
- QueryOptions.CompressionCodec
- TestComputeStats::test_compute_stats_compression_codec

For the stored Parquet files, the compression codec used for ZSTD and
ZLIB was verified manually.

Change-Id: I5b98c735246f08e04598a4e752c8cca04e31a88a
Reviewed-on: http://gerrit.cloudera.org:8080/22718
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
Reviewed-by: Joe McDonnell <[email protected]>


> Support more compression levels for ZLIB and ZSTD
> -------------------------------------------------
>
>                 Key: IMPALA-13923
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13923
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Surya Hebbar
>            Assignee: Surya Hebbar
>            Priority: Major
>
> Support compression levels for GZIP, possibly refactoring the existing
> compression tests and implementation along the way, and possibly
> resolving IMPALA-10630 for ZSTD in later changes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
