Hi, Antoine

Do you mean performance data for HW-GZIP compared with LZ4/ZSTD?

Thanks,
XieQi

-----Original Message-----
From: Antoine Pitrou <anto...@python.org> 
Sent: Tuesday, October 20, 2020 10:38 PM
To: dev@arrow.apache.org; Xie, Qi <qi....@intel.com>
Cc: Xu, Cheng A <cheng.a...@intel.com>; Dong, Xin <xin.d...@intel.com>; Zhang, Jie1 <jie1.zh...@intel.com>
Subject: Re: [Discuss] Provide pluggable APIs to support user customized compression codec



Le 20/10/2020 à 12:09, Xie, Qi a écrit :
> Hi, Wes
> 
> Yes, currently the purpose of the key-value metadata is just a hint to
> indicate that the Parquet file was compressed by a plugin, so that the
> Parquet reader can load the plugin library and use it to decompress the file.
> There are many optimized GZIP implementations, and they may not be compatible
> with standard gzip. For example, due to hardware limits, the HW-GZIP history
> window size may be smaller than that of standard gzip, so HW-GZIP can't
> decompress a file compressed by standard gzip. Because we still use
> Compression::GZIP as the Compression::type, we need that metadata to
> distinguish it from standard gzip.
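
To illustrate the window-size point quoted above, here is a minimal Python sketch using the standard zlib module. It is not the hardware codec under discussion, and the payload and helper names (compress, can_decode) are made up for the example. It also assumes the zlib container, which records the window size in its header, so the mismatch is detected immediately; a gzip container carries no such field. The 512-byte window merely stands in for a hypothetical limited decoder.

```python
import zlib

# Illustrative payload only; any repetitive data would do.
payload = b"arrow parquet column chunk " * 1000


def compress(data, wbits):
    """Deflate `data` into a zlib container using a 2**wbits byte window."""
    c = zlib.compressobj(wbits=wbits)
    return c.compress(data) + c.flush()


def can_decode(stream, wbits):
    """Return True if a decoder limited to a 2**wbits window can read `stream`."""
    try:
        return zlib.decompress(stream, wbits=wbits) == payload
    except zlib.error:
        # zlib rejects streams whose declared window exceeds the decoder's.
        return False


standard = compress(payload, 15)  # 32 KiB window, the zlib default
small = compress(payload, 9)      # 512-byte window, stand-in for a limited codec

for name, stream in (("standard", standard), ("small-window", small)):
    for wbits in (15, 9):
        print(name, "stream, decoder wbits =", wbits, "->", can_decode(stream, wbits))
```

The asymmetry is the point: a full-window decoder reads both streams, while the small-window decoder reads only the stream produced with a matching window, which is why a reader would need some out-of-band hint about how the file was written.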

What does it bring over ZSTD or LZ4 exactly?

Regards

Antoine.
