Hi Antoine,

Do you mean performance data for HW-GZIP compared with LZ4/ZSTD?

Thanks,
XieQi

-----Original Message-----
From: Antoine Pitrou <anto...@python.org>
Sent: Tuesday, October 20, 2020 10:38 PM
To: dev@arrow.apache.org; Xie, Qi <qi....@intel.com>
Cc: Xu, Cheng A <cheng.a...@intel.com>; Dong, Xin <xin.d...@intel.com>; Zhang, Jie1 <jie1.zh...@intel.com>
Subject: Re: [Discuss] Provide pluggable APIs to support user customized compression codec

On 20/10/2020 at 12:09, Xie, Qi wrote:
> Hi, Wes
>
> Yes, currently the purpose of the key-value metadata is just a hint to
> indicate that the Parquet file was compressed by a plugin, so that the
> Parquet reader can load the plugin library and use the plugin to
> decompress the file.
> There are many optimized GZIP implementations, and they may not be
> compatible with standard gzip. For example, due to hardware limits, the
> HW-GZIP history window size may be smaller than that of standard gzip,
> so HW-GZIP can't decompress a file compressed by standard gzip. Because
> we still use Compression::GZIP as the Compression::type, we need that
> metadata to distinguish it from standard gzip.

What does it bring over ZSTD or LZ4 exactly?

Regards

Antoine.