Hi, thanks for all your great suggestions. 

Seems like we are having an alignment on adding a codec::options class instead 
of passing extra parameters in existing function calls. A virtual base class 
with derived classes sounds like a good idea to add specific codec options for 
different codecs and avoid misuse of unsupported properties. But just need to 
implement CodecOptions for each Codec.

May I draft a PR or design doc first based on our discussion for your review? 
Or do we need more discussions/votes on the details before jumping to the code? 
Thanks!

Best,
Yang

-----Original Message-----
From: Antoine Pitrou <anto...@python.org> 
Sent: Sunday, April 23, 2023 11:03 PM
To: dev@arrow.apache.org
Subject: Re: [DISCUSS][C++][Parquet] Expose the API to customize the 
compression parameter


The most idiomatic option seems to be a virtual base class with derived classes 
as required:

```
class CodecOptions {
  public:
   virtual ~CodecOptions() = default;

   int compression_level;
};

class ZlibCodecOptions : public CodecOptions {
  public:
   ~CodecOptions() = default;

   int window_bits;
};

class Codec {
  public:
   static Result<std::unique_ptr<Codec>> Create(
       Compression::type codec, const CodecOptions* = nullptr); }; ```


Le 23/04/2023 à 16:50, Gang Wu a écrit :
> It is a good idea to extend the Codec factory to offer more options.
> However, I don't think adding
> a `window_bits` parameter as `compression_level` is a good approach as 
> it does not apply to some codecs.
> 
> IMO, the proposed new `Codec::Options` can be as simple as a 
> std::map<std::string, std::string>.
> To avoid misuse, we need to formalize all acceptable keys somewhere 
> and specific codecs only need to deal with config keys that they know.
> 
> WDYT?
> 
> Best,
> Gang
> 
> On Sun, Apr 23, 2023 at 5:38 PM Yang, Yang10 <yang10.y...@intel.com> wrote:
> 
>> Hi,
>>
>> As discussed in this issue: 
>> https://github.com/apache/arrow/issues/35287,
>> currently Arrow only supports one parameter: compression_level to be 
>> customized. We would like to make more compression parameters (such 
>> as
>> window_bits) customizable when creating the Codec, given the variety 
>> of usage scenarios. As suggested by @kou, we may introduce a new 
>> options class such as Codec::Options to make the structure clear and 
>> easy to extend. But it may take some effort as this is more like a code 
>> structure refactor.
>> Passing a parameter directly is another approach, easy to implement 
>> but may be hard to extend. So we would like some further discussion 
>> here. If you have any suggestion or comments, please share them on above 
>> issue or here.
>> Thanks!
>>
>> Best,
>> Yang
>>
> 

Reply via email to