Hi, thanks for all your great suggestions. Seems like we are having an alignment on adding a codec::options class instead of passing extra parameters in existing function calls. A virtual base class with derived classes sounds like a good idea to add specific codec options for different codecs and avoid misuse of unsupported properties. But just need to implement CodecOptions for each Codec.
May I draft a PR or design doc first based on our discussion for your review? Or do we need more discussions/votes on the details before jumping to the code? Thanks! Best, Yang -----Original Message----- From: Antoine Pitrou <anto...@python.org> Sent: Sunday, April 23, 2023 11:03 PM To: dev@arrow.apache.org Subject: Re: [DISCUSS][C++][Parquet] Expose the API to customize the compression parameter The most idiomatic option seems to be a virtual base class with derived classes as required: ``` class CodecOptions { public: virtual ~CodecOptions() = default; int compression_level; }; class ZlibCodecOptions : public CodecOptions { public: ~CodecOptions() = default; int window_bits; }; class Codec { public: static Result<std::unique_ptr<Codec>> Create( Compression::type codec, const CodecOptions* = nullptr); }; ``` Le 23/04/2023 à 16:50, Gang Wu a écrit : > It is a good idea to extend the Codec factory to offer more options. > However, I don't think adding > a `window_bits` parameter as `compression_level` is a good approach as > it does not apply to some codecs. > > IMO, the proposed new `Codec::Options` can be as simple as a > std::map<std::string, std::string>. > To avoid misuse, we need to formalize all acceptable keys somewhere > and specific codecs only need to deal with config keys that they know. > > WDYT? > > Best, > Gang > > On Sun, Apr 23, 2023 at 5:38 PM Yang, Yang10 <yang10.y...@intel.com> wrote: > >> Hi, >> >> As discussed in this issue: >> https://github.com/apache/arrow/issues/35287, >> currently Arrow only supports one parameter: compression_level to be >> customized. We would like to make more compression parameters (such >> as >> window_bits) customizable when creating the Codec, given the variety >> of usage scenarios. As suggested by @kou, we may introduce a new >> options class such as Codec::Options to make the structure clear and >> easy to extend. But it may take some effort as this is more like a code >> structure refactor. >> Passing a parameter directly is another approach, easy to implement >> but may be hard to extend. So we would like some further discussion >> here. If you have any suggestion or comments, please share them on above >> issue or here. >> Thanks! >> >> Best, >> Yang >> >