On Tue, Mar 24, 2020 at 9:22 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> > Compression ratios ranging from ~50% with LZ4 and ~75% with ZSTD on
> > the Taxi dataset to ~87% with LZ4 and ~90% with ZSTD on the Fannie Mae
> > dataset. So that's a huge space savings
>
> One more question on this. What was the average row-batch size used? I
> see in the proposal some buffers might not be compressed, did you use this
> feature in the test?

I used a 64K row batch size. I haven't implemented the optional
non-compressed buffers (for cases where there is little space savings)
so everything is compressed. I can check different batch sizes if you
like

> On Mon, Mar 23, 2020 at 4:40 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi folks,
> >
> > Sorry it's taken me a little while to produce supporting benchmarks.
> >
> > * I implemented experimental trivial body buffer compression in
> > https://github.com/apache/arrow/pull/6638
> > * I hooked up the Arrow IPC file format with compression as the new
> > Feather V2 format in
> > https://github.com/apache/arrow/pull/6694#issuecomment-602906476
> >
> > I tested a couple of real-world datasets from a prior blog post
> > https://ursalabs.org/blog/2019-10-columnar-perf/ with ZSTD and LZ4
> > codecs
> >
> > The complete results are here
> > https://github.com/apache/arrow/pull/6694#issuecomment-602906476
> >
> > Summary:
> >
> > * Compression ratios ranging from ~50% with LZ4 and ~75% with ZSTD on
> > the Taxi dataset to ~87% with LZ4 and ~90% with ZSTD on the Fannie Mae
> > dataset. So that's a huge space savings
> > * Single-threaded decompression speeds exceeding 2-4GByte/s with LZ4
> > and 1.2-3GByte/s with ZSTD
> >
> > I would have to do some more engineering to test throughput changes
> > with Flight, but given these results on slower networking (e.g. 1
> > Gigabit) my guess is that the compression and decompression overhead
> > is little compared with the time savings due to high compression
> > ratios. If people would like to see these numbers to help make a
> > decision I can take a closer look
> >
> > As far as what Micah said about having a limited number of
> > compressors: I would be in favor of having just LZ4 and ZSTD. It seems
> > anecdotally that these outperform Snappy in most real world scenarios
> > and generally have > 1 GB/s decompression performance.
> > Some Linux distributions (Arch at least) have already started adopting
> > ZSTD over LZMA or GZIP [1]
> >
> > - Wes
> >
> > [1]:
> > https://www.archlinux.org/news/now-using-zstandard-instead-of-xz-for-package-compression/
> >
> > On Fri, Mar 6, 2020 at 8:42 AM Fan Liya <liya.fa...@gmail.com> wrote:
> >
> > > Hi Wes,
> > >
> > > Thanks a lot for the additional information.
> > > Looking forward to seeing the good results from your experiments.
> > >
> > > Best,
> > > Liya Fan
> > >
> > > On Thu, Mar 5, 2020 at 11:42 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > > I see, thank you.
> > > >
> > > > For such a scenario, implementations would need to define a
> > > > "UserDefinedCodec" interface to enable codecs to be registered from
> > > > third party code, similar to what is done for extension types [1]
> > > >
> > > > I'll update this thread when I get my experimental C++ patch up to see
> > > > what I'm thinking at least for the built-in codecs we have like ZSTD.
> > > >
> > > > [1]:
> > > > https://github.com/apache/arrow/blob/apache-arrow-0.16.0/docs/source/format/Columnar.rst#extension-types
> > > >
> > > > On Thu, Mar 5, 2020 at 7:56 AM Fan Liya <liya.fa...@gmail.com> wrote:
> > > >
> > > > > Hi Wes,
> > > > >
> > > > > Thanks a lot for your further clarification.
> > > > >
> > > > > Some of my preliminary thoughts:
> > > > >
> > > > > 1. We assign a unique GUID to each pair of compression/decompression
> > > > > strategies. The GUID is stored as part of the Message.custom_metadata.
> > > > > When receiving the GUID, the receiver knows which decompression
> > > > > strategy to use.
> > > > >
> > > > > 2. We serialize the decompression strategy, and store it into the
> > > > > Message.custom_metadata. The receiver can decompress data after
> > > > > deserializing the strategy.
> > > > >
> > > > > Method 1 is generally used in static strategy scenarios while method 2
> > > > > is generally used in dynamic strategy scenarios.
> > > > >
> > > > > Best,
> > > > > Liya Fan
> > > > >
> > > > > On Wed, Mar 4, 2020 at 11:39 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > >
> > > > > > Okay, I guess my question is how the receiver is going to be able to
> > > > > > determine how to "rehydrate" the record batch buffers:
> > > > > >
> > > > > > What I've proposed amounts to the following:
> > > > > >
> > > > > > * UNCOMPRESSED: the current behavior
> > > > > > * ZSTD/LZ4/...: each buffer is compressed and written with an int64
> > > > > > length prefix
> > > > > >
> > > > > > (I'm close to putting up a PR implementing an experimental version of
> > > > > > this that uses Message.custom_metadata to transmit the codec, so this
> > > > > > will make the implementation details more concrete)
> > > > > >
> > > > > > So in the USER_DEFINED case, how will the library know how to obtain
> > > > > > the uncompressed buffer? Is some additional metadata structure
> > > > > > required to provide instructions?
> > > > > >
> > > > > > On Wed, Mar 4, 2020 at 8:05 AM Fan Liya <liya.fa...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Wes,
> > > > > > >
> > > > > > > I am thinking of adding an option named "USER_DEFINED" (or something
> > > > > > > similar) to enum CompressionType in your proposal.
> > > > > > > IMO, this option should be used primarily in Flight.
> > > > > > >
> > > > > > > Best,
> > > > > > > Liya Fan
> > > > > > >
> > > > > > > On Wed, Mar 4, 2020 at 11:12 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > > > >
> > > > > > > > On Tue, Mar 3, 2020, 8:11 PM Fan Liya <liya.fa...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Sure. I agree with you that we should not overdo this.
> > > > > > > > > I am wondering if we should provide an option to allow users to
> > > > > > > > > plug in their customized compression strategies.
> > > > > > > >
> > > > > > > > Can you provide a patch showing changes to Message.fbs (or
> > > > > > > > Schema.fbs) that make this idea more concrete?
> > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Liya Fan
> > > > > > > > >
> > > > > > > > > On Tue, Mar 3, 2020 at 9:47 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > On Tue, Mar 3, 2020, 7:36 AM Fan Liya <liya.fa...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > I am so glad to see this discussion, and I am willing to provide
> > > > > > > > > > > help from the Java side.
> > > > > > > > > > >
> > > > > > > > > > > In the proposal, I see the support for basic compression
> > > > > > > > > > > strategies (e.g. gzip, snappy).
> > > > > > > > > > > IMO, applying a single basic strategy is not likely to achieve
> > > > > > > > > > > performance improvement for most scenarios.
> > > > > > > > > > > The optimal compression strategy is often obtained by composing
> > > > > > > > > > > basic strategies and tuning parameters.
> > > > > > > > > > >
> > > > > > > > > > > I hope we can support such highly customized compression
> > > > > > > > > > > strategies.
> > > > > > > > > >
> > > > > > > > > > I think anything much beyond trivial one-shot buffer-level
> > > > > > > > > > compression is probably out of the question for addition to the
> > > > > > > > > > current "RecordBatch" Flatbuffers type, because the additional
> > > > > > > > > > metadata would add undesirable bloat (which I would be against).
> > > > > > > > > > If people have other ideas it would be great to see exactly what
> > > > > > > > > > you are thinking as far as changes to the protocol files.
> > > > > > > > > >
> > > > > > > > > > I'll try to assemble some examples to show the before/after
> > > > > > > > > > results of applying the simple strategy.
> > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Liya Fan
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Mar 3, 2020 at 8:15 PM Antoine Pitrou <anto...@python.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > If we want to use an HTTP header, it would be more of an
> > > > > > > > > > > > Accept-Encoding header, no?
> > > > > > > > > > > >
> > > > > > > > > > > > In any case, we would have to put non-standard values there
> > > > > > > > > > > > (e.g. lz4), so I'm not sure how desirable it is to repurpose
> > > > > > > > > > > > HTTP headers for that, rather than add some dedicated field to
> > > > > > > > > > > > the Flight messages.
> > > > > > > > > > > >
> > > > > > > > > > > > Regards
> > > > > > > > > > > >
> > > > > > > > > > > > Antoine.
> > > > > > > > > > > >
> > > > > > > > > > > > On 03/03/2020 at 12:52, David Li wrote:
> > > > > > > > > > > > > gRPC supports headers so for Flight, we could send essentially
> > > > > > > > > > > > > an Accept header and perhaps a Content-Type header.
> > > > > > > > > > > > >
> > > > > > > > > > > > > David
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Mar 2, 2020, 23:15 Micah Kornfield <emkornfi...@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >> Hi Wes,
> > > > > > > > > > > > >> A few thoughts on this. In general, I think it is a good idea.
> > > > > > > > > > > > >> But before proceeding, I think the following points are worth
> > > > > > > > > > > > >> discussing:
> > > > > > > > > > > > >> 1. Does this actually improve throughput/latency for Flight? (I
> > > > > > > > > > > > >> think you mentioned you would follow up with benchmarks).
> > > > > > > > > > > > >> 2. I think we should limit the number of supported compression
> > > > > > > > > > > > >> schemes to only 1 or 2. I think the criteria for selection are
> > > > > > > > > > > > >> speed and native implementations available across the widest
> > > > > > > > > > > > >> possible set of languages. As far as I can tell zstd only has
> > > > > > > > > > > > >> bindings in Java via JNI, but my understanding is it is probably
> > > > > > > > > > > > >> the type of compression for our use-cases. So I think zstd +
> > > > > > > > > > > > >> potentially 1 more.
> > > > > > > > > > > > >> 3. Commitment from someone on the Java side to implement this.
> > > > > > > > > > > > >> 4. This doesn't need to be coupled with this change per se but
> > > > > > > > > > > > >> for something like Flight it would be good to have a standard
> > > > > > > > > > > > >> mechanism for negotiating server/client capabilities (e.g.
> > > > > > > > > > > > >> client doesn't support compression or only supports a subset).
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> Thanks,
> > > > > > > > > > > > >> Micah
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> On Sun, Mar 1, 2020 at 1:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>> On Sun, Mar 1, 2020 at 3:14 PM Antoine Pitrou <anto...@python.org> wrote:
> > > > > > > > > > > > >>>>
> > > > > > > > > > > > >>>> On 01/03/2020 at 22:01, Wes McKinney wrote:
> > > > > > > > > > > > >>>>> In the context of a "next version of the Feather format" ARROW-5510
> > > > > > > > > > > > >>>>> (which is consumed only by Python and R at the moment), I have been
> > > > > > > > > > > > >>>>> looking at compressing buffers using fast compressors like ZSTD when
> > > > > > > > > > > > >>>>> writing the RecordBatch bodies. This could be handled privately as an
> > > > > > > > > > > > >>>>> implementation detail of the Feather file, but since ZSTD compression
> > > > > > > > > > > > >>>>> could improve throughput in Flight, for example, I thought I would
> > > > > > > > > > > > >>>>> bring it up for discussion.
> > > > > > > > > > > > >>>>>
> > > > > > > > > > > > >>>>> I can see two simple compression strategies:
> > > > > > > > > > > > >>>>>
> > > > > > > > > > > > >>>>> * Compress the entire message body in one-shot, writing the result out
> > > > > > > > > > > > >>>>> with an 8-byte int64 prefix indicating the uncompressed size
> > > > > > > > > > > > >>>>> * Compress each non-zero-length constituent Buffer prior to writing to
> > > > > > > > > > > > >>>>> the body (and using the same uncompressed-length-prefix when writing
> > > > > > > > > > > > >>>>> the compressed buffer)
> > > > > > > > > > > > >>>>>
> > > > > > > > > > > > >>>>> The latter strategy is preferable for scenarios where we may project
> > > > > > > > > > > > >>>>> out only a few fields from a larger record batch (such as reading from
> > > > > > > > > > > > >>>>> a memory-mapped file).
> > > > > > > > > > > > >>>>
> > > > > > > > > > > > >>>> Agreed. It may also allow using different compression strategies for
> > > > > > > > > > > > >>>> different kinds of buffers (for example a bytestream splitting strategy
> > > > > > > > > > > > >>>> for floats and doubles, or a delta encoding strategy for integers).
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> If we wanted to allow for different compression to apply to different
> > > > > > > > > > > > >>> buffers, I think we will need a new Message type because this would
> > > > > > > > > > > > >>> inflate metadata sizes in a way that is not likely to be acceptable
> > > > > > > > > > > > >>> for the current uncompressed use case.
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> Here is my strawman proposal
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> https://github.com/apache/arrow/compare/master...wesm:compression-strawman
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>>>> Implementation could be accomplished by one of the following methods:
> > > > > > > > > > > > >>>>>
> > > > > > > > > > > > >>>>> * Setting a field in Message.custom_metadata
> > > > > > > > > > > > >>>>> * Adding a new field to Message
> > > > > > > > > > > > >>>>
> > > > > > > > > > > > >>>> I think it has to be a new field in Message. Making it an ignorable
> > > > > > > > > > > > >>>> metadata field means non-supporting receivers will decode and interpret
> > > > > > > > > > > > >>>> the data wrongly.
> > > > > > > > > > > > >>>>
> > > > > > > > > > > > >>>> Regards
> > > > > > > > > > > > >>>>
> > > > > > > > > > > > >>>> Antoine.
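
For readers following the thread, the per-buffer scheme discussed above (each non-zero-length buffer compressed on its own and written with an int64 uncompressed-length prefix) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the Arrow patches: it uses Python's standard zlib as a stand-in codec (ZSTD and LZ4 bindings are not in the standard library), and the little-endian prefix layout, the helper names, and the (offset, length) layout list are all hypothetical.

```python
import struct
import zlib

# Assumption: the prefix is an 8-byte little-endian signed int64 holding the
# uncompressed size. The thread only says "int64 length prefix".
PREFIX = struct.Struct("<q")


def compress_buffer(buf: bytes) -> bytes:
    """Compress one buffer and prepend its uncompressed length."""
    if len(buf) == 0:
        # Zero-length buffers are written as-is; there is nothing to compress.
        return b""
    # zlib is a stand-in here for ZSTD or LZ4.
    return PREFIX.pack(len(buf)) + zlib.compress(buf)


def decompress_buffer(data: bytes) -> bytes:
    """Inverse of compress_buffer, checking the length prefix."""
    if len(data) == 0:
        return b""
    (size,) = PREFIX.unpack_from(data, 0)
    out = zlib.decompress(data[PREFIX.size:])
    if len(out) != size:
        raise ValueError("length prefix does not match decompressed size")
    return out


def write_body(buffers):
    """Compress each buffer independently.

    Returns the concatenated body and a hypothetical layout list of
    (offset, compressed_length) pairs, one per buffer.
    """
    body = bytearray()
    layout = []
    for buf in buffers:
        chunk = compress_buffer(buf)
        layout.append((len(body), len(chunk)))
        body += chunk
    return bytes(body), layout


def read_buffer(body: bytes, layout, index: int) -> bytes:
    """Decompress a single buffer without touching the others."""
    offset, length = layout[index]
    return decompress_buffer(body[offset:offset + length])


if __name__ == "__main__":
    buffers = [b"\x00" * 4096, b"", b"abc" * 1000]
    body, layout = write_body(buffers)
    # Only the third buffer is decompressed here.
    assert read_buffer(body, layout, 2) == buffers[2]
```

Because every buffer is compressed independently, read_buffer only decompresses the bytes for the requested buffer, which is the property that makes the per-buffer strategy attractive when projecting a few fields out of a larger record batch.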