Thanks Wes. Will do.

On Sat, May 16, 2020 at 7:06 PM Wes McKinney <wesmck...@gmail.com> wrote:
> hi Amol,
>
> thanks for pointing that out. Such a heuristic (observing compression
> ratios of stream messages) could be implemented at some point so that
> compression could be toggled off mid-stream if it doesn't seem to be
> helping. Feel free to open a JIRA issue about this.
>
> - Wes
>
> On Sat, May 16, 2020 at 6:39 AM Amol Umbarkar <amolumbar...@gmail.com> wrote:
> >
> > Hello All,
> > I was going through the Dask developer log recently. Dask seems to
> > apply compression selectively, only when it is found to be useful. They
> > compress a sample of about 10 kB up front to measure the compression
> > ratio, and if the results are good then the whole batch is compressed.
> > This seems to save decompression effort on the receiver side.
> >
> > Please take a look at
> > https://blog.dask.org/2016/04/14/dask-distributed-optimizing-protocol#problem-3-unwanted-compression
> >
> > Thought this could be relevant to Arrow batch transfers as well.
> >
> > Thanks,
> > Amol
> >
> > On Thu, Apr 23, 2020 at 5:54 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > I have proposed adding a simple RecordBatch IPC message body
> > > compression scheme (using either LZ4 or ZSTD) to the Arrow IPC
> > > protocol in GitHub PR [1] as discussed on the mailing list [2]. This
> > > is distinct from separate discussions about adding in-memory encodings
> > > (like RLE-encoding) to the Arrow columnar format.
> > >
> > > This change is not forward compatible so it will not be safe to send
> > > compressed messages to old libraries, but since we are still pre-1.0.0
> > > the consensus is that this is acceptable. We may separately consider
> > > increasing the metadata version for 1.0.0 to require clients to
> > > upgrade.
> > >
> > > Please vote whether to accept the addition. The vote will be open for
> > > at least 72 hours.
> > >
> > > [ ] +1 Accept this addition to the IPC protocol
> > > [ ] +0
> > > [ ] -1 Do not accept the changes because...
> > >
> > > Here is my vote: +1
> > >
> > > Thanks,
> > > Wes
> > >
> > > [1]: https://github.com/apache/arrow/pull/6707
> > > [2]: https://lists.apache.org/thread.html/r58c9d23ad159644fca590d8f841df80d180b11bfb72f949d601d764b%40%3Cdev.arrow.apache.org%3E
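
For reference, the sampling heuristic discussed in this thread (compress a small sample first, and compress the whole body only if the sample shrinks enough) could be sketched roughly as below. This is only an illustration under stated assumptions, not Dask's or Arrow's actual implementation: zlib from the Python standard library stands in for LZ4/ZSTD, and the names maybe_compress, decode, SAMPLE_SIZE and MIN_RATIO, as well as the 1.1 threshold, are hypothetical.

```python
import os
import zlib
from typing import Tuple

SAMPLE_SIZE = 10_000  # bytes sampled up front (the ~10 kB mentioned in the thread)
MIN_RATIO = 1.1       # hypothetical threshold: sample must shrink by roughly 10%


def maybe_compress(body: bytes,
                   sample_size: int = SAMPLE_SIZE,
                   min_ratio: float = MIN_RATIO) -> Tuple[bool, bytes]:
    """Return (is_compressed, payload); compress only if a sample compresses well."""
    if not body:
        return False, body
    # Compress only the leading sample to estimate the achievable ratio cheaply.
    sample = body[:sample_size]
    ratio = len(sample) / len(zlib.compress(sample))
    if ratio < min_ratio:
        # The sample barely shrinks: send raw so the receiver skips decompression.
        return False, body
    compressed = zlib.compress(body)
    if len(compressed) >= len(body):
        # The sample was misleading; fall back to the raw body.
        return False, body
    return True, compressed


def decode(is_compressed: bool, payload: bytes) -> bytes:
    """Receiver side: decompress only when the sender flagged the payload."""
    return zlib.decompress(payload) if is_compressed else payload


# Usage: a repetitive body is likely to be compressed, random bytes are not.
for body in (b"arrow" * 50_000, os.urandom(100_000)):
    flag, payload = maybe_compress(body)
    assert decode(flag, payload) == body
```

A stream writer could apply the same check per message and stop compressing once several consecutive messages fail it, which is closer to the mid-stream toggle Wes describes; where the flag would live in the IPC metadata is not something this sketch attempts to answer.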