What about the JNI bindings for lz4-c?


On 11/03/2021 at 18:20, Micah Kornfield wrote:
I looked a little closer, and it looks like it only supports the Block
format (I couldn't find any references to Frame in the code).

On Thu, Mar 11, 2021 at 9:16 AM Antoine Pitrou <anto...@python.org> wrote:


Have you tried another Java LZ4 library (I think you mentioned Airlift
on a PR)?


On 11/03/2021 at 17:58, Micah Kornfield wrote:
We've found in the process of implementing support for LZ4 decompression
that the fast Java decoder library does not support all the features of
the C++ library (dependent blocks can't be read, and by default that is
what the C++ code emits).  The only library we found that seems to
support the full specification (Apache Commons) is unusably slow because
it doesn't directly support off-heap data.

I don't recall seeing a discussion on the merits of using LZ4 Frame vs
LZ4 Block compression in the Arrow IPC format, so I'm not sure if there
is a strong rationale for one versus the other.

At this point I think for interoperability we have three options:
1.  Specify in the specification that "independent" blocks must be used
    for LZ4_FRAME.
2.  Add LZ4_BLOCK to the specification and prefer that over LZ4_FRAME.
3.  Provide our own Java implementation (either directly in Arrow or by
    providing a patch to another project) that supports dependent blocks.
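
To illustrate what option 1 would mean for a reader, here is a minimal
sketch of checking whether an LZ4 frame declares independent blocks.  It
relies only on the published LZ4 Frame layout (a 4-byte little-endian
magic number 0x184D2204 followed by the FLG byte, whose bit 5 is the
block-independence flag).  The class and method names are hypothetical
and are not part of Arrow or of any existing LZ4 library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not an Arrow or lz4-java API.
public final class Lz4FrameFlags {
    // Magic number that opens every LZ4 frame (stored little-endian).
    static final int LZ4_FRAME_MAGIC = 0x184D2204;
    // Bit 5 of the FLG byte: set when each block can be decoded on its own.
    static final int BLOCK_INDEPENDENCE_BIT = 0x20;

    private Lz4FrameFlags() {}

    /**
     * Returns true if the frame header declares independent blocks.
     * Works on heap and direct (off-heap) buffers alike and does not
     * move the caller's buffer position.
     */
    static boolean usesIndependentBlocks(ByteBuffer frame) {
        if (frame.remaining() < 5) {
            throw new IllegalArgumentException("too short for an LZ4 frame header");
        }
        // Read through a duplicate so the caller's position is untouched.
        ByteBuffer view = frame.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        if (view.getInt() != LZ4_FRAME_MAGIC) {
            throw new IllegalArgumentException("not an LZ4 frame");
        }
        int flg = view.get() & 0xFF;
        return (flg & BLOCK_INDEPENDENCE_BIT) != 0;
    }
}
```

A reader could call such a check before decoding and reject frames with
dependent blocks with a clear error instead of failing mid-stream.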

Any thoughts?

Thanks,
Micah


