>
> Could you share the benchmark code/how the benchmark was run (does this
> account for JIT warm-up time)?
I just used the benchmark by the aircompressor project. They run the
benchmark for a lot of algorithms on a lot of datasets so I commented out
some to get faster results. You can find my ve
>
> I executed some of the benchmarks in the airlift/aircompressor project. I
> found that aircompressor achieves on average only about 72%
> throughput compared to the current version of the lz4-java JNI bindings
> when compressing. When decompressing the gap is even bigger with around 56%
> thro
On 22/03/2021 at 15:29, Benjamin Wilhelm wrote:
Also, I would like to resume the discussion about the Frame format vs the
Block format. There were 3 points for the Frame format by Antoine:
- it allows streaming compression and decompression (meaning you can
avoid loading a huge compressed b
I executed some of the benchmarks in the airlift/aircompressor project. I
found that aircompressor achieves on average only about 72%
throughput compared to the current version of the lz4-java JNI bindings
when compressing. When decompressing the gap is even bigger with around 56%
throughput. See
>
> I would start looking into the JNI approach. Contributing back
> to lz4-java or adding this to Arrow.
A first step might be to compare the performance of the JNI approach vs
Airlift. The airlift library only uses Java and claims to be potentially
faster. A JNI approach has the downside of r
>
> > 1) contribute the missing support ourselves
> I actually think we might need to proceed with this option.
I agree. I am willing to help with this and explore and try different
approaches. I would start looking into the JNI approach. Contributing back
to lz4-java or adding this to Arrow.
Be
>
> 1) contribute the missing support ourselves
I actually think we might need to proceed with this option. Even more
unfortunately, I think the best place at the moment for the contribution
to live is within Arrow. Fortunately, I think a port of the existing
Apache Commons library for off-hea
If you look at
https://github.com/lz4/lz4-java/graphs/contributors?from=2019-12-28&to=2021-03-17&type=c,
lz4-java seems to be receiving very little maintenance. So I think
there are two possible avenues:
1) contribute the missing support ourselves
2) use another LZ4 library for Java
Solut
FYI, I opened up https://github.com/lz4/lz4-java/issues/176 to discuss
support for dependent frames.
On Thu, Mar 11, 2021 at 11:59 AM David Li wrote:
> At least for Flight, I don't think we'd use that. Right now the way
> compression is supported is the same way as with Feather, i.e. the body
>
At least for Flight, I don't think we'd use that. Right now the way compression
is supported is the same way as with Feather, i.e. the body buffers in each
individual record batch sent on the wire are compressed, but not the stream as
a whole. (And so far we haven't found a compelling benefit fo
On 11/03/2021 at 19:54, Micah Kornfield wrote:
Indeed, I don't think it was discussed publicly. The LZ4 frame format
has several things going for it:
- it allows streaming compression and decompression (meaning you can
avoid loading a huge compressed buffer at once)
Is this something we m
>
> Indeed, I don't think it was discussed publicly. The LZ4 frame format
> has several things going for it:
> - it allows streaming compression and decompression (meaning you can
> avoid loading a huge compressed buffer at once)
Is this something we make use of or intend to make use of?
> - it
"Is https://github.com/lz4/lz4-java the fast Java lz4 library in
question? The incompleteness of this implementation is a known problem
for other user communities, not only Arrow. It would be a great public
service to improve it so that it fully implements the lz4 frame
specification."
Very much +
I prefer the lz4 frame format for the reasons that Antoine stated.
To be friendly to users, the Arrow IPC documentation could mention
that lz4 compression may break Java interoperability. If block
dependency is the only obstacle to Java interoperability, the Arrow
IPC implementation could disable
On 11/03/2021 at 17:58, Micah Kornfield wrote:
We've found in the process of implementing support for LZ4 decompression
that the fast Java decoder library does not support all the features of the
C++ library (dependent blocks can't be read, and by default that is what
the C++ code emits).
What about the JNI bindings for lz4-c?
On 11/03/2021 at 18:20, Micah Kornfield wrote:
I looked a little closer and it looks like it only supports Block format
(in the code I couldn't find any references to Frame).
On Thu, Mar 11, 2021 at 9:16 AM Antoine Pitrou wrote:
Have you tr
I looked a little closer and it looks like it only supports Block format
(in the code I couldn't find any references to Frame).
On Thu, Mar 11, 2021 at 9:16 AM Antoine Pitrou wrote:
>
> Have you tried another Java LZ4 library (I think you mentioned Airlift
> on a PR)?
>
>
> On 11/03/2021
Have you tried another Java LZ4 library (I think you mentioned Airlift
on a PR)?
On 11/03/2021 at 17:58, Micah Kornfield wrote:
We've found in the process of implementing support for LZ4 decompression
that the fast Java decoder library does not support all the features of the
C++ library
We've found in the process of implementing support for LZ4 decompression
that the fast Java decoder library does not support all the features of the
C++ library (dependent blocks can't be read, and by default that is what
the C++ code emits). The only library we found (Apache Commons) that seem