>
> 1) contribute the missing support ourselves

I actually think we might need to proceed with this option.  Even more
unfortunately, I think the best place at the moment for the contribution
to live is within Arrow.  Fortunately, I think a port of the existing
Apache Commons library for off-heap use should be relatively easy.  We can
reach out to Apache Commons to see if they would be interested in this
contribution, but I would guess not, since I don't think there is a lot of
off-heap logic in the library in general (but my knowledge is stale here).

> 2) use another LZ4 library for Java


We are using the only library I could find that seems to have full support
for LZ4 Frame data.  Unfortunately, it is purely on-heap, which I believe is
the source of the performance problems.
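
To make the on-heap concern concrete, here is a rough sketch of the extra
copies that a byte[]-only codec forces when the data lives in direct
(off-heap) buffers.  This is not the actual Arrow or Commons code:
`HeapOnlyCodec` and `decompressViaHeap` are hypothetical names, and the
identity "codec" in `main` only exists to keep the example runnable.

```java
import java.nio.ByteBuffer;

public class HeapCopySketch {
    /** Hypothetical stand-in for a codec that only accepts byte[] (on-heap) input. */
    public interface HeapOnlyCodec {
        byte[] decompress(byte[] compressed);
    }

    /** Decompresses an off-heap buffer through an on-heap-only codec: two extra copies. */
    public static ByteBuffer decompressViaHeap(ByteBuffer offHeapInput, HeapOnlyCodec codec) {
        // Copy 1: off-heap -> on-heap, because the codec cannot read direct memory.
        // duplicate() leaves the caller's position untouched.
        byte[] heapIn = new byte[offHeapInput.remaining()];
        offHeapInput.duplicate().get(heapIn);

        byte[] heapOut = codec.decompress(heapIn);

        // Copy 2: on-heap -> off-heap, so the result lives in a direct buffer again.
        ByteBuffer offHeapOut = ByteBuffer.allocateDirect(heapOut.length);
        offHeapOut.put(heapOut);
        offHeapOut.flip();
        return offHeapOut;
    }

    public static void main(String[] args) {
        ByteBuffer in = ByteBuffer.allocateDirect(4);
        in.put(new byte[] {1, 2, 3, 4});
        in.flip();
        // Identity "codec": an assumption used only so the sketch runs end to end.
        ByteBuffer out = decompressViaHeap(in, bytes -> bytes.clone());
        System.out.println(out.remaining() + " bytes, direct=" + out.isDirect());
    }
}
```

An off-heap port would presumably read and write the direct buffers in
place and skip both copies.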

On Wed, Mar 17, 2021 at 7:15 AM Antoine Pitrou <anto...@python.org> wrote:

>
> If you look at
>
> https://github.com/lz4/lz4-java/graphs/contributors?from=2019-12-28&to=2021-03-17&type=c,
>
> lz4-java seems to be receiving very little maintenance.  So I think
> there are two possible avenues:
>
> 1) contribute the missing support ourselves
> 2) use another LZ4 library for Java
>
> Solution #2 seems more reasonable to me.
>
> Regards
>
> Antoine.
>
>
> Le 11/03/2021 à 21:05, Micah Kornfield a écrit :
> > FYI, I opened up https://github.com/lz4/lz4-java/issues/176 to discuss
> > support for dependent frames.
> >
> > On Thu, Mar 11, 2021 at 11:59 AM David Li <lidav...@apache.org> wrote:
> >
> >> At least for Flight, I don't think we'd use that. Right now the way
> >> compression is supported is the same way as with Feather, i.e. the body
> >> buffers in each individual record batch sent on the wire are compressed,
> >> but not the stream as a whole. (And so far we haven't found a compelling
> >> benefit for compression in Flight in general.)
> >>
> >> Best,
> >> David
> >>
> >> On Thu, Mar 11, 2021, at 14:34, Antoine Pitrou wrote:
> >>>
> >>> Le 11/03/2021 à 19:54, Micah Kornfield a écrit :
> >>>>>
> >>>>> Indeed, I don't think it was discussed publicly.  The LZ4 frame format
> >>>>> has several things going for it:
> >>>>> - it allows streaming compression and decompression (meaning you can
> >>>>> avoid loading a huge compressed buffer at once)
> >>>>
> >>>> Is this something we make use of or intend to make use of?
> >>>
> >>> Good question.  Currently we don't.  Perhaps David Li wants to answer
> >>> this, since he's been working a lot on Flight.
> >>>
> >>>>> - it embeds the decompressed size, allowing exact allocation of the
> >>>>> decompressed buffer
> >>>>
> >>>> IIUC, we already do this in the IPC specification (the first 8 bytes of the
> >>>> compressed buffer are used for this).
> >>>
> >>> Ah, you're right.  It doesn't matter then.
> >>>
> >>>>> - it has an optional checksum
> >>>>
> >>>> This seems like a good thing, so probably worth keeping (although it would
> >>>> be the only place where we do checksums today).
> >>>
> >>> (or of course we could add an optional higher-level checksum in the IPC
> >>> format)
> >>>
> >>> Regards
> >>>
> >>> Antoine.
> >>>
> >>
> >
>
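
For reference, the 8-byte length prefix mentioned in the quoted thread can
be sketched roughly as below.  This is a minimal illustration, not the
Arrow Java implementation: the class and method names are hypothetical,
and it assumes the prefix is a little-endian signed 64-bit integer, which
is my understanding of the IPC format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LengthPrefixSketch {
    /** Reads the 8-byte uncompressed-length prefix without moving the buffer position. */
    public static long uncompressedLength(ByteBuffer compressedBuffer) {
        // Assumption: the prefix is a little-endian signed 64-bit integer.
        return compressedBuffer.duplicate().order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    /** Allocates an exactly sized off-heap destination for the decompressed bytes. */
    public static ByteBuffer allocateOutput(ByteBuffer compressedBuffer) {
        long length = uncompressedLength(compressedBuffer);
        if (length < 0 || length > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("unsupported length: " + length);
        }
        return ByteBuffer.allocateDirect((int) length);
    }

    public static void main(String[] args) {
        // Fake "compressed" buffer: an 8-byte prefix followed by three payload bytes.
        ByteBuffer buf = ByteBuffer.allocate(8 + 3).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(1024L).put(new byte[] {1, 2, 3});
        buf.flip();
        System.out.println("output capacity: " + allocateOutput(buf).capacity());
    }
}
```

Because the size is known up front, the destination can be allocated once
without relying on the size embedded in the LZ4 frame header.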
