> > 1) contribute the missing support ourselves
I actually think we might need to proceed with this option. Even more unfortunately, I think the best place for the contribution to live at the moment is within Arrow. Fortunately, I think a port of the existing Apache Commons library for off-heap use should be relatively easy. We can reach out to Apache Commons to see if they would be interested in this contribution, but I would guess not, since I don't think there is a lot of off-heap logic in the library in general (but my knowledge is stale here).

> > 2) use another LZ4 library for Java

We are using the only library I could find that seems to have full support for LZ4 Frame data. Unfortunately, it is purely on-heap, which I believe is the source of the performance problems.

On Wed, Mar 17, 2021 at 7:15 AM Antoine Pitrou <anto...@python.org> wrote:

>
> If you look at
> https://github.com/lz4/lz4-java/graphs/contributors?from=2019-12-28&to=2021-03-17&type=c,
> lz4-java seems to be receiving very little maintenance. So I think
> there are two possible avenues:
>
> 1) contribute the missing support ourselves
> 2) use another LZ4 library for Java
>
> Solution #2 seems more reasonable to me.
>
> Regards
>
> Antoine.
>
>
> On 11/03/2021 at 21:05, Micah Kornfield wrote:
> > FYI, I opened up https://github.com/lz4/lz4-java/issues/176 to discuss
> > support for dependent frames.
> >
> > On Thu, Mar 11, 2021 at 11:59 AM David Li <lidav...@apache.org> wrote:
> >
> >> At least for Flight, I don't think we'd use that. Right now the way
> >> compression is supported is the same way as with Feather, i.e. the body
> >> buffers in each individual record batch sent on the wire are compressed,
> >> but not the stream as a whole. (And so far we haven't found a compelling
> >> benefit for compression in Flight in general.)
> >>
> >> Best,
> >> David
> >>
> >> On Thu, Mar 11, 2021, at 14:34, Antoine Pitrou wrote:
> >>>
> >>> On 11/03/2021 at 19:54, Micah Kornfield wrote:
> >>>>>
> >>>>> Indeed, I don't think it was discussed publicly. The LZ4 frame format
> >>>>> has several things going for it:
> >>>>> - it allows streaming compression and decompression (meaning you can
> >>>>> avoid loading a huge compressed buffer at once)
> >>>>
> >>>> Is this something we make use of or intend to make use of?
> >>>
> >>> Good question. Currently we don't. Perhaps David Li wants to answer
> >>> this, since he's been working a lot on Flight.
> >>>
> >>>>> - it embeds the decompressed size, allowing exact allocation of the
> >>>>> decompressed buffer
> >>>>
> >>>> IIUC, we already do this in the IPC specification (the first 8 bytes of the
> >>>> compressed buffer are used for this).
> >>>
> >>> Ah, you're right. It doesn't matter then.
> >>>
> >>>> - it has an optional checksum
> >>>>
> >>>> This seems like a good thing, so probably worth keeping (although it would
> >>>> be the only place where we do checksums today).
> >>>
> >>> (or of course we could add an optional higher-level checksum in the IPC
> >>> format)
> >>>
> >>> Regards
> >>>
> >>> Antoine.
> >>>
> >>
> >
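
For reference, the IPC detail mentioned in the thread (the first 8 bytes of a compressed buffer carry the decompressed size) might look roughly like this in Java. This is only a sketch using plain java.nio, not Arrow's actual code: the class and method names are hypothetical, the little-endian byte order is an assumption not stated in the thread, and negative prefix values are simply rejected here rather than interpreted.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only; not part of Arrow or lz4-java.
public class CompressedBufferPrefix {
    /** Size in bytes of the decompressed-length prefix described in the thread. */
    static final int PREFIX_BYTES = 8;

    /** Reads the 8-byte length prefix at the buffer's current position. */
    static long readUncompressedLength(ByteBuffer buf) {
        if (buf.remaining() < PREFIX_BYTES) {
            throw new IllegalArgumentException("buffer shorter than the 8-byte prefix");
        }
        // duplicate() resets the byte order to big-endian, so set it explicitly.
        // Assumption: the prefix is stored little-endian.
        ByteBuffer view = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        // Absolute read: the caller's position is left untouched.
        return view.getLong(view.position());
    }

    /** Allocates an off-heap output buffer of exactly the decompressed size. */
    static ByteBuffer allocateOutput(ByteBuffer compressed) {
        long n = readUncompressedLength(compressed);
        if (n < 0 || n > Integer.MAX_VALUE) {
            // This sketch does not interpret negative or oversized values.
            throw new IllegalArgumentException("unsupported length: " + n);
        }
        return ByteBuffer.allocateDirect((int) n);
    }
}
```

Because the size is known up front this way, the size embedded in the LZ4 frame header is not needed for allocation, which is the point conceded above. allocateDirect is used only to illustrate an off-heap destination; a real implementation would presumably go through Arrow's own allocator.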