Ah, that's where it was. Ok, so if I understand correctly, individual buffers are compressed, and in the Buffer struct, the buffer length is the _compressed_ length? And when written, the _uncompressed_ length is first written in 8 bytes, then the compressed buffer?
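To check that I'm reading the layout right, here is a rough sketch of what I think writing and reading one compressed buffer looks like. It is only an illustration of my understanding, not the actual implementation: the function names are made up, zlib is just a stand-in so it runs with the Python standard library (the real codecs would be LZ4 or ZSTD), and I'm assuming the 8-byte length is a little-endian signed 64-bit integer.

```python
import struct
import zlib  # stand-in codec only; the real format uses LZ4 or ZSTD


def write_compressed_buffer(raw: bytes) -> bytes:
    """Hypothetical helper: 8-byte uncompressed length, then compressed bytes."""
    compressed = zlib.compress(raw)
    # Assumption: the length prefix is a little-endian signed 64-bit integer.
    return struct.pack("<q", len(raw)) + compressed


def read_compressed_buffer(body: bytes) -> bytes:
    """Hypothetical helper: read the length prefix, then decompress everything."""
    (uncompressed_len,) = struct.unpack_from("<q", body, 0)
    raw = zlib.decompress(body[8:])
    if len(raw) != uncompressed_len:
        raise ValueError("length prefix does not match decompressed size")
    return raw


# Round-trip a small buffer of four 32-bit integers.
values = struct.pack("<4i", 10, 20, 30, 40)
body = write_compressed_buffer(values)
restored = read_compressed_buffer(body)
```

In this sketch the whole buffer is decompressed up front, and the length prefix lets the reader size or validate the output before touching any element.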
What's the general strategy for dealing with compressed buffers? Uncompress the whole thing when deserializing a compressed buffer? Or is decompressing delayed until individual elements are accessed? I'm guessing the former since it doesn't seem like you'd be able to do random-access into a compressed buffer?

-Jacob

On Tue, Sep 15, 2020 at 6:23 PM Wes McKinney <wesmck...@gmail.com> wrote:

> We have protocol-level compression for message body buffers [1][2]
> with LZ4 or ZSTD
>
> In-memory compression and encoding other than dictionary encoding
> (like RLE) has been discussed multiple times and remains on the
> roadmap for the project.
>
> [1]: https://github.com/apache/arrow/blob/master/format/Message.fbs#L45
>
> On Tue, Sep 15, 2020 at 7:18 PM Jacob Quinn <quinn.jac...@gmail.com>
> wrote:
> >
> > Am I correct in understanding there's nothing in the arrow ipc/file
> > format spec about compression? I thought I had seen something at one
> > point, but looking over the spec website, I don't see anything.
> >
> > -Jacob