Hi François,

Thanks so much for the very detailed explanation; that makes sense to me. I will check out the links for more information.
@Wes, ARROW-8250 is very useful to me as well and I will keep an eye on it. Thanks.

On Wed, Jun 24, 2020 at 11:08 PM Wes McKinney <wesmck...@gmail.com> wrote:
> See also this JIRA regarding adding random access read APIs for IPC
> files (and thus Feather):
>
> https://issues.apache.org/jira/browse/ARROW-8250
>
> I hope to see this implemented someday.
>
> On Wed, Jun 24, 2020 at 10:03 AM Francois Saint-Jacques
> <fsaintjacq...@gmail.com> wrote:
> >
> > I forgot to mention that you can see how this is glued together in
> > `feather::Reader::Read` [1]. It makes it obvious that nothing is
> > cached and everything is loaded into memory.
> >
> > François
> >
> > [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/feather.cc#L715-L723
> >
> > On Wed, Jun 24, 2020 at 10:53 AM Francois Saint-Jacques
> > <fsaintjacq...@gmail.com> wrote:
> > >
> > > Hello Yue,
> > >
> > > FeatherV2 is just a facade for the Arrow IPC file format; you can
> > > find the implementation here [1]. I will try to answer your
> > > questions with inline comments. At a high level, the file format
> > > writes a schema and then multiple "chunks" called RecordBatches.
> > > The lowest level of granularity for fetching data is a RecordBatch
> > > [2]. Thus, a Table is divided into multiple RecordBatches at write
> > > time, and the file stores a series of those batches. When you read
> > > a file, you can either read the whole table or do a point query on
> > > a single RecordBatch, e.g.
> > > `RecordBatchFileReader::ReadRecordBatch(int i)`. If you use the
> > > convenience API for reading the table in a single shot, e.g.
> > > `feather::Reader::Read`, it will decompress all buffers and
> > > materialize everything in memory.
> > >
> > > If you use compression, the data must be copied and decompressed.
> > > In other words, you'll have an RSS of the mmap size plus the
> > > decompressed size.
> > > If you don't use compression, the buffers will be zero-copy
> > > slices of the mmap-ed memory and *could* be lazily loaded until
> > > the pointers are dereferenced. But this assumes that the reader
> > > code doesn't dereference them, which does not always hold; e.g.
> > > sometimes we call `{Array,RecordBatch,Table}::Validate` to ensure
> > > well-formed arrays, and this method may read the buffers for some
> > > types to validate that no segfault will happen at runtime.
> > >
> > > IMHO, mmap and compression for the IPC file format are mutually
> > > exclusive. If you use compression, you lose all the benefits of
> > > mmap and you might as well disable it. If you want lazy loading
> > > and late memory materialization (from disk), turn off compression.
> > >
> > > > 1) If a feather file contains multiple columns, are they
> > > > compressed separately? I assume each column is compressed
> > > > separately, and instead of decompressing the entire feather
> > > > file, only the accessed column will be decompressed. Is that
> > > > correct?
> > >
> > > They are compressed separately [3]. The Reader will decompress all
> > > columns of the requested batch, but you can pass an option to
> > > limit the columns [4] of interest.
> > >
> > > > 2) If a particular column value is randomly accessed via the
> > > > column array's index using mmap, will the entire column data be
> > > > decompressed? I assume only a portion of the column will be
> > > > decompressed. Is this correct?
> > >
> > > The entire column of the RecordBatch will be decompressed (and
> > > stored in memory). If your table has a single RecordBatch, then
> > > yes, the whole column will be decompressed.
> > >
> > > > 3) If only part of the column is decompressed, what is the
> > > > mechanism for caching the decompressed data? For example, if we
> > > > access 10 contiguous array values, do we need to decompress the
> > > > column (or part of the column) multiple times?
> > > > What kind of access pattern could be unfriendly to this cache
> > > > mechanism?
> > > > 4) If there is an internal caching mechanism, is there any way
> > > > users/developers could tune the cache for different use
> > > > scenarios? For example, some fields may store large text data
> > > > which may need a bigger cache.
> > >
> > > There is no caching: the RecordBatchReader yields a fully
> > > materialized batch, and it is up to the caller to decide how to
> > > manage the lifetime of that batch.
> > >
> > > Long story short:
> > > - It seems that you want lazy materialization via mmap to control
> > >   active memory usage. This is not going to work with compression.
> > > - If you use the ReadTable interface of the reader (instead of a
> > >   stream reader), you get a _fully_ materialized table, i.e. each
> > >   RecordBatch is decompressed.
> > >
> > > The Feather public API loads the whole table; you will need to
> > > work with the IPC interface if you want to do stream reading.
> > >
> > > François
> > >
> > > [1] https://github.com/apache/arrow/tree/master/cpp/src/arrow/ipc
> > > [2] https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/writer.h#L65-L90
> > > [3] https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/writer.cc#L113-L255
> > > [4] https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/options.h#L85