Hi there, and happy new year.
I store some data in Arrow IPC files. Two of the fields, a string field and an int32 field, are always fetched together and are accessed in a row-oriented manner, while the other fields are accessed in a columnar manner. I would like to know whether there is a canonical approach for modeling this kind of usage in Arrow.

The IPC files are memory mapped and randomly accessed. Because of the columnar storage, reading the two fields of the same row requires 2 random accesses. Since I know these two fields are always read together, in theory this could be reduced to 1 random access.

Initially I read the documentation on the struct layout (https://arrow.apache.org/docs/format/Columnar.html#struct-layout), but it seems a struct still stores and accesses its child fields in a columnar manner, so it doesn't help. I could probably use some proprietary encoding to pack the two fields into a single field, but that is not elegant and is somewhat less portable.

Is there any canonical approach in Arrow for modeling such usage?

Thanks.

Regards,
Yue
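
P.S. To make the "proprietary encoding" idea concrete, here is a minimal sketch of the kind of packing I mean. It is only an illustration, not something Arrow defines: the function names `pack_row` and `unpack_row` and the layout (a 4-byte little-endian int32 followed by the UTF-8 bytes of the string) are assumptions made up for this example. The packed value could then be stored in a single binary column, so that one lookup returns both fields.

```python
import struct
from typing import Tuple

# Hypothetical layout (an assumption for illustration only):
#   bytes 0..3  -> the int32 field, little-endian
#   bytes 4..   -> the string field, UTF-8 encoded
_INT32 = struct.Struct("<i")


def pack_row(number: int, text: str) -> bytes:
    """Pack the int32 field and the string field into one contiguous value."""
    return _INT32.pack(number) + text.encode("utf-8")


def unpack_row(blob: bytes) -> Tuple[int, str]:
    """Recover both fields from a single packed value."""
    (number,) = _INT32.unpack_from(blob, 0)
    return number, blob[_INT32.size:].decode("utf-8")


# Round trip a few rows to check that the encoding is lossless.
rows = [(7, "alpha"), (-1, ""), (42, "héllo")]
packed = [pack_row(n, s) for n, s in rows]
assert [unpack_row(b) for b in packed] == rows
```

The drawback is the one mentioned above: any reader of the file has to know this layout, which is why I would prefer a canonical Arrow approach if one exists.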