I do still think that having a "packed C struct" type would be a useful thing, but thus far no one has needed it enough to develop something in the columnar format specification.
On Tue, Aug 31, 2021 at 1:33 AM Micah Kornfield <[email protected]> wrote:
>
> Hi Jorge,
> Are there places in the docs that you think this would simplify?
> There is an old JIRA [1] about introducing a c-struct type that I
> think aligns with this observation [1]
>
> -Micah
>
> [1] https://issues.apache.org/jira/browse/ARROW-1790
>
> On Mon, Aug 30, 2021 at 2:57 PM Jorge Cardoso Leitão
> <[email protected]> wrote:
> >
> > Hi,
> >
> > Just came across this curiosity that IMO may help us to design physical
> > types in the future.
> >
> > Not sure if this was mentioned before, but it seems to me that
> > `DaysMilliseconds` and `MonthDayNano` belong to a broader class of physical
> > types "typed tuples" in that they are constructed by defining the tuple
> > `(t_1,t_2,...,t_N)` where t_i (e.g. int32) is representable in memory for a
> > given endianess, and each element of the array is written to the buffer
> > back to back as `<t1 in endianess><t2 in endianess>...<tN in endianess>`.
> >
> > Primitive arrays such as e.g. `Int32Array` are the extreme case where the
> > tuple has a single entry (t1,), which leads to `<int32 in endianess>`. The
> > others are:
> > * DaysMilliseconds = (int32, int32)
> > * MonthDayNano = (int32, int32, int64)
> >
> > In principle, we could re-write the in-memory layout page in these terms
> > that places all the types above in the same "bucket".
> >
> > Best,
> > Jorge
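
For illustration only, the back-to-back "typed tuple" layout described in the quoted message can be sketched with Python's standard `struct` module. This is a rough sketch under assumptions, not how any Arrow implementation builds its buffers: the names `LAYOUTS`, `pack_values` and `unpack_values` are hypothetical, little-endian byte order is assumed, and validity bitmaps are ignored.

```python
import struct

# Hypothetical mapping from type name to a struct format string.
# "<" selects little-endian with standard sizes and no padding,
# so the fields of each element are written back to back.
LAYOUTS = {
    "Int32": "<i",              # (int32,)
    "DaysMilliseconds": "<ii",  # (int32, int32)
    "MonthDayNano": "<iiq",     # (int32, int32, int64)
}


def pack_values(layout, rows):
    """Write each tuple in `rows` into one contiguous bytes buffer."""
    packer = struct.Struct(layout)
    return b"".join(packer.pack(*row) for row in rows)


def unpack_values(layout, buffer):
    """Read the tuples back out of a buffer produced by pack_values."""
    packer = struct.Struct(layout)
    return list(packer.iter_unpack(buffer))


# Two MonthDayNano elements occupy 2 * (4 + 4 + 8) bytes.
buf = pack_values(LAYOUTS["MonthDayNano"], [(1, 15, 1_000), (0, 2, 500)])
values = unpack_values(LAYOUTS["MonthDayNano"], buf)
```

In this view a primitive array is just the one-field case: `pack_values(LAYOUTS["Int32"], [(7,)])` yields the four bytes of a single little-endian int32.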
