>
> AFAIK IPC is just bytes. The alignment is done when they are copied over to
> allocated memory regions.


Agreed: if implementations copy the data, this isn't a concern.  However,
the IPC and File Formats were designed for memory mapping/zero copy.  That
relies on the assumption that kernel pages meet the alignment requirements,
but otherwise a copy should not be strictly necessary.
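
To make the zero-copy case concrete, here is a rough sketch in Python. The
helper names (pad_to, mapped_offset_is_aligned) are made up for
illustration and are not part of any Arrow library. It assumes the file is
mapped starting at offset 0, that the mapping's base address is
page-aligned, and that the requested alignment divides the page size. Under
those assumptions, a buffer's alignment in memory is simply the alignment
of its offset in the file:

```python
import mmap


def pad_to(length: int, alignment: int) -> int:
    """Round length up to the next multiple of alignment (a power of two)."""
    return (length + alignment - 1) & ~(alignment - 1)


def mapped_offset_is_aligned(file_offset: int, alignment: int,
                             page_size: int = mmap.PAGESIZE) -> bool:
    """Check whether a buffer at file_offset is aligned once memory mapped.

    Assumption: the mapping starts at file offset 0 and its base address is
    page-aligned, so only the offset within the file matters.
    """
    if page_size % alignment != 0:
        raise ValueError("alignment must divide the page size")
    return file_offset % alignment == 0


# A 10-byte buffer padded to 8 bytes puts the next buffer at offset 16,
# which is 8-byte aligned but not 64-byte aligned.
next_offset = pad_to(10, 8)
print(mapped_offset_is_aligned(next_offset, 8))
print(mapped_offset_is_aligned(next_offset, 64))
```

So a reader that memory maps a stream written with 8-byte padding cannot
assume 64-byte alignment without checking or copying.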

> Just wasn't sure if it would be breaking implicit
> assumptions by consumers somewhere if they happened to get an IPC stream w/
> record batches that mixed, for example, 8-byte and 64-byte alignments.


I'm not aware of any assumptions here. Given that there isn't a mandate
(and given things like slicing, sharing buffers via the C-ABI, etc.), I
think all code handling Arrow arrays that wants to optimize for an
alignment still needs to verify alignment requirements on a
buffer-by-buffer basis.
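
As an illustration of what such a per-buffer check could look like, here is
a minimal sketch in Python using only the standard library. The function
names are hypothetical and do not come from any Arrow implementation; the
point is only that the check is done on the actual address of each buffer
(including any slice offset), not on what the writer is presumed to have
done:

```python
import ctypes


def buffer_address(buf: bytearray) -> int:
    """Return the memory address of the first byte of a non-empty bytearray."""
    return ctypes.addressof((ctypes.c_char * len(buf)).from_buffer(buf))


def is_aligned(address: int, alignment: int) -> bool:
    """Return True if address is a multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address & (alignment - 1) == 0


def slice_is_aligned(buf: bytearray, byte_offset: int, alignment: int) -> bool:
    """Check the alignment of a slice that starts byte_offset bytes into buf."""
    return is_aligned(buffer_address(buf) + byte_offset, alignment)


# Even if the base buffer happens to be 64-byte aligned, a slice starting
# 8 bytes in is not, so the check has to be repeated per buffer.
data = bytearray(256)
base = buffer_address(data)
first_aligned = -base % 64  # offset of the first 64-byte aligned byte
print(slice_is_aligned(data, first_aligned, 64))
print(slice_is_aligned(data, first_aligned + 8, 64))
```

A consumer could use a check like this to pick between an optimized path
and a fallback (or a copy into aligned memory) for each buffer.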



On Wed, Apr 7, 2021 at 3:23 AM Jorge Cardoso Leitão <
jorgecarlei...@gmail.com> wrote:

> Hi Jacob,
>
> AFAIK IPC is just bytes. The alignment is done when they are copied over to
> allocated memory regions. It is the implementations' responsibility to
> allocate memory regions that are aligned depending on how those bytes
> should be interpreted (e.g. u64 vs u8). This interpretation is induced by
> the relationship between the logical types (e.g. Time32) and its
> corresponding physical types (e.g. 0th buffer is u8, 1st is i32). In this
> sense, afaik IPC does not need to declare byte alignment as they are
> inferred by the corresponding logical type.
>
> Best,
> Jorge
>
>
>
>
> On Wed, Apr 7, 2021 at 7:40 AM Jacob Quinn <quinn.jac...@gmail.com> wrote:
>
> > As far as I can tell, the alignment padding used in an IPC stream/file
> > isn't stored explicitly, and not really "inferrable", though maybe
> > technically possible if you calculated what bytes are *necessary* given a
> > buffer's data vs. what's actually stored.
> >
> > Just wondering if this has been brought up at all to store explicitly; it
> > came up in the Julia implementation when considering "appending" record
> > batches to an IPC stream that has already been written to disk; we
> > originally thought we would need to match alignment used in previously
> > written record batches, but upon further reflection, it seems like
> > technically it wouldn't matter since all buffers have the exact byte
> > counts written anyway. Just wasn't sure if it would be breaking implicit
> > assumptions by consumers somewhere if they happened to get an IPC stream
> > w/ record batches that mixed, for example, 8-byte and 64-byte alignments.
> >
> > -Jacob
> >
>
