Having become familiar with the Arrow memory layout and taken a stab at
an implementation in the Julia language, I've come up with a perhaps naive
question.

A "type" (class) I have defined so far is:

immutable Column{T} <: ArrowColumn{T}
    buffer::Vector{UInt8} # potential reference to mmap
    length::Int32
    null_count::Int32
    nulls::BitVector # null == 0 == false, not-null == 1 == true;
                     # always padded to 64-byte alignments
    values::Vector{T} # always padded to 64-byte alignments
end


which aims to be an array/column that holds any "primitive" bits type `T`.
Note that the fields match the Arrow layout exactly: "length",
"null_count", "nulls", and "values".

The additional reference, however, is the "buffer" field, which holds a
reference to a byte buffer. This would be technically optional if the
`nulls` and `values` fields owned their own memory, but there are other
cases where `buffer` would own, for example, memory-mapped bytes that
`nulls` and `values` would be sharing.
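
To make the sharing case concrete, here is a minimal sketch of the idea
rather than a definitive implementation. It assumes current Julia 1.x
syntax (not the `immutable`-era syntax above), a plain zeroed byte vector
standing in for memory-mapped bytes, and a layout where the validity bitmap
starts at byte 1 and the values start at the next 64-byte boundary. The
names `pad64`, `column_views`, `set_valid!`, and `is_valid` are hypothetical
helpers invented for illustration, not part of any Arrow library.

```julia
# Hypothetical helper names throughout; not part of any Arrow library.

# Round a byte count up to the next multiple of 64.
pad64(n::Integer) = cld(n, 64) * 64

# Carve a validity bitmap and a values view out of one shared byte buffer.
# Assumes the bitmap starts at byte 1 and the values follow at the next
# 64-byte boundary. Neither view copies the underlying bytes.
function column_views(buffer::Vector{UInt8}, ::Type{T}, len::Integer) where {T}
    bitmap_bytes = pad64(cld(len, 8))
    bitmap = view(buffer, 1:bitmap_bytes)
    first_value = bitmap_bytes + 1
    last_value = bitmap_bytes + len * sizeof(T)
    vals = reinterpret(T, view(buffer, first_value:last_value))
    return bitmap, vals
end

# Element i (1-based) maps to byte div(i - 1, 8), least-significant bit
# first; a set bit means not-null.
set_valid!(bitmap, i) = (bitmap[((i - 1) >> 3) + 1] |= 0x01 << ((i - 1) & 7))
is_valid(bitmap, i) = (bitmap[((i - 1) >> 3) + 1] >> ((i - 1) & 7)) & 0x01 == 0x01

len = 5
# Stand-in for memory-mapped bytes: one padded bitmap plus one padded values region.
buffer = zeros(UInt8, pad64(cld(len, 8)) + pad64(len * sizeof(Int32)))
bitmap, vals = column_views(buffer, Int32, len)

vals[1] = 10           # writes through to `buffer`
set_valid!(bitmap, 1)  # marks element 1 as not-null in the shared bytes
```

In this sketch the owner of the memory is `buffer` alone; `bitmap` and
`vals` are only windows onto it, which is the situation the `buffer` field
is meant to cover.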

My question is whether including this additional `buffer` reference in my
class somehow "violates" the Arrow memory layout.

This raises a larger question of what exactly the inter-language "API" looks
like. I'm assuming it's not as strict as needing to be able to pass a
pointer to another process that would be able to auto-wrap it as its own
Arrow structure; but I think I read somewhere that it IS aiming for some
kind of "memcpy" operation. Any light anyone can shed would be most
welcome; help me know if I'm perhaps over-thinking this at this stage.

-Jacob
