Re: NullVector getField() confusion

Micah Kornfield Fri, 22 Jan 2021 20:07:35 -0800

This sounds like a bug from an interop perspective.  Record batches,
including those with null columns, should I think be round-trippable (I'm
surprised this isn't covered in our integration testing).


On Mon, Jan 11, 2021 at 8:44 AM Matt Boughen <[email protected]> wrote:

> Hi
>
> I'm confused by NullVector::getField.
> It is a FieldVector so, according to the javadoc, should be "A vector
> corresponding to a Field in the schema."
> However, getField unconditionally returns a new field - with the name
> "$data$" (which is clearly intentional:
> https://github.com/apache/arrow/pull/1193). It does not return the field
> in the schema that was used to create the vector.
> (there is some indication in that PR that NullVector/ZeroVector is only
> ever supposed to be an inner vector?)
> I have noticed this because for some types, pyarrow serialises fields of
> an empty data table to null-type Arrow fields (
> https://github.com/apache/arrow/issues/2110). When deserializing in Java,
> the original schema field names are not contained in the names of
> VectorSchemaRoot::getFieldVectors (and instead the "$data$" names are
> exposed).
>
> Is this a bug?
>
> Best
> Matt
>
>
>
> (p.s. I emailed [email protected] using the web GUI but it still
> hasn’t shown up, so am trying this instead)
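
For readers hitting the same mismatch, here is a minimal sketch of one possible workaround: take the column names from the schema and pair them with the vectors by position, instead of asking each vector for its field. The classes below (Field, FieldVector, ToyNullVector, ToyIntVector) are toy stand-ins written for this illustration, not the Arrow classes. ToyNullVector only imitates the behaviour described above, where getField() reports "$data$". With real Arrow the two lists would presumably come from VectorSchemaRoot::getSchema and VectorSchemaRoot::getFieldVectors, on the assumption that both are in the same column order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NullVectorNames {

    // Toy stand-in for a schema field; only the name matters here.
    static final class Field {
        final String name;
        Field(String name) { this.name = name; }
    }

    // Toy stand-in for a vector that can report "its" field.
    interface FieldVector {
        Field getField();
    }

    // Imitates the reported behaviour: ignores the schema and returns "$data$".
    static final class ToyNullVector implements FieldVector {
        @Override
        public Field getField() { return new Field("$data$"); }
    }

    // A well-behaved vector that returns the field it was created from.
    static final class ToyIntVector implements FieldVector {
        private final Field field;
        ToyIntVector(Field field) { this.field = field; }
        @Override
        public Field getField() { return field; }
    }

    // Names as seen through the vectors: this is where "$data$" leaks out.
    static List<String> namesFromVectors(List<FieldVector> vectors) {
        List<String> names = new ArrayList<>();
        for (FieldVector v : vectors) {
            names.add(v.getField().name);
        }
        return names;
    }

    // Workaround: trust the schema for names and pair by column position.
    static Map<String, FieldVector> vectorsBySchemaName(
            List<Field> schemaFields, List<FieldVector> vectors) {
        if (schemaFields.size() != vectors.size()) {
            throw new IllegalArgumentException("schema and vectors differ in length");
        }
        Map<String, FieldVector> byName = new LinkedHashMap<>();
        for (int i = 0; i < schemaFields.size(); i++) {
            byName.put(schemaFields.get(i).name, vectors.get(i));
        }
        return byName;
    }

    public static void main(String[] args) {
        List<Field> schema = List.of(new Field("id"), new Field("comment"));
        List<FieldVector> vectors =
                List.of(new ToyIntVector(schema.get(0)), new ToyNullVector());

        System.out.println(namesFromVectors(vectors));                     // [id, $data$]
        System.out.println(vectorsBySchemaName(schema, vectors).keySet()); // [id, comment]
    }
}
```

This only sidesteps the naming problem on the reading side. It does not change what getField() returns, so it is not a substitute for a fix if the behaviour is indeed a bug.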
