On 04/03/2022 at 04:17, Hanqi Wu wrote:
> Hello community,
>
> As per the documentation below, an Arrow StructArray won’t have any
> physical buffers backing it if it doesn’t contain any null values:
>
> https://arrow.apache.org/docs/format/Columnar.html#struct-layout
> However, PyArrow complains if you try to import from C an ArrowArray
> representing a Struct type without a null vector (no nulls), which, according
> to the Arrow spec above, is permitted.
> In more detail, when importing from C, it expects the number of buffers
> to be 1, as coded here:
> https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/cpp/src/arrow/c/bridge.cc#L1332
> This seems to suggest it will always expect the validity bitmap.

Not really.  It expects one entry in the `buffers` array
(`n_buffers == 1`), but the entry can be NULL:

"""The pointer to the null bitmap buffer, if the data type specifies one, MAY be NULL only if ArrowArray.null_count is 0."""

https://arrow.apache.org/docs/format/CDataInterface.html#c.ArrowArray.buffers

You can see the corresponding logic in the import code here:
https://github.com/apache/arrow/blob/8e43f23dcc6a9e630516228f110c48b64d13cec6/cpp/src/arrow/c/bridge.cc#L1423-L1431
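To make the rule concrete, here is a minimal sketch in C. The `ArrowArray` struct is copied from the C Data Interface specification; the function `struct_array_buffers_ok` is a hypothetical helper written for illustration only, not the actual check in bridge.cc. It only encodes the two conditions discussed above: exactly one entry in `buffers`, and that entry may be NULL only when `null_count` is 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Struct definition as given in the Arrow C Data Interface spec. */
struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* Hypothetical helper (not part of Arrow): checks the buffer layout
 * expected for a struct-typed ArrowArray.
 *  - there must be exactly one slot in `buffers` (the validity bitmap slot)
 *  - the pointer in that slot may be NULL only if null_count == 0 */
static bool struct_array_buffers_ok(const struct ArrowArray* array) {
  if (array == NULL || array->n_buffers != 1 || array->buffers == NULL) {
    return false;
  }
  return array->buffers[0] != NULL || array->null_count == 0;
}
```

Under these assumptions, a producer exporting a struct array without nulls would set `n_buffers = 1`, point `buffers` at a one-element array whose only entry is NULL, and set `null_count = 0`. Setting `n_buffers = 0` instead is what the importer rejects.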

Regards

Antoine.
