Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Hanqi Wu
> On Mar 4, 2022, at 9:08 AM, Antoine Pitrou wrote: > > > I opened https://issues.apache.org/jira/browse/ARROW-15846 > Regards > > Antoine. > > > On 04/03/2022 at 15:05, Antoine Pitrou wrote: >> On 04/03/2022 at 15:01, Hanqi Wu wrote: >>> Hi Antoine, >>> >>> I agree n_buffers should s

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Antoine Pitrou
I opened https://issues.apache.org/jira/browse/ARROW-15846 Regards Antoine. On 04/03/2022 at 15:05, Antoine Pitrou wrote: On 04/03/2022 at 15:01, Hanqi Wu wrote: Hi Antoine, I agree n_buffers should still be set to 1. But as per the below PyArrow doc, n_buffers’s value will be 0 if

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Antoine Pitrou
On 04/03/2022 at 15:01, Hanqi Wu wrote: Hi Antoine, I agree n_buffers should still be set to 1. But as per the below PyArrow doc, n_buffers’s value will be 0 if no null values in a struct array. This is what confuses me. "A struct array does not have any additional allocated physical stor

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Hanqi Wu
Hi Antoine, I agree n_buffers should still be set to 1. But as per the below PyArrow doc, n_buffers’s value will be 0 if no null values in a struct array. This is what confuses me. "A struct array does not have any additional allocated physical storage for its values. A struct array must still

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Antoine Pitrou
Hi Hanqi, On 04/03/2022 at 14:53, Hanqi Wu wrote: Hi Antoine, I agree. But my question is for Arrow StructArray with No null values. In this case, as per the documentation, n_buffers should be set to 0. Well, no. As I said, it should still be 1. You can also take a look at the fields p

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Hanqi Wu
Hi Antoine, I agree. But my question is for Arrow StructArray with No null values. In this case, as per the documentation, n_buffers should be set to 0. However, “import_from_c” expects StructArray to always have at least 1 buffer allocated, otherwise it throws an exception. Best, Hanqi > On

Re: [PyArrow] Arrow StructArray buffer allocation

2022-03-04 Thread Antoine Pitrou
On 04/03/2022 at 04:17, Hanqi Wu wrote: Hello community, As per the below documentation, for an Arrow StructArray, it won’t have any physical buffers backing it if it doesn’t contain any null value: https://arrow.apache.org/docs/format/Columnar.html#struct-layout However, in PyArrow, it co
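
For readers following the thread: the point under discussion (n_buffers stays 1 for a struct array even without nulls) can be illustrated with a small ctypes sketch of the ArrowArray struct from the Arrow C data interface. This is only an illustration under stated assumptions, not PyArrow's implementation: the helper name make_struct_array_header is hypothetical, children are left out (n_children = 0) to keep it short, and the release callback is stored as a plain pointer placeholder. The validity slot is still counted, but its pointer may be NULL because null_count is 0.

```python
import ctypes


class ArrowArray(ctypes.Structure):
    """Mirror of struct ArrowArray from the Arrow C data interface."""


ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    # Really a function pointer; kept as an opaque pointer in this sketch.
    ("release", ctypes.c_void_p),
    ("private_data", ctypes.c_void_p),
]


def make_struct_array_header(length):
    """Hypothetical helper: header for a struct array without nulls.

    Children are omitted (n_children = 0), so this is not a complete,
    importable array; it only shows how the buffer slot is described.
    """
    # One slot for the validity bitmap; the pointer itself is NULL
    # because there are no nulls.
    buffers = (ctypes.c_void_p * 1)(None)

    arr = ArrowArray()
    arr.length = length
    arr.null_count = 0
    arr.offset = 0
    arr.n_buffers = 1  # the slot is counted even though it is NULL
    arr.n_children = 0
    arr.buffers = ctypes.cast(buffers, ctypes.POINTER(ctypes.c_void_p))
    # Return the backing array too so it stays alive with the header.
    return arr, buffers


if __name__ == "__main__":
    header, _backing = make_struct_array_header(3)
    print(header.n_buffers, header.null_count, header.buffers[0])
```

A real producer would also fill in children and a release callback before handing the struct to a consumer.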