Thanks Wes, makes sense. I appreciate that there are use cases where both
could be applicable.

In my case, the most applicable example I can think of is unnesting a
ListArray column for a DataFrame (in the future C++ DataFrames API?), similar
to the tidyr unnest function. I don't believe the current implementation would
be able to align the flattened ListArray with the rest of the columns. I'll
see if there's something I can do on this end.
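
To make the alignment problem concrete, here is a rough pure-Python sketch of
the kind of helper discussed in this thread. It is not Arrow API: the function
name unnest_with_parents and its three arguments are hypothetical, and plain
Python sequences stand in for the child array, offsets buffer, and validity
buffer of the List layout. It emits one null per null list value and also
returns the originating row index for each output value, which is what would
be needed to line the result up with the other columns.

```python
from typing import Any, List, Optional, Sequence, Tuple


def unnest_with_parents(
    values: Sequence[Any],
    offsets: Sequence[int],
    validity: Sequence[bool],
) -> Tuple[List[Optional[Any]], List[int]]:
    """Flatten a list column, keeping one output slot per null list.

    Hypothetical helper, not part of Arrow:
    values   -- the child array (what flatten() returns)
    offsets  -- len(validity) + 1 offsets into values
    validity -- False where the list value itself is null
    """
    flat: List[Optional[Any]] = []
    parents: List[int] = []  # row index in the original column
    for row, is_valid in enumerate(validity):
        if not is_valid:
            # Null list: emit a single null so the row is not lost.
            flat.append(None)
            parents.append(row)
            continue
        start, stop = offsets[row], offsets[row + 1]
        for value in values[start:stop]:
            flat.append(value)
            parents.append(row)
    return flat, parents


# Layout assumed for [[0, 1], [0], None, None]
flat, parents = unnest_with_parents(
    values=[0, 1, 0],
    offsets=[0, 2, 3, 3, 3],
    validity=[True, True, False, False],
)
```

In this sketch a valid zero-length list contributes no output rows; whether
such rows should be kept is a separate design choice. The parents list could
then be used to repeat the values of the other columns for each output row.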

On Wed, Sep 25, 2019 at 6:27 PM Wes McKinney <wesmck...@gmail.com> wrote:

> hi Suhail,
>
> This follows the columnar format closely. The List layout is composed
> from a child array providing the "inner" values, which are given the
> List<T> interpretation by adding an offsets buffer, and a validity
> buffer to distinguish null from 0-length list values. So flatten()
> here just returns the child array, which has only 3 values in the
> example you gave.
>
> A function could be written to insert "null" for List values that are
> null, but someone would have to write it and give it a name =)
>
> - Wes
>
> On Wed, Sep 25, 2019 at 5:15 PM Suhail Razzak <suhail.raz...@gmail.com>
> wrote:
> >
> > Hi,
> >
> > I'm working through a certain use case where I'm unnesting ListArrays, but
> > I noticed something peculiar - null ListValues are not retained in the
> > unnested array.
> >
> > E.g.
> > In [0]: arr = pa.array([[0, 1], [0], None, None])
> > In [1]: arr.flatten()
> > Out [1]: [0, 1, 0]
> >
> > While I would have expected [0, 1, 0, null, null].
> >
> > I should note that this works if the None is encapsulated in a list. So I'm
> > guessing this is expected logic and if so, what's the reasoning for that?
> >
> > Thanks,
> > Suhail
>
