Thanks Wes, makes sense. I appreciate that there are use cases where both could be applicable.
In my example, the most applicable use case I can think of is unnesting a ListArray column for a DataFrame (in the future C++ DataFrames API?), similar to the tidyr unnest function. I don't believe the current implementation would be able to align the flattened ListArray with the rest of the columns.

I'll see if there's something I can do on this end.

On Wed, Sep 25, 2019 at 6:27 PM Wes McKinney <wesmck...@gmail.com> wrote:

> hi Suhail,
>
> This follows the columnar format closely. The List layout is composed
> from a child array providing the "inner" values, which are given the
> List<T> interpretation by adding an offsets buffer, and a validity
> buffer to distinguish null from 0-length list values. So flatten()
> here just returns the child array, which has only 3 values in the
> example you gave.
>
> A function could be written to insert "null" for List values that are
> null, but someone would have to write it and give it a name =)
>
> - Wes
>
> On Wed, Sep 25, 2019 at 5:15 PM Suhail Razzak <suhail.raz...@gmail.com>
> wrote:
> >
> > Hi,
> >
> > I'm working through a certain use case where I'm unnesting ListArrays, but
> > I noticed something peculiar - null ListValues are not retained in the
> > unnested array.
> >
> > E.g.
> > In [0]: arr = pa.array([[0, 1], [0], None, None])
> > In [1]: arr.flatten()
> > Out [1]: [0, 1, 0]
> >
> > While I would have expected [0, 1, 0, null, null].
> >
> > I should note that this works if the None is encapsulated in a list. So I'm
> > guessing this is expected logic and if so, what's the reasoning for that?
> >
> > Thanks,
> > Suhail
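
P.S. For reference, here is a rough sketch in plain Python of the kind of null-preserving flatten discussed in this thread. It only models the List layout Wes describes (child values, an offsets buffer, a validity buffer) with ordinary Python lists; it is not pyarrow's implementation and does not call pyarrow. The function name `flatten_keep_nulls` and the returned `parent` index list are hypothetical, made up for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def flatten_keep_nulls(
    values: Sequence[int],
    offsets: Sequence[int],
    validity: Sequence[bool],
) -> Tuple[List[Optional[int]], List[int]]:
    """Flatten a list column, emitting one None per null list.

    Hypothetical helper, not part of pyarrow. Also returns, for every
    output element, the index of the list (row) it came from, which is
    what an unnest would need to line up the other columns.
    """
    flat: List[Optional[int]] = []
    parent: List[int] = []
    for row, is_valid in enumerate(validity):
        if not is_valid:
            # Null list: keep a placeholder so the row is not lost.
            flat.append(None)
            parent.append(row)
            continue
        start, end = offsets[row], offsets[row + 1]
        # Valid list: copy its slice of the child values.
        # A valid 0-length list contributes nothing here.
        flat.extend(values[start:end])
        parent.extend([row] * (end - start))
    return flat, parent


# Assumed layout for pa.array([[0, 1], [0], None, None]):
values = [0, 1, 0]                      # child array, what flatten() returns
offsets = [0, 2, 3, 3, 3]               # row i spans offsets[i]:offsets[i + 1]
validity = [True, True, False, False]   # False marks a null list

flat, parent = flatten_keep_nulls(values, offsets, validity)
print(flat)    # [0, 1, 0, None, None]
print(parent)  # [0, 0, 1, 2, 3]
```

The `parent` indices could then be used to repeat the values of the other columns row by row, which is roughly what a tidyr-style unnest needs. Whether a valid empty list should also produce a placeholder row is a separate design choice that this sketch leaves out.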