For context, there was some discussion on this back in [1]. At that time this was called "sequence view", but I do not like that name. However, "array-view array" is a little confusing. Given that this is similar to a list, can we go with "list-view array"?

> Thanks for the introduction. I'd be interested to hear about the
> applications Velox has found for these vectors, and in what situations they
> are useful. This could be contrasted with the current ListArray
> implementations.

I believe one significant benefit is that take (and by proxy, filter) and
sort are O(# of items) with the proposed format and O(# of bytes) with the
current format. Jorge did some profiling to this effect in [1].

[1] https://lists.apache.org/thread/49qzofswg1r5z7zh39pjvd1m2ggz2kdq

On Tue, Apr 25, 2023 at 3:13 PM Will Jones <will.jones...@gmail.com> wrote:

> Hi Felipe,
>
> Thanks for the introduction. I'd be interested to hear about the
> applications Velox has found for these vectors, and in what situations they
> are useful. This could be contrasted with the current ListArray
> implementations.
>
> IIUC it would be fairly cheap to transform a ListArray to an ArrayView, but
> expensive to go the other way.
>
> Best,
>
> Will Jones
>
> On Tue, Apr 25, 2023 at 3:00 PM Felipe Oliveira Carvalho <
> felipe...@gmail.com> wrote:
>
> > Hi folks,
> >
> > I would like to start a public discussion on the inclusion of a new array
> > format to Arrow - array-view array. The name is also up for debate.
> >
> > This format is inspired by Velox's ArrayVector format [1]. Logically, this
> > array represents an array of arrays. Each element is an array-view (offset
> > and size pair) that points to a range within a nested "values" array
> > (called "elements" in Velox docs). The nested array can be of any type,
> > which makes this format very flexible and powerful.
> >
> > [image: ../_images/array-vector.png]
> > <https://facebookincubator.github.io/velox/_images/array-vector.png>
> >
> > I'm currently working on a C++ implementation and plan to work on a Go
> > implementation to fulfill the two-implementations requirement for format
> > changes.
> >
> > The draft design:
> >
> > - 3 buffers: [validity_bitmap, int32 offsets buffer, int32 sizes buffer]
> > - 1 child array: "values" as an array of the type parameter
> >
> > validity_bitmap is used to differentiate between empty array views
> > (sizes[i] == 0) and NULL array views (validity_bitmap[i] == 0).
> >
> > When the validity_bitmap[i] is 0, both sizes and offsets are undefined (as
> > usual), and when sizes[i] == 0, offsets[i] is undefined. 0 is recommended
> > if setting a value is not an issue to the system producing the arrays.
> >
> > offsets buffer is not required to be ordered and views don't have to be
> > disjoint.
> >
> > [1]
> > https://facebookincubator.github.io/velox/develop/vectors.html#arrayvector
> >
> > Thanks,
> > Felipe O. Carvalho
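
To make the draft design and the take argument above concrete, here is a minimal sketch in plain Python. It is only a toy model under stated assumptions: the class name `ListViewArray` and the helper functions are hypothetical and are not Arrow or Velox APIs, buffers are modeled as Python lists, and the child "values" array holds integers for simplicity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ListViewArray:
    """Hypothetical model of the draft layout; not an Arrow or Velox API."""
    validity: List[bool]  # validity bitmap, one flag per element
    offsets: List[int]    # int32 offsets buffer
    sizes: List[int]      # int32 sizes buffer
    values: List[int]     # child "values" array (ints here for simplicity)

    def __len__(self) -> int:
        return len(self.sizes)

    def get(self, i: int) -> Optional[List[int]]:
        if not self.validity[i]:
            return None   # NULL view: offsets[i] and sizes[i] are undefined
        if self.sizes[i] == 0:
            return []     # empty view: offsets[i] is undefined
        start = self.offsets[i]
        return self.values[start:start + self.sizes[i]]

    def to_pylist(self) -> List[Optional[List[int]]]:
        return [self.get(i) for i in range(len(self))]


def take_list_view(arr: ListViewArray, indices: List[int]) -> ListViewArray:
    """Gather only the per-element buffers; the child array is shared."""
    return ListViewArray(
        validity=[arr.validity[i] for i in indices],
        offsets=[arr.offsets[i] for i in indices],
        sizes=[arr.sizes[i] for i in indices],
        values=arr.values,  # same object, no copy of the child data
    )


def take_list_array(offsets: List[int], values: List[int], indices: List[int]):
    """Take on an offsets-only layout (validity omitted for brevity)."""
    new_offsets, new_values = [0], []
    for i in indices:
        # Each selected element's child data has to be copied.
        new_values.extend(values[offsets[i]:offsets[i + 1]])
        new_offsets.append(len(new_values))
    return new_offsets, new_values


def list_array_to_list_view(offsets: List[int], values: List[int]) -> ListViewArray:
    """Offsets-only layout to view layout: derive sizes, share the values."""
    n = len(offsets) - 1
    return ListViewArray(
        validity=[True] * n,
        offsets=offsets[:-1],
        sizes=[offsets[i + 1] - offsets[i] for i in range(n)],
        values=values,
    )


# Offsets are out of order and the first two views overlap on values[1].
lv = ListViewArray(
    validity=[True, True, False, True],
    offsets=[1, 0, 0, 0],
    sizes=[3, 2, 0, 0],
    values=[10, 20, 30, 40, 50],
)
taken = take_list_view(lv, [3, 0, 0, 2])
print(lv.to_pylist())     # [[20, 30, 40], [10, 20], None, []]
print(taken.to_pylist())  # [[], [20, 30, 40], [20, 30, 40], None]
```

In this model, `take_list_view` touches one offset, one size and one validity flag per selected element, while `take_list_array` copies every selected child value, which is the difference in cost described above. `list_array_to_list_view` illustrates why the ListArray-to-view direction looks cheap: the child array is reused and only the sizes are computed. Going the other way would, in general, require rewriting the child array so that the ranges become contiguous and ordered.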