I think the ArrayVector-style layout has the following benefits:
1. Converting a batch from Velox or another system to an Arrow array could be
   much more lightweight.
2. Modifying, filtering, and copying arrays or strings could be much more
   lightweight.
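
To make point 2 a bit more concrete, here is a rough Python sketch of the
draft layout quoted below. It is only a toy model, not Arrow or Velox code:
the names ArrayViewArray and filter_views are made up for illustration, and
plain lists stand in for the real buffers. The point is that filtering only
rewrites the validity, offsets, and sizes entries, while the child "values"
array is shared as-is.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class ArrayViewArray:
    """Toy model of the draft layout (hypothetical name, not a real API)."""
    validity: List[bool]  # stands in for validity_bitmap
    offsets: List[int]    # stands in for the int32 offsets buffer
    sizes: List[int]      # stands in for the int32 sizes buffer
    values: Sequence      # the single child "values" array

    def to_pylist(self) -> List[Optional[list]]:
        out: List[Optional[list]] = []
        for valid, off, size in zip(self.validity, self.offsets, self.sizes):
            if not valid:
                # NULL view: offsets/sizes are undefined, so do not read them.
                out.append(None)
            else:
                # Empty view when size == 0; the offset is irrelevant then.
                out.append(list(self.values[off:off + size]))
        return out


def filter_views(arr: ArrayViewArray, mask: Sequence[bool]) -> ArrayViewArray:
    """Keep the views where mask is True; the child array is not copied."""
    keep = [i for i, m in enumerate(mask) if m]
    return ArrayViewArray(
        validity=[arr.validity[i] for i in keep],
        offsets=[arr.offsets[i] for i in keep],
        sizes=[arr.sizes[i] for i in keep],
        values=arr.values,  # same object, shared with the input
    )


values = [10, 20, 30, 40, 50]
arr = ArrayViewArray(
    validity=[True, False, True, True],
    offsets=[3, 0, 0, 1],  # not ordered, and views 0 and 3 overlap
    sizes=[2, 0, 0, 3],
    values=values,
)
filtered = filter_views(arr, [True, False, False, True])
```

With an offsets-only list layout, the same filter would normally have to
rebuild the child array so that the remaining ranges stay contiguous and
ordered.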

Velox can make a Vector mutable, whereas it seems an Arrow array cannot be.
That probably makes little difference here, though.

On 2023/04/25 22:00:08 Felipe Oliveira Carvalho wrote:
> Hi folks,
>
> I would like to start a public discussion on the inclusion of a new array
> format to Arrow — array-view array. The name is also up for debate.
>
> This format is inspired by Velox's ArrayVector format [1]. Logically, this
> array represents an array of arrays. Each element is an array-view (offset
> and size pair) that points to a range within a nested "values" array
> (called "elements" in Velox docs). The nested array can be of any type,
> which makes this format very flexible and powerful.
>
> [image: ../_images/array-vector.png]
> <https://facebookincubator.github.io/velox/_images/array-vector.png>
>
> I'm currently working on a C++ implementation and plan to work on a Go
> implementation to fulfill the two-implementations requirement for format
> changes.
>
> The draft design:
>
> - 3 buffers: [validity_bitmap, int32 offsets buffer, int32 sizes buffer]
> - 1 child array: "values" as an array of the type parameter
>
> validity_bitmap is used to differentiate between empty array views
> (sizes[i] == 0) and NULL array views (validity_bitmap[i] == 0).
>
> When the validity_bitmap[i] is 0, both sizes and offsets are undefined (as
> usual), and when sizes[i] == 0, offsets[i] is undefined. 0 is recommended
> if setting a value is not an issue to the system producing the arrays.
>
> offsets buffer is not required to be ordered and views don't have to be
> disjoint.
>
> [1]
> https://facebookincubator.github.io/velox/develop/vectors.html#arrayvector
>
> Thanks,
> Felipe O. Carvalho
>
