+1 (binding)
Despite my earlier misgivings, I think this will be a valuable addition
to the specification.
To clarify, I've interpreted this as a vote on both Utf8View and
BinaryView, as in the linked PR.
On 28/06/2023 20:34, Benjamin Kietzman wrote:
Hello,
I'd like to propose adding Utf8View arrays to the Arrow format.
Previous discussion in [1], columnar format description in [2],
flatbuffers changes in [3].
There are implementations available in both C++ [4] and Go [5] which
exercise the new type over IPC. The Utf8View format demonstrates [6]
significant performance benefits over Utf8 in common tasks.
The vote will be open for at least 72 hours.
[ ] +1 add the proposed Utf8View type to the Apache Arrow format
[ ] -1 do not add the proposed Utf8View type to the Apache Arrow format
because...
Sincerely,
Ben Kietzman
[1] https://lists.apache.org/thread/w88tpz76ox8h3rxkjl4so6rg3f1rv7wt
[2] https://github.com/apache/arrow/blob/46cf7e67766f0646760acefa4d2d01cdfead2d5d/docs/source/format/Columnar.rst#variable-size-binary-view-layout
[3] https://github.com/apache/arrow/pull/35628/files#diff-0623d567d0260222d5501b4e169141b5070eabc2ec09c3482da453a3346c5bf3
[4] https://github.com/apache/arrow/pull/35628
[5] https://github.com/apache/arrow/pull/35769
[6] https://github.com/apache/arrow/pull/35628#issuecomment-1583218617