+1 (binding)

In <20241223.143545.907946641732263753....@clear-code.com>
  "[VOTE] Apache Arrow array representation of statistics" on Mon, 23 Dec 2024 14:35:45 +0900 (JST),
  Sutou Kouhei <k...@clear-code.com> wrote:
> Hi,
>
> I would like to propose standardizing how to represent
> statistics as Apache Arrow array.
>
> Motivation:
>
> * We want to pass not only Apache Arrow data but also
>   statistics of them through the C data interface for query
>   planning.
>
> Approach:
>
> * Define a standardized schema for statistics.
> * Represent statistics as an Apache Arrow array that uses
>   the schema.
> * Pass the statistics Apache Arrow array through the C data
>   interface like a normal Apache Arrow array.
>
> Note that we don't define a new interface for statistics. We
> just use the existing C data interface. A statistics Apache
> Arrow array is passed through a separated API call.
>
> Note that this proposal doesn't define anything about how or
> where to use it. The above example just shows one use-case.
>
> This is based on the previous rejected vote discussion:
> https://lists.apache.org/thread/rsw3wsyj68dksc98s5rpdp6dn8hfk0yd
>
> See also:
>
> * The discussion of this:
>   https://lists.apache.org/thread/b6chzlyn95rztoybs39b6olz907g12gj
> * The PR of this proposal:
>   https://github.com/apache/arrow/pull/45058
> * The preview URL of the PR:
>   http://crossbow.voltrondata.com/pr_docs/45058/format/StatisticsSchema.html
>
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Accept this proposal
> [ ] +0
> [ ] -1 Do not accept this proposal because...
>
>
> Thanks,
> --
> kou