+1

On Mon, Dec 23, 2024 at 2:37 AM Sutou Kouhei <k...@clear-code.com> wrote:

> Hi,
>
> I would like to propose standardizing how to represent
> statistics as an Apache Arrow array.
>
> Motivation:
>
> * We want to pass not only Apache Arrow data but also
>   its statistics through the C data interface for query
>   planning.
>
> Approach:
>
> * Define a standardized schema for statistics.
> * Represent statistics as an Apache Arrow array that uses
>   the schema.
> * Pass the statistics Apache Arrow array through the C data
>   interface like a normal Apache Arrow array.
>
> Note that we don't define a new interface for statistics. We
> just use the existing C data interface. A statistics Apache
> Arrow array is passed through a separate API call.
>
> Note that this proposal doesn't define anything about how or
> where to use it. The motivation above shows just one use case.
>
> This is based on the previous rejected vote discussion:
> https://lists.apache.org/thread/rsw3wsyj68dksc98s5rpdp6dn8hfk0yd
>
> See also:
>
> * The discussion of this:
>   https://lists.apache.org/thread/b6chzlyn95rztoybs39b6olz907g12gj
> * The PR of this proposal:
>   https://github.com/apache/arrow/pull/45058
> * The preview URL of the PR:
>   http://crossbow.voltrondata.com/pr_docs/45058/format/StatisticsSchema.html
>
>
> The vote will be open for at least 72 hours.
>
> [ ] +1 Accept this proposal
> [ ] +0
> [ ] -1 Do not accept this proposal because...
>
>
> Thanks,
> --
> kou
>
