+1 (binding)

I've left a minor comment to solicit concrete examples of data in the statistics array, if this is reasonable.

Best,
Gang

On Thu, Dec 5, 2024 at 11:17 AM wish maple <maplewish...@gmail.com> wrote:

> +1 (non-binding)
>
> Best,
> Xuwei Fu
>
> Sutou Kouhei <k...@clear-code.com> wrote on Thu, Dec 5, 2024 at 10:58:
>
> > Hi,
> >
> > I would like to propose standardizing how to pass statistics
> > through the C data interface.
> >
> > Motivation:
> >
> > * We want to pass not only Apache Arrow data but also
> >   statistics of them through the C data interface for query
> >   planning.
> >
> > Approach:
> >
> > * Define a standardized schema for statistics.
> > * Represent statistics as an Apache Arrow array that uses
> >   the schema.
> > * Pass the statistics Apache Arrow array through the C data
> >   interface like a normal Apache Arrow array.
> >
> > Note that we don't define a new interface for statistics. We
> > just use the existing C data interface. A statistics Apache
> > Arrow array is passed through a separate API call.
> >
> > See also:
> >
> > * The discussion of this:
> >   https://lists.apache.org/thread/z0jz2bnv61j7c6lbk7lympdrs49f69cx
> > * The PR of this proposal that includes the statistics
> >   schema definition:
> >   https://github.com/apache/arrow/pull/43553
> > * The preview URL of the PR:
> >   http://crossbow.voltrondata.com/pr_docs/43553/format/CDataInterfaceStatistics.html
> >
> > Note:
> >
> > * I implemented this proposal only in C++. The
> >   implementation is already merged into apache/arrow. Should
> >   we have one more implementation like format specification
> >   change?
> >   http://crossbow.voltrondata.com/pr_docs/43553/format/Changing.html#at-least-two-reference-implementations
> >
> > The vote will be open for at least 72 hours.
> >
> > [ ] +1 Accept this proposal
> > [ ] +0
> > [ ] -1 Do not accept this proposal because...
> >
> > Thanks,
> > --
> > kou