Hi,

While I'm generally in favor of accepting this soon, I'm -1 on accepting it right now because the PR doesn't seem to have received enough review attention (I posted some comments).

A spec is an important document that will bind us for years, so let's make sure we write something that will not bother us in the future.

Regards

Antoine.


On 05/12/2024 at 03:58, Sutou Kouhei wrote:
Hi,

I would like to propose standardizing how to pass statistics
through the C data interface.

Motivation:

* We want to pass not only Apache Arrow data but also
   its statistics through the C data interface for query
   planning.

Approach:

* Define a standardized schema for statistics.
* Represent statistics as an Apache Arrow array that uses
   the schema.
* Pass the statistics Apache Arrow array through the C data
   interface like a normal Apache Arrow array.

Note that we don't define a new interface for statistics. We
just use the existing C data interface. A statistics Apache
Arrow array is passed through a separate API call.

See also:

* The discussion of this:
   https://lists.apache.org/thread/z0jz2bnv61j7c6lbk7lympdrs49f69cx
* The PR of this proposal that includes the statistics
   schema definition:
   https://github.com/apache/arrow/pull/43553
* The preview URL of the PR:
   http://crossbow.voltrondata.com/pr_docs/43553/format/CDataInterfaceStatistics.html

Note:

* I implemented this proposal only in C++. The
   implementation is already merged into apache/arrow. Should
   we require one more implementation, as we do for format
   specification changes?
   http://crossbow.voltrondata.com/pr_docs/43553/format/Changing.html#at-least-two-reference-implementations


The vote will be open for at least 72 hours.

[ ] +1 Accept this proposal
[ ] +0
[ ] -1 Do not accept this proposal because...


Thanks,
