I don't think a second implementation is strictly necessary because this
is just defining a schema and some conventions around it. Though of
course a second implementation is always better to have.
Regards
Antoine.
On 05/12/2024 at 17:47, Matt Topol wrote:
* I implemented this proposal only in C++. The
implementation is already merged into apache/arrow. Should
we have one more implementation, as we would for a format
specification change?
http://crossbow.voltrondata.com/pr_docs/43553/format/Changing.html#at-least-two-reference-implementations
Sorry to be that guy, but I would prefer having one more implementation as
this qualifies as a format change IMHO. So I'm +0 (binding) on this without
a second implementation. I won't oppose it getting merged, but that would
be my preference.
On Thu, Dec 5, 2024 at 12:48 AM Gang Wu <ust...@gmail.com> wrote:
+1 (binding)
I've left a minor comment to solicit concrete examples of data
in the statistics array if this is reasonable.
Best,
Gang
On Thu, Dec 5, 2024 at 11:17 AM wish maple <maplewish...@gmail.com> wrote:
+1 (non-binding)
Best,
Xuwei Fu
On Thu, Dec 5, 2024 at 10:58 AM Sutou Kouhei <k...@clear-code.com> wrote:
Hi,
I would like to propose standardizing how to pass statistics
through the C data interface.
Motivation:
* We want to pass not only Apache Arrow data but also
its statistics through the C data interface for query
planning.
Approach:
* Define a standardized schema for statistics.
* Represent statistics as an Apache Arrow array that uses
the schema.
* Pass the statistics Apache Arrow array through the C data
interface like a normal Apache Arrow array.
Note that we don't define a new interface for statistics. We
just use the existing C data interface. A statistics Apache
Arrow array is passed through a separate API call.
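
A minimal sketch of the logical shape of such statistics, using plain
Python objects rather than a real Apache Arrow array. It assumes one
record per target, with a nullable column index (None meaning the whole
table) and a map from statistic key to value. StatisticsRecord and
build_statistics are hypothetical names, and the "ARROW:..." keys follow
my reading of the schema in the linked PR, so check them against the
specification. Encoding the records as an Arrow array and exporting it
through the C data interface is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

# Statistic values can have different types; a real Arrow array would need
# a union-like value type for this (assumption based on the linked PR).
StatValue = Union[int, float, str]


@dataclass
class StatisticsRecord:
    # None: statistics for the whole table; otherwise a zero-based column index.
    column: Optional[int]
    # Statistic key -> value, e.g. "ARROW:null_count:exact" -> 1.
    statistics: Dict[str, StatValue]


def build_statistics(table: Dict[str, List[Optional[int]]]) -> List[StatisticsRecord]:
    """Compute exact statistics for a small in-memory table of integer columns."""
    n_rows = len(next(iter(table.values()))) if table else 0
    # The first record carries table-level statistics.
    records = [StatisticsRecord(None, {"ARROW:row_count:exact": n_rows})]
    for index, values in enumerate(table.values()):
        present = [v for v in values if v is not None]
        stats: Dict[str, StatValue] = {
            "ARROW:null_count:exact": len(values) - len(present)
        }
        # Min/max are only defined when at least one non-null value exists.
        if present:
            stats["ARROW:min_value:exact"] = min(present)
            stats["ARROW:max_value:exact"] = max(present)
        records.append(StatisticsRecord(index, stats))
    return records


example = build_statistics({"id": [1, 2, 3], "score": [10, None, 7]})
for record in example:
    print(record.column, record.statistics)
```

A consumer such as a query planner would read these records separately
from the data itself, which matches the "separate API call" described
above.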
See also:
* The discussion of this:
https://lists.apache.org/thread/z0jz2bnv61j7c6lbk7lympdrs49f69cx
* The PR of this proposal that includes the statistics
schema definition:
https://github.com/apache/arrow/pull/43553
* The preview URL of the PR:
http://crossbow.voltrondata.com/pr_docs/43553/format/CDataInterfaceStatistics.html
Note:
* I implemented this proposal only in C++. The
implementation is already merged into apache/arrow. Should
we have one more implementation, as we would for a format
specification change?
http://crossbow.voltrondata.com/pr_docs/43553/format/Changing.html#at-least-two-reference-implementations
The vote will be open for at least 72 hours.
[ ] +1 Accept this proposal
[ ] +0
[ ] -1 Do not accept this proposal because...
Thanks,
--
kou