Hi,
> Why not simply pass the statistics ArrowArray separately in your
> producer API of choice
It seems that we should use this approach because all of the
feedback recommended it. How about the following schema for the
statistics ArrowArray? It's based on ADBC.
| Field Name | Field Type
Hi,
I agree that the proposed approach is a departure from the usual use of
ArrowSchema.
ADBC may be a bit too large to use only for transmitting
statistics. ADBC has statistics-related APIs, but it also has many
other APIs.
> It is also not the first time it has come up to encode
> data-dependent information
Hi,
> One potential challenge with encoding statistics in the schema
> metadata is that some systems may consider this metadata as part of
> assessing schema equivalence.
It's a good point. I didn't notice it. The proposed approach
makes schemas different because they have addresses of
ArrowArray
+1 (binding)
Tested on Debian 12 'bookworm'
On Thu, May 23, 2024, at 11:03, Sutou Kouhei wrote:
> +1 (binding)
>
> I ran the following command line on Debian GNU/Linux sid:
>
> dev/release/verify-release-candidate.sh 0.5.0 0
>
> with:
>
> * Apache Arrow C++ main
> * gcc (Debian 13.2.0-23) 1
+1 (binding)
I ran the following command line on Debian GNU/Linux sid:
dev/release/verify-release-candidate.sh 0.5.0 0
with:
* Apache Arrow C++ main
* gcc (Debian 13.2.0-23) 13.2.0
* R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
* Python 3.11.9
Thanks,
--
kou
In
"[VOTE] Rel
+1 (non-binding)
Verified on MacOS 14 aarch64.
On Wed, May 22, 2024 at 2:55 PM Bryce Mecum wrote:
> +1 (non-binding)
>
> Verified on:
>
> - macOS aarch64
> - Debian 12 x86_64 inside a conda environment (note I had to install
> Python 3.11 separately from the instructions, not sure I missed a
>
+1 (non-binding)
Verified on:
- macOS aarch64
- Debian 12 x86_64 inside a conda environment (note I had to install
Python 3.11 separately from the instructions, not sure I missed a
step)
On Wed, May 22, 2024 at 10:18 AM Dewey Dunnington
wrote:
>
> Hello,
>
> I would like to propose the followin
Hello,
I would like to propose the following release candidate (rc0) of
Apache Arrow nanoarrow [0] version 0.5.0. This is an initial release
consisting of 79 resolved GitHub issues from 9 contributors [1].
This release candidate is based on commit:
c5fb10035c17b598e6fd688ad9eb7b874c7c631b [2]
Th
Hi Kou,
I agree with Dewey that this is overstretching the capabilities of the C
Data Interface. In particular, stuffing a pointer as metadata value and
decreeing it immortal doesn't sound like a good design decision.
Why not simply pass the statistics ArrowArray separately in your
produce
I am definitely in favor of adding (or adopting an existing)
ABI-stable way to transmit statistics (the one that comes up most
frequently for me is just the number of values that are about to show
up in an ArrowArrayStream, since the producer often knows this and the
consumer often would like to pr
Hi,
One potential challenge with encoding statistics in the schema metadata
is that some systems may consider this metadata as part of assessing
schema equivalence.
However, I think the bigger question is what the intended use-case for
these statistics is? Often query engines want to collect