+1 (binding)
I've tested successfully on Ubuntu 22.04 without R.
TEST_R=0 ./verify-release-candidate.sh 0.5.0 0
Regards,
Raúl
On Thu, 23 May 2024 at 6:49, David Li wrote:
>
> +1 (binding)
>
> Tested on Debian 12 'bookworm'
>
> On Thu, May 23, 2024, at 11:03, Sutou Kouhei wrote:
> > +1
Thank you for the background! I understand that these statistics are
important for query planning; however, I am not sure that I follow why
we are constrained to the ArrowSchema to represent them. The examples
given seem to be going through Python...would it be easier to request
statistics at a higher
The adbcdrivermanager, adbcsqlite, and adbcpostgresql packages are all
updated on CRAN!
On Tue, May 21, 2024 at 10:41 PM David Li wrote:
>
> [x] Close the GitHub milestone/project
> [x] Add the new release to the Apache Reporter System
> [x] Upload source release artifacts to Subversion
> [x] Cre
> would it be easier to request statistics at a higher level of abstraction?
What if there were a "single table provider" level of abstraction between
ADBC and ArrowArrayStream as a C API; something that can report statistics
and apply simple predicates?
On Thu, May 23, 2024 at 5:57 AM Dewey Dun
I want to +1 on what Dewey is saying here and some comments.
Sutou Kouhei wrote:
> ADBC may be a bit larger to use only for transmitting statistics. ADBC has
> statistics related APIs but it has more other APIs.
It's impossible to keep the responsibility of communication protocols
cleanly separa
On 23/05/2024 at 16:09, Felipe Oliveira Carvalho wrote:
Protocols that produce/consume statistics might want to use the C Data
Interface as a primitive for passing Arrow arrays of statistics.
This is also my opinion.
I think what we are slowly converging on is the need for a spec to
desc
This is a really exciting development, thank you for putting together this
proposal!
It looks like this thread and the linked GitHub issue have lots of input from
folks who work with Arrow at a low level and have better familiarity with the
Arrow specifications than I do, so I'll refrain from co
Hi Shoumyo,
The problem with communicating data statistics through schema metadata
is that it's not compatible with use cases where you want to know the
schema *before* the data is produced.
Regards
Antoine.
On Thu, 23 May 2024 14:28:43 -
"Shoumyo Chakravorti (BLOOMBERG/ 120 PARK)"
wrot
Thanks Shoumyo for bringing this up!
Using a schema to transmit statistical/data-dependent values is also
something we do in GeoParquet (whose schema also finds its way into
pyarrow and the C data interface when reading). It is usually fine but
occasionally ends up with schema metadata that is lyin
Hello,
I am seeing a deadlock when destructing an ObjectOutputStream. I have
attached the stack trace.
I did some debugging and found that the issue seems to be that the mutex in
question is already held by this thread (I checked the __owner field in the
pthread_mutex_t which points to the hangin
Appreciate the additional context!
> use cases where you want to know the schema *before*
> the data is produced
I think my understanding aligns with Dewey's on this point.
I guess I'm struggling to imagine a scenario where a query
planner would want the schema but not the statistics. Because
by
For what it's worth, DuckDB accesses Arrow data via IPC in an extension then
exports to C data interface to call into code in its core.
Also, assumptions about when query optimization occurs relative to data access
potentially break down in scenarios involving: views, distributed tables,
substr
+1 (non-binding)
I have tested on Ubuntu 22.04
./verify-release-candidate.sh 0.5.0 0
With Regards,
Vibhatha Abeykoon
On Thu, May 23, 2024 at 3:21 PM Raúl Cumplido wrote:
> +1 (binding)
>
> I've tested successfully on Ubuntu 22.04 without R.
>
> TEST_R=0 ./verify-release-candidate.sh 0.5.0 0