Re: [DISC] Improving Arrow's database support

2022-08-25 Thread Sutou Kouhei
Hi, As a maintainer of Linux packages, I want apache/arrow-adbc to be released before apache/arrow is released so that apache/arrow's .deb/.rpm can depend on apache/arrow-adbc's .deb/.rpm. (If Apache Arrow Dataset uses apache/arrow-adbc, apache/arrow's .deb/.rpm needs to depend on apache/arrow-ad

[RFC] Substrait for Flight SQL

2022-08-25 Thread David Li
It was pointed out to me that by using an old thread, people may not have realized there's actually a discussion here. So this is just one final call for comments on a proposal to add support for Substrait [1] to Flight SQL: https://github.com/apache/arrow/pull/13492 The proposal also adds supp

PRs for RLE support

2022-08-25 Thread Tobias Zagorni
Hello Everyone, Recently, I have implemented support for run-length encoding in Arrow C++. So far my implementation is split into different subtasks of ARROW-16771 (https://issues.apache.org/jira/browse/ARROW-16771). I have (draft) PRs available for: - general handling of RLE in arrow C++, Type,

Re: Proposal: A Table Data Structure for Arrow Java

2022-08-25 Thread Larry White
Hi all, Thank you, Antoine and everyone for the feedback. It's been very helpful. The proposal has been updated to incorporate suggested changes and clarify as needed. Several people have expressed support for the idea of using a Java version of ChunkedArrays as the internal representation. I'm w

Re: [DISC] Improving Arrow's database support

2022-08-25 Thread David Li
It currently does do dlopen()/LoadLibrary but based on how it's being used by Python I'm going to refactor that out separately so that the main method of usage will be to pass it a pointer to the driver-specific initialization function. It does not have any notion of internal registry. (And I'm

Re: [DISC] Improving Arrow's database support

2022-08-25 Thread Antoine Pitrou
On 25/08/2022 at 18:19, David Li wrote: Hmm, what is a driver manager exactly? Does it actually manage drivers (how so)? Is it more of a core library? It implements the ADBC API, but dynamically delegates to an actual implementation underneath, so that you do not have to directly link to t

Re: [DISC] Improving Arrow's database support

2022-08-25 Thread David Li
> Hmm, what is a driver manager exactly? Does it actually manage drivers > (how so)? Is it more of a core library? It implements the ADBC API, but dynamically delegates to an actual implementation underneath, so that you do not have to directly link to the driver, or to help deal with using mul

Re: [DISC] Improving Arrow's database support

2022-08-25 Thread Antoine Pitrou
On 25/08/2022 at 17:51, David Li wrote: Fair enough, thank you. I'll try to expand a bit. (Sorry for the wall of text that follows…) These are the components: - Core adbc.h header - Driver manager for C/C++ - Flight SQL-based driver - Postgres-based driver (WIP) - SQLite-based driver (more

Re: [DISC] Improving Arrow's database support

2022-08-25 Thread David Li
Fair enough, thank you. I'll try to expand a bit. (Sorry for the wall of text that follows…) These are the components: - Core adbc.h header - Driver manager for C/C++ - Flight SQL-based driver - Postgres-based driver (WIP) - SQLite-based driver (more of a testbed for me than an actual component

Re: [DISC] Improving Arrow's database support

2022-08-25 Thread Antoine Pitrou
On Fri, 19 Aug 2022 14:09:44 -0400 "David Li" wrote: > Since it's been a while, I'd like to give an update. There are also a few > questions I have around distribution. > > Currently: > - Supported in C, Java, and Python. > - For C/Python, there are basic drivers wrapping Flight SQL and SQLite,

Apache Software Foundation community survey 2022

2022-08-25 Thread Antoine Pitrou
(copied below is a message from the Apache Software Foundation) Hello everyone, The 2022 ASF Community Survey is looking to gather scientific data that allows us to understand our community better, both in its demographic composition, and also in collaboration styles and preferences. We wan

Re: [VOTE] Format: Rules and procedures for Canonical extension types

2022-08-25 Thread Antoine Pitrou
On 25/08/2022 at 02:08, Weston Pace wrote: +1 (non-binding). This is maybe implied but I would add that modification of extension types must also require a vote and should be backwards compatible. Furthermore, extension types (particularly those with extensive parameterization/serialization