I would be very happy to see GLib/Ruby bindings! I'm curious if you have a particular use case in mind.
There's a little bit more API cleanup to do [1]. If you have comments on that or anything else, I'd appreciate them. Otherwise, pull requests would also be appreciated.

[1]: https://github.com/apache/arrow-adbc/issues/79

On Fri, Aug 26, 2022, at 21:53, Sutou Kouhei wrote:
> Hi,
>
> Thanks for sharing the current status!
> I understand.
>
> BTW, can I add GLib/Ruby bindings to apache/arrow-adbc
> before we release the first version? (I want to use ADBC
> from Ruby.) Or should I wait for the first release? If I can
> work on it now, I'll open pull requests for it.
>
> Thanks,
> --
> kou
>
> In <8703efd9-51bd-4f91-b550-73830667d...@www.fastmail.com>
> "Re: [DISC] Improving Arrow's database support" on Fri, 26 Aug 2022
> 11:03:26 -0400,
> "David Li" <lidav...@apache.org> wrote:
>
>> Thank you Kou!
>>
>> At least initially, I don't think I'll be able to complete the Dataset
>> integration in time. So 10.0.0 probably won't ship with a hard dependency.
>> That said I am hoping to have PyArrow take an optional dependency (so Flight
>> SQL can finally be available from Python).
>>
>> On Fri, Aug 26, 2022, at 01:01, Sutou Kouhei wrote:
>>> Hi,
>>>
>>> As a maintainer of Linux packages, I want apache/arrow-adbc
>>> to be released before apache/arrow is released so that
>>> apache/arrow's .deb/.rpm can depend on apache/arrow-adbc's
>>> .deb/.rpm.
>>>
>>> (If Apache Arrow Dataset uses apache/arrow-adbc,
>>> apache/arrow's .deb/.rpm needs to depend on
>>> apache/arrow-adbc's .deb/.rpm.)
>>>
>>> We can add .deb/.rpm related files
>>> (dev/tasks/linux-packages/ in apache/arrow) to
>>> apache/arrow-adbc to build .deb/.rpm for apache/arrow-adbc.
>>>
>>> FYI: I did it for datafusion-contrib/datafusion-c:
>>>
>>> * https://github.com/datafusion-contrib/datafusion-c/tree/main/package
>>> * https://github.com/datafusion-contrib/datafusion-c/blob/main/.github/workflows/package.yaml
>>>
>>> I can work on it in apache/arrow-adbc.
>>>
>>>
>>> Thanks,
>>> --
>>> kou
>>>
>>> In <5cbf2923-4fb4-4c5e-b11d-007209fdd...@www.fastmail.com>
>>> "Re: [DISC] Improving Arrow's database support" on Thu, 25 Aug 2022
>>> 11:51:08 -0400,
>>> "David Li" <lidav...@apache.org> wrote:
>>>
>>>> Fair enough, thank you. I'll try to expand a bit. (Sorry for the wall of
>>>> text that follows…)
>>>>
>>>> These are the components:
>>>>
>>>> - Core adbc.h header
>>>> - Driver manager for C/C++
>>>> - Flight SQL-based driver
>>>> - Postgres-based driver (WIP)
>>>> - SQLite-based driver (more of a testbed for me than an actual component -
>>>>   I don't think we'd actually distribute this)
>>>> - Java core interfaces
>>>> - Java driver manager
>>>> - Java JDBC-based driver
>>>> - Java Flight SQL-based driver
>>>> - Python driver manager
>>>>
>>>> I think: adbc.h gets mirrored into the Arrow repo. The Flight SQL drivers
>>>> get moved to the main Arrow repo and distributed as part of the regular
>>>> Arrow releases.
>>>>
>>>> For the rest of the components: they could be packaged individually, but
>>>> versioned and released together. Also, each C/C++ driver probably needs a
>>>> corresponding Python package so Python users do not have to futz with
>>>> shared library configurations. (See [1].) So for instance, installing
>>>> PyArrow would also give you the Flight SQL driver, and `pip install
>>>> adbc_postgres` would get you the Postgres-based driver.
>>>>
>>>> That would mean setting up separate CI, release, etc. (and eventually
>>>> linking Crossbow & Conbench as well?). That does mean duplication of
>>>> effort, but the trade off is avoiding bloating the main release process
>>>> even further. However, I'd like to hear from those closer to the release
>>>> process on this subject - if it would make people's lives easier, we could
>>>> merge everything into one repo/process.
>>>>
>>>> Integrations would be distributed as part of their respective packages
>>>> (e.g. Arrow Dataset would optionally link to the driver manager). So the
>>>> "part of Arrow 10.0.0" aspect means having a stable interface for adbc.h,
>>>> and getting the Flight SQL drivers into the main repo.
>>>>
>>>> [1]: https://github.com/apache/arrow-adbc/issues/53
>>>>
>>>> On Thu, Aug 25, 2022, at 11:34, Antoine Pitrou wrote:
>>>>> On Fri, 19 Aug 2022 14:09:44 -0400
>>>>> "David Li" <lidav...@apache.org> wrote:
>>>>>> Since it's been a while, I'd like to give an update. There are also a
>>>>>> few questions I have around distribution.
>>>>>>
>>>>>> Currently:
>>>>>> - Supported in C, Java, and Python.
>>>>>> - For C/Python, there are basic drivers wrapping Flight SQL and SQLite,
>>>>>>   with a draft of a libpq (Postgres) driver (using nanoarrow).
>>>>>> - For Java, there are drivers wrapping JDBC and Flight SQL.
>>>>>> - For Python, there's low-level bindings to the C API, and the DBAPI
>>>>>>   interface on top of that (+a few extension methods resembling
>>>>>>   DuckDB/Turbodbc).
>>>>>>
>>>>>> There's drafts of integration with Ibis [1], DBI (R), and DuckDB. (I'd
>>>>>> like to thank Hannes and Kirill for their comments, as well as Antoine,
>>>>>> Dewey, and Matt here.)
>>>>>>
>>>>>> I'd like to have this as part of 10.0.0 in some fashion. However, I'm
>>>>>> not sure how we would like to handle packaging and distribution. In
>>>>>> particular, there are several sub-components for each language (the
>>>>>> driver manager + the drivers), increasing the work. Any thoughts here?
>>>>>
>>>>> Sorry, forgot to answer here. But I think your question is too broadly
>>>>> formulated. It probably deserves a case-by-case discussion, IMHO.
>>>>>
>>>>>> I'm also wondering how we want to handle this in terms of specification
>>>>>> - I assume we'd consider the core header file/Java interfaces a spec
>>>>>> like the C Data Interface/Flight RPC, and vote on them/mirror them into
>>>>>> the format/ directory?
>>>>>
>>>>> That sounds like the right way to me indeed.
>>>>>
>>>>> Regards
>>>>>
>>>>> Antoine.