I've set up the new repo and enabled issues. I still need to get things building independently of Arrow, but now adbc.h is self-contained and the "driver manager" being prototyped can also be built and used independently of Arrow.
On Wed, Jun 1, 2022, at 13:55, David Li wrote:
> Wes: thanks! I'll move things over and update the list.
>
> Gavin: I mean more that ADBC won't support every little feature in
> JDBC/ODBC, or won't necessarily make it easy to support certain things
> (e.g. updating a single row in a ResultSet). But it's not that OLTP is
> taboo, it's just not what is being optimized for.
>
> For instance it would be nice to eventually have JDBC/ODBC drivers that
> can wrap ADBC in much the same way that Dremio is working on a JDBC
> driver for Flight SQL. But especially in the near term, ADBC just won't
> have the feature set to make that possible.
>
> What sorts of use cases were you thinking about, though?
>
> On Wed, Jun 1, 2022, at 13:18, Gavin Ray wrote:
>> This sounds great, but I had one question:
>>
>> Read the initial ADBC proposal and it mentioned that OLTP was not a
>> targeted usecase
>> If this work is intended to take on the role of a sort of standard ABI/SDK,
>> does that mean that building OLTP-oriented drivers/tooling with it is off
>> the table?
>>
>> On Wed, Jun 1, 2022 at 11:11 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>>> I went ahead and created
>>>
>>> https://github.com/apache/arrow-adbc
>>>
>>> I directed issue comments / PRs to issues@
>>>
>>> On Tue, May 31, 2022 at 8:49 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>> >
>>> > I think spinning up a new repository while this exploratory work
>>> > progresses is a fine idea -- perhaps apache/arrow-dbc / arrow-adbc or
>>> > similar (the name can always be changed later). That would bubble up
>>> > discussions in a way that's easier for people to follow (watching your
>>> > fork isn't ideal!). If it makes sense to move code later, it can
>>> > always be moved.
>>> >
>>> >
>>> > On Tue, May 31, 2022 at 1:02 PM David Li <lidav...@apache.org> wrote:
>>> > >
>>> > > Some updates:
>>> > >
>>> > > The proposal is being updated based on feedback from contributors
>>> > > to DuckDB and DBI. We've been using GitHub issues on the fork to
>>> > > discuss the API design and how to implement data ingestion/bound
>>> > > parameters: https://github.com/lidavidm/arrow/issues
>>> > >
>>> > > If anyone has suggestions/ideas/questions, or would like to jump
>>> > > in as well, please feel free to chime in there too.
>>> > >
>>> > > I have also been wondering if we might want to plan to split off
>>> > > a new repo for this work? In particular, some components might be
>>> > > easiest to consume if they didn't also have a hard dependency on
>>> > > the Arrow C++ libraries. And we could use the repo to manage
>>> > > contributed drivers (some of which may individually leverage the
>>> > > Arrow libraries). Of course, maintaining a parallel build system,
>>> > > setting up releases, etc. is also a lot of work.
>>> > >
>>> > > -David
>>> > >
>>> > > On Tue, Apr 26, 2022, at 15:01, Wes McKinney wrote:
>>> > > > I don't have major new things to add on this topic except that I've
>>> > > > long had the aspiration of creating something like Python's DBAPI 2.0
>>> > > > [1] at the C or C++ level to enable a measure of API standardization
>>> > > > for Arrow-native read/write interfaces with database drivers. It seems
>>> > > > like a natural complement to the wire-protocol standardization work
>>> > > > with FlightSQL. I had previously brought in some code that I had
>>> > > > worked on related to interfacing with the HiveServer2 wire protocol
>>> > > > (for Hive and Impala, or other HS2-compatible query engines) with the
>>> > > > intention of prototyping but never was able to find the time.
>>> > > >
>>> > > > From an external messaging standpoint, one thing that will be
>>> > > > important is to assert that this is not intended to displace or
>>> > > > deprecate ODBC or JDBC drivers. In fact, I would hope that the
>>> > > > Arrow-native APIs could be added somehow to existing driver libraries
>>> > > > where it made sense, so that if they are used in an application that
>>> > > > uses Arrow, they can opt in to using the Arrow-based APIs for getting
>>> > > > result sets, or doing bulk inserts, etc.
>>> > > >
>>> > > > [1]: https://peps.python.org/pep-0249/
>>> > > >
>>> > > > On Tue, Apr 26, 2022 at 12:36 PM Antoine Pitrou <anto...@python.org> wrote:
>>> > > >>
>>> > > >>
>>> > > >> Do we want something more flexible than dlopen() and runtime symbol
>>> > > >> lookup (a mechanism which constrains the way you can organize and
>>> > > >> distribute drivers)?
>>> > > >>
>>> > > >> For example, perhaps we could expose an API struct of function pointers
>>> > > >> that could be obtained through driver-specific means.
>>> > > >>
>>> > > >>
>>> > > >> Le 26/04/2022 à 18:29, David Li a écrit :
>>> > > >> > Hello,
>>> > > >> >
>>> > > >> > In light of recent efforts around Flight SQL, projects like
>>> > > >> > pgeon [1], and long-standing tickets/discussions about
>>> > > >> > database support in Arrow [2], it seems there's an
>>> > > >> > opportunity to define standard database interfaces for Arrow
>>> > > >> > that could unify these efforts. So we've put together a
>>> > > >> > proposal for "ADBC", a common Arrow-based database client API:
>>> > > >> >
>>> > > >> > https://docs.google.com/document/d/1t7NrC76SyxL_OffATmjzZs2xcj1owdUsIF2WKL_Zw1U/edit#heading=h.r6o6j2navi4c
>>> > > >> >
>>> > > >> > A common API and implementations could help combine/simplify
>>> > > >> > client-side projects like pgeon, or what DBI is considering
>>> > > >> > [3], and help them take advantage of developments like Flight
>>> > > >> > SQL and existing columnar APIs.
>>> > > >> >
>>> > > >> > We'd appreciate any feedback. (Comments should be open, please
>>> > > >> > let me know if not.)
>>> > > >> >
>>> > > >> > [1]: https://github.com/0x0L/pgeon
>>> > > >> > [2]: https://issues.apache.org/jira/browse/ARROW-11670
>>> > > >> > [3]: https://github.com/r-dbi/dbi3/issues/48
>>> > > >> >
>>> > > >> > Thanks,
>>> > > >> > David
>>>
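
For anyone skimming the thread, here is a rough sketch of what Antoine's "API struct of function pointers" idea could look like in plain C. Nothing here comes from adbc.h: ExampleDriver, ToyDriverInit, the toy:// URI scheme and the fixed row count are all hypothetical names and behavior made up for illustration. The point is only that the application talks to a table of function pointers, and how that table gets filled in (static linking, dlopen(), something else) is left up to the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; not taken from adbc.h. */
enum { EXAMPLE_OK = 0, EXAMPLE_ERROR = 1 };

/* Hypothetical table of driver entry points. The application only ever
 * calls through these pointers, never through named driver symbols. */
struct ExampleDriver {
  void *private_data;
  int (*connect)(struct ExampleDriver *self, const char *uri);
  int (*row_count)(struct ExampleDriver *self, const char *query, long *out);
  void (*release)(struct ExampleDriver *self);
};

/* A toy in-memory "driver" standing in for a real database client. */
struct ToyState {
  int connected;
};
static struct ToyState toy_state;

static int toy_connect(struct ExampleDriver *self, const char *uri) {
  struct ToyState *state = self->private_data;
  /* Only accept the made-up toy:// scheme. */
  if (uri == NULL || strncmp(uri, "toy://", 6) != 0) return EXAMPLE_ERROR;
  state->connected = 1;
  return EXAMPLE_OK;
}

static int toy_row_count(struct ExampleDriver *self, const char *query,
                         long *out) {
  struct ToyState *state = self->private_data;
  if (!state->connected || query == NULL || out == NULL) return EXAMPLE_ERROR;
  *out = 3; /* The toy driver pretends every query yields three rows. */
  return EXAMPLE_OK;
}

static void toy_release(struct ExampleDriver *self) {
  struct ToyState *state = self->private_data;
  if (state != NULL) state->connected = 0;
  /* Clear the table so stale pointers are not reused by accident. */
  memset(self, 0, sizeof(*self));
}

/* The "driver-specific means" of obtaining the table. Here it is an
 * ordinary function that could be linked statically; a driver manager
 * could just as well locate it with dlopen()/dlsym(). */
int ToyDriverInit(struct ExampleDriver *out) {
  if (out == NULL) return EXAMPLE_ERROR;
  toy_state.connected = 0;
  out->private_data = &toy_state;
  out->connect = toy_connect;
  out->row_count = toy_row_count;
  out->release = toy_release;
  return EXAMPLE_OK;
}

/* Application-side code: returns the row count, or -1 on any failure. */
long example_count_rows(const char *uri, const char *query) {
  struct ExampleDriver driver;
  long rows = -1;
  if (ToyDriverInit(&driver) != EXAMPLE_OK) return -1;
  assert(driver.connect != NULL && driver.row_count != NULL);
  if (driver.connect(&driver, uri) != EXAMPLE_OK ||
      driver.row_count(&driver, query, &rows) != EXAMPLE_OK) {
    rows = -1;
  }
  driver.release(&driver);
  return rows;
}
```

With this shape, example_count_rows() never needs to know where ToyDriverInit came from, so the same application code would work whether the driver is compiled in or loaded at runtime. Whether ADBC should adopt something like this is still open for discussion.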