Re: [VOTE] Adopt ADBC database client connectivity specification

Gavin Ray Thu, 22 Sep 2022 10:14:59 -0700

Antoine, I can't comment on the Go code (not qualified), but to me the
"verification" test examples look like a mixture of JDBC and Java FlightSQL
driver usage, and seem solid.


There was one reservation I had about the ability to handle datasource
namespacing, which I brought up early on in the proposal discussions
(David responded to it, but I got busy and forgot to reply again).

If you have a datasource which provides possibly arbitrary levels of schema
namespace (something like Apache Calcite, for example), how do you represent
the table/schema names?

Suppose I have a service with a DB layout like this:

/ foo
    / bar
        / baz
            / qux
                / table1
                    - column1

At my day job, we have a technology which is very similar to ADBC/FlightSQL
(it would be great to adopt Substrait + ADBC once they're mature enough):
- https://github.com/hasura/graphql-engine/blob/master/dc-agents/README.md#data-connectors
- https://techcrunch.com/2022/06/28/hasura-now-lets-developers-turn-any-data-source-into-a-graphql-api/

We wound up having to redesign the specification to handle datasources that
don't fit the "database-schema-table" or "database-table" mould.

In the ADBC schema for schema metadata, it looks like it expects a single
"schema" struct:
https://github.com/apache/arrow-adbc/blob/7866a566f5b7b635267bfb7a87ea49b01dfe89fa/java/core/src/main/java/org/apache/arrow/adbc/core/StandardSchemas.java#L132-L152

If you want to be flexible, IMO it would be good to either:

1. Have DB_SCHEMA_SCHEMA be self-recursive, so that schemas (with or
   without tables) can be nested arbitrarily deep underneath each other
     - The Fully-Qualified-Table-Name (FQTN) can then be computed by walking
       up from a table and concatenating the schema names until the root
       schema is reached

2. Make "catalog" and "schema" go away entirely, so that tables just have a
FQTN that is an array and a database is a collection of tables
     - You can compute what would have been the catalog + schema hierarchy
by doing a .reduce() over the list of tables and

Or maybe there is another, better way. But that's my $0.02 and the only
real concern about the API I have, without actually trying to build
something with it.
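
To make the two options above a bit more concrete, here is a rough Python
sketch. It is purely illustrative and not part of ADBC: SchemaNode, fqtn,
add_table and hierarchy are hypothetical names invented for this example, and
the nested-dict layout is just one assumed way to represent the derived
hierarchy.

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Dict, List, Optional


# Option 1 (assumed shape): a self-recursive schema node.
# SchemaNode is a hypothetical type, not an ADBC structure.
@dataclass
class SchemaNode:
    name: str
    parent: Optional["SchemaNode"] = None
    tables: List[str] = field(default_factory=list)


def fqtn(schema: SchemaNode, table: str) -> List[str]:
    """Walk up from the table's schema to the root, collecting names."""
    parts = [table]
    node: Optional[SchemaNode] = schema
    while node is not None:
        parts.append(node.name)
        node = node.parent
    parts.reverse()  # root first, table last
    return parts


# Option 2 (assumed shape): every table carries its FQTN as an array,
# and the namespace hierarchy is derived from the flat list of tables.
def new_node() -> Dict:
    return {"schemas": {}, "tables": []}


def add_table(tree: Dict, path: List[str]) -> Dict:
    """Fold one table path into the nested namespace tree."""
    node = tree
    for part in path[:-1]:
        node = node["schemas"].setdefault(part, new_node())
    node["tables"].append(path[-1])
    return tree


def hierarchy(tables: List[List[str]]) -> Dict:
    """Rebuild the namespace tree by reducing over all table paths."""
    return reduce(add_table, tables, new_node())


# The layout from earlier in this mail: foo / bar / baz / qux / table1
foo = SchemaNode("foo")
bar = SchemaNode("bar", parent=foo)
baz = SchemaNode("baz", parent=bar)
qux = SchemaNode("qux", parent=baz, tables=["table1"])

print(fqtn(qux, "table1"))
print(hierarchy([["foo", "bar", "baz", "qux", "table1"]]))
```

Either way the client ends up with the same information; the difference is
only whether the nesting is stored explicitly (option 1) or recomputed from
the table paths on demand (option 2).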





On Thu, Sep 22, 2022 at 5:40 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Hello,
>
> I would urge people to review the proposed ADBC APIs, especially the Go
> and Java APIs which probably benefitted from less feedback than the C one.
>
> Regards
>
> Antoine.
>
>
> Le 21/09/2022 à 17:40, David Li a écrit :
> > Hello,
> >
> > We have been discussing [1] standard interfaces for Arrow-based database
> access and have been working on implementations of the proposed interfaces
> [2], all under the name "ADBC". This proposal aims to provide a unified
> client abstraction across Arrow-native database protocols (like Flight SQL)
> and non-Arrow database protocols, which can then be used by Arrow projects
> like Dataset/Acero and ecosystem projects like Ibis.
> >
> > For details, see the RFC here:
> https://github.com/apache/arrow/pull/14079
> >
> > I would like to propose that the Arrow project adopt this RFC, along
> with apache/arrow-adbc commit 7866a56 [3], as version 1.0.0 of the ADBC API
> standard.
> >
> > Please vote to adopt the specification as described above. (This is not
> a vote to release any components.)
> >
> > This vote will be open for at least 72 hours.
> >
> > [ ] +1 Adopt the ADBC specification
> > [ ]  0
> > [ ] -1 Do not adopt the specification because...
> >
> > Thanks to the DuckDB and R DBI projects for providing feedback on and
> implementations of the proposal.
> >
> > [1]: https://lists.apache.org/thread/cq7t9s5p7dw4vschylhwsfgqwkr5fmf2
> > [2]: https://github.com/apache/arrow-adbc
> > [3]:
> https://github.com/apache/arrow-adbc/commit/7866a566f5b7b635267bfb7a87ea49b01dfe89fa
> >
> > Thank you,
> > David
>
