It was pointed out to me that by using an old thread, people may not have realized there's actually a discussion here. So this is just one final call for comments on a proposal to add support for Substrait [1] to Flight SQL: https://github.com/apache/arrow/pull/13492
The proposal also adds support for explicit transactions with a view towards making it easier to support interoperability with standards like JDBC and ODBC. There are implementations for C++ and Java, as well as integration tests. So assuming no further comments, I plan to start a vote on Monday.

[1]: https://substrait.io/

On Thu, Aug 18, 2022, at 14:01, David Li wrote:
> I've updated the PR [1] and I believe everything is resolved. (I've
> fixed ARROW-17254, and changed the Protobuf definition to work around
> Protobuf's issues.) If there's no further comments, I'll start a vote
> in the coming days.
>
> [1]: https://github.com/apache/arrow/pull/13492
>
> Thanks,
> David
>
> On Fri, Aug 5, 2022, at 14:54, David Li wrote:
>> I've added implementations for Java and C++ to the draft [1], including
>> integration tests, after addressing comments on the proposal itself
>> (thanks all for the comments).
>>
>> One thing is, I might suggest punting on CancelQuery for now, or
>> changing how it's implemented, since embedding a message from
>> Flight.proto into FlightSql.proto interacts badly with Windows/DLLs
>> (protoc has poor support for embedding dllimport/dllexport macros).
>>
>> Otherwise I think things are ready, though we'll want to fix
>> ARROW-17254 [2] alongside it.
>>
>> [1]: https://github.com/apache/arrow/pull/13492
>> [2]: https://issues.apache.org/jira/browse/ARROW-17254
>>
>> On Fri, Jul 1, 2022, at 14:34, David Li wrote:
>>> I quickly drafted these out (sans implementation so far):
>>> https://github.com/apache/arrow/pull/13492
>>>
>>> On Thu, Jun 30, 2022, at 21:20, David Li wrote:
>>>> Ah - somehow I didn't think of that. Yes, we should just implement it
>>>> in the same way prepared statements are already implemented.
>>>>
>>>> On Thu, Jun 30, 2022, at 19:42, Micah Kornfield wrote:
>>>>>>
>>>>>> It would also then be good to make explicit the statefulness of
>>>>>> connections in Flight SQL. While that is sort of an obvious constraint,
>>>>>> it is at odds with how gRPC is usually used (especially in the presence
>>>>>> of load balancing).
>>>>>
>>>>>
>>>>> I'm not sure I understand where the statefulness requirements come in?
>>>>> Could you elaborate? It seems that a transaction could be an opaque ID on
>>>>> operations?
>>>>>
>>>>> On Thu, Jun 30, 2022 at 2:47 PM James Duong <jam...@bitquilltech.com.invalid> wrote:
>>>>>
>>>>>> This is a bit of a tangent from the original discussion about
>>>>>> Substrait integration.
>>>>>>
>>>>>> Flight SQL would definitely benefit from transaction RPC commands for
>>>>>> building bridge drivers. I'm also wondering if there should be an RPC
>>>>>> call to cancel a running query, as opposed to just having the client
>>>>>> terminate streams. This would allow a multi-process application to
>>>>>> cancel work across processes.
>>>>>>
>>>>>> On Thu, Jun 30, 2022 at 1:35 PM David Li <lidav...@apache.org> wrote:
>>>>>>
>>>>>> > Reviving this discussion: would people be interested in seeing a
>>>>>> > sketched-out CommandSubstraitQuery et. al.?
>>>>>> >
>>>>>> > Additionally, while working on ADBC, I realized: does Flight SQL need
>>>>>> > explicit Commit/Rollback commands? This would presumably be necessary
>>>>>> > if we want to build ODBC/JDBC drivers on top, since those standards
>>>>>> > have explicit commands, and Flight SQL doesn't have the luxury of a
>>>>>> > driver to issue database-specific SQL to implement these.
>>>>>> >
>>>>>> > It would also then be good to make explicit the statefulness of
>>>>>> > connections in Flight SQL. While that is sort of an obvious constraint,
>>>>>> > it is at odds with how gRPC is usually used (especially in the presence
>>>>>> > of load balancing).
>>>>>> >
>>>>>> > On Sun, Mar 6, 2022, at 14:44, Gavin Ray wrote:
>>>>>> > > Got it, thank you David!
>>>>>> > > I started prototyping the implementation last night, hopefully I will
>>>>>> > > make some good progress and have something basic functioning soon.
>>>>>> > >
>>>>>> > > RE: The metadata thing -- I think both Calcite and Teiid have solid
>>>>>> > > interfaces for defining what capabilities a datasource has.
>>>>>> > >
>>>>>> > > https://github.com/teiid/teiid/blob/8e9057a46be009d68b2d67701781f1f8c175baa7/api/src/main/java/org/teiid/translator/ExecutionFactory.java#L349-L1528
>>>>>> > >
>>>>>> > > It's probably not possible to make something universal, but it seems
>>>>>> > > like you could get pretty close to most common functionality/capabilities
>>>>>> > >
>>>>>> > >
>>>>>> > > On Sat, Mar 5, 2022 at 11:48 PM Kyle Porter <ky...@bitquilltech.com.invalid> wrote:
>>>>>> > >
>>>>>> > >> Yes, we should, where possible, avoid any one of metadata. This is
>>>>>> > >> where other standards fail in that applications must be custom built
>>>>>> > >> for each data source, if we standardize the metadata then applications
>>>>>> > >> can at least be built to adapt.
>>>>>> > >>
>>>>>> > >> On Sat., Mar. 5, 2022, 6:54 p.m. David Li, <lidav...@apache.org> wrote:
>>>>>> > >>
>>>>>> > >> > Yes, GetSqlInfo reserves a range of metadata IDs for Flight SQL's
>>>>>> > >> > use, so the application can use others for its own purposes. That
>>>>>> > >> > said if they seem commonly applicable maybe we should try to
>>>>>> > >> > standardize them.
>>>>>> > >> >
>>>>>> > >> > I think what you are doing should be reasonable. You may not need
>>>>>> > >> > _all_ of the capabilities in Flight SQL for this (e.g. all the
>>>>>> > >> > various metadata calls, or prepared statements, perhaps) but I
>>>>>> > >> > don't see why it wouldn't work for you.
>>>>>> > >> >
>>>>>> > >> > On Fri, Mar 4, 2022, at 19:03, Gavin Ray wrote:
>>>>>> > >> > > To touch on the question about supported features -- is it
>>>>>> > >> > > possible to advertise arbitrary/custom "capabilities" in GetSqlInfo?
>>>>>> > >> > > Say that you want to represent some set of behaviors that
>>>>>> > >> > > FlightSQL services can support.
>>>>>> > >> > >
>>>>>> > >> > > Stuff like "Supports grouping by multiple distinct aggregates",
>>>>>> > >> > > "Supports self-joins on aliased tables" etc
>>>>>> > >> > > This is going to be unique to each implementation, but I couldn't
>>>>>> > >> > > determine whether there was a way to express arbitrary capabilities
>>>>>> > >> > >
>>>>>> > >> > > Also, in case it's helpful I put together an ASCII diagram of
>>>>>> > >> > > what I'm trying to do with FlightSQL
>>>>>> > >> > > If anyone has a moment, would appreciate input on whether it's
>>>>>> > >> > > feasible/a good idea
>>>>>> > >> > >
>>>>>> > >> > > https://pastebin.com/raw/VF2r0F3f
>>>>>> > >> > >
>>>>>> > >> > > Thank you =)
>>>>>> > >> > >
>>>>>> > >> > >
>>>>>> > >> > > On Fri, Mar 4, 2022 at 2:37 PM David Li <lidav...@apache.org> wrote:
>>>>>> > >> > >
>>>>>> > >> > >> We could also add say CommandSubstraitQuery as a distinct
>>>>>> > >> > >> message, and older servers would just reject it as an unknown
>>>>>> > >> > >> request type.
>>>>>> > >> > >>
>>>>>> > >> > >> -David
>>>>>> > >> > >>
>>>>>> > >> > >> On Fri, Mar 4, 2022, at 17:01, Micah Kornfield wrote:
>>>>>> > >> > >> >>
>>>>>> > >> > >> >> 1. How does a server report that it supports each command
>>>>>> > >> > >> >> type? Initial thought is a property in GetSqlInfo.
>>>>>> > >> > >> >
>>>>>> > >> > >> >
>>>>>> > >> > >> > This sounds reasonable.
>>>>>> > >> > >> >
>>>>>> > >> > >> >
>>>>>> > >> > >> >> What happens to client code written prior to changing the
>>>>>> > >> > >> >> command type to be a oneOf field? Same for servers.
>>>>>> > >> > >> >
>>>>>> > >> > >> >
>>>>>> > >> > >> > It is transparent from older clients (I'm 99% sure the wire
>>>>>> > >> > >> > protocol doesn't change). Servers is a little harder. The one
>>>>>> > >> > >> > saving grace is I don't think an empty/not-present SQL string
>>>>>> > >> > >> > would be something most servers could handle, so they would
>>>>>> > >> > >> > probably error with something that while not-obvious would
>>>>>> > >> > >> > give a clue to the clients (but hopefully this would be a
>>>>>> > >> > >> > non-issue because the capabilities would be checked for
>>>>>> > >> > >> > clients wishing to use this feature first).
>>>>>> > >> > >> >
>>>>>> > >> > >> > -Micah
>>>>>> > >> > >> >
>>>>>> > >> > >> > On Fri, Mar 4, 2022 at 1:50 PM James Duong <jam...@bitquilltech.com.invalid> wrote:
>>>>>> > >> > >> >
>>>>>> > >> > >> >> It sounds like an interesting and useful project to use
>>>>>> > >> > >> >> Substrait as an alternative to SQL strings.
>>>>>> > >> > >> >>
>>>>>> > >> > >> >> Important aspects to spec out are:
>>>>>> > >> > >> >> 1. How does a server report that it supports each command
>>>>>> > >> > >> >> type? Initial thought is a property in GetSqlInfo.
>>>>>> > >> > >> >> 2. What happens to client code written prior to changing the
>>>>>> > >> > >> >> command type to be a oneOf field? Same for servers.
>>>>>> > >> > >> >> More generally, how should backward compatibility work, and
>>>>>> > >> > >> >> what should happen if a client sends an unsupported
>>>>>> > >> > >> >> command type to a server.
>>>>>> > >> > >> >> 3. Should inputs to catalog RPC calls also accept Substrait
>>>>>> > >> > >> >> structures?
>>>>>> > >> > >> >>
>>>>>> > >> > >> >> On Thu, Mar 3, 2022 at 11:00 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>>>>>> > >> > >> >>
>>>>>> > >> > >> >> > @James Duong <jam...@bitquilltech.com>
>>>>>> > >> > >> >> >
>>>>>> > >> > >> >> > You are absolutely right, I realized this and confirmed
>>>>>> > >> > >> >> > whether this would be possible with Jacques to double-check.
>>>>>> > >> > >> >> > It would amount to what I might call "dollar-store
>>>>>> > >> > >> >> > Substrait." It's not elegant or a good solution, but
>>>>>> > >> > >> >> > definitely presents a good duct-tape hack and is a crafty idea.
>>>>>> > >> > >> >> >
>>>>>> > >> > >> >> > I agree with Jacques -- when you think about FlightSQL,
>>>>>> > >> > >> >> > what you are attempting with a query isn't necessarily SQL,
>>>>>> > >> > >> >> > but a general data-compute operation.
>>>>>> > >> > >> >> > SQL just so happens to be a fairly universal way to express
>>>>>> > >> > >> >> > them, with an ANSI standard, but FlightSQL doesn't recognize
>>>>>> > >> > >> >> > any particular subset of it and for all intents and purposes
>>>>>> > >> > >> >> > it doesn't matter what the operation string contains.
>>>>>> > >> > >> >> > >>>>>> > >> > >> >> > Substrait would make a fantastic logical next-feature >>>>>> because >>>>>> > >> it's >>>>>> > >> > >> >> > targeted as a specification for expressing relational >>>>>> algebra >>>>>> > and >>>>>> > >> > >> >> > data-compute operations >>>>>> > >> > >> >> > This more-or-less equates to SQL strings (in my mind at >>>>>> least) >>>>>> > >> > with a >>>>>> > >> > >> >> much >>>>>> > >> > >> >> > better toolkit and Dev UX. If there is anything I can do >>>>>> > >> > >> >> > to >>>>>> > help >>>>>> > >> > move >>>>>> > >> > >> >> this >>>>>> > >> > >> >> > forward, please let me know because I am extremely >>>>>> > >> > >> >> > motivated >>>>>> > to >>>>>> > >> do >>>>>> > >> > so. >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > @David Li <git...@lidavidm.me> >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > Also agreed. Substrait is put together by folks much >>>>>> > >> > >> >> > smarter >>>>>> > than >>>>>> > >> > >> myself, >>>>>> > >> > >> >> > and if I had to hedge my bets, I'd put money on it being >>>>>> > >> > >> >> > the >>>>>> > >> > future of >>>>>> > >> > >> >> > data-compute interop. >>>>>> > >> > >> >> > I would love nothing more than to adopt this technology >>>>>> > >> > >> >> > and >>>>>> > push >>>>>> > >> it >>>>>> > >> > >> >> along. >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > Your project does sound interesting - basically, it sounds >>>>>> > like a >>>>>> > >> > >> tabular >>>>>> > >> > >> >> >> data storage service with query pushdown? 
>>>>>> > >> > >> >> >> >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > Yeah this is more or less the details of it (my personal >>>>>> > email, >>>>>> > >> > with >>>>>> > >> > >> >> > discretion assumed, is always open) >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > Imagine an environment where a backend wants to advertise >>>>>> some >>>>>> > >> > kind of >>>>>> > >> > >> >> > schema/data catalog >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > And then a central service introspects these backends, and >>>>>> > >> > dynamically >>>>>> > >> > >> >> > generates an API from the data catalogues/schemas, where >>>>>> > requests >>>>>> > >> > get >>>>>> > >> > >> >> > proxied to the underlying backend service for each schema >>>>>> > >> > >> >> > to >>>>>> > >> > actually >>>>>> > >> > >> be >>>>>> > >> > >> >> > executed >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > In text, the flow would look something like: >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > <----> Data Provider Backend 0 >>>>>> > >> > >> >> > Client <-----> Central Service <---> Generated API <----> >>>>>> > >> > >> Data-Provider >>>>>> > >> > >> >> > Backend 1 >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > <----> Data Provider Backend 2 >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > >>>>>> > >> > >> >> > On Thu, Mar 3, 2022 at 5:52 PM David Li < >>>>>> lidav...@apache.org> >>>>>> > >> > wrote: >>>>>> > >> > >> >> > >>>>>> > >> > >> >> >> Gavin, thanks for sharing. I'm not so sure you'll find an >>>>>> > >> > >> alternative to >>>>>> > >> > >> >> >> Substrait, at least one that isn't even more nascent or >>>>>> > >> > >> >> >> one >>>>>> > >> that's >>>>>> > >> > >> very >>>>>> > >> > >> >> >> tied to a particular language, so perhaps it might be >>>>>> better >>>>>> > to >>>>>> > >> > get >>>>>> > >> > >> >> >> involved in Substrait and see if it suits your needs? 
>>>>>> > >> Convincing a >>>>>> > >> > >> team >>>>>> > >> > >> >> to >>>>>> > >> > >> >> >> try something new can be hard, though, and it is somewhat >>>>>> of >>>>>> > a >>>>>> > >> > moving >>>>>> > >> > >> >> >> target - but Flight SQL is in a similar spot, I think, as >>>>>> > it's >>>>>> > >> > still >>>>>> > >> > >> >> >> getting enhancements. >>>>>> > >> > >> >> >> >>>>>> > >> > >> >> >> Your project does sound interesting - basically, it >>>>>> > >> > >> >> >> sounds >>>>>> > like >>>>>> > >> a >>>>>> > >> > >> >> tabular >>>>>> > >> > >> >> >> data storage service with query pushdown? >>>>>> > >> > >> >> >> >>>>>> > >> > >> >> >> On Thu, Mar 3, 2022, at 19:58, Jacques Nadeau wrote: >>>>>> > >> > >> >> >> > James, I agree that you could use JSON but that feels a >>>>>> bit >>>>>> > >> > hacky >>>>>> > >> > >> >> >> > (mis-use >>>>>> > >> > >> >> >> > of the paradigm). Instead, I'd really like to do >>>>>> something >>>>>> > >> like >>>>>> > >> > >> David >>>>>> > >> > >> >> is >>>>>> > >> > >> >> >> > suggesting: support Substrait as an alternative to a >>>>>> > >> > >> >> >> > SQL >>>>>> > >> string. >>>>>> > >> > >> >> >> > Something like this: >>>>>> > >> > >> >> >> > >>>>>> > >> > >> >> >> >>>>>> > >> > >> >> >>>>>> > >> > >> >>>>>> > >> > >>>>>> > >> >>>>>> > >>>>>> https://github.com/jacques-n/arrow/commit/e22674fa882e77c2889cf95f69f6e3701db362bc >>>>>> > >> > >> >> >> > >>>>>> > >> > >> >> >> > It would be great if someone wanted to pick this up. It >>>>>> > would >>>>>> > >> > be a >>>>>> > >> > >> >> nice >>>>>> > >> > >> >> >> > enhancement to FlightSQL (and provide a structured way >>>>>> > >> > >> >> >> > to >>>>>> > >> > express >>>>>> > >> > >> >> >> > operations). 
>>>>>> > >> > >> >> >> > >>>>>> > >> > >> >> >> > >>>>>> > >> > >> >> >> > >>>>>> > >> > >> >> >> > On Thu, Mar 3, 2022 at 4:56 PM James Duong < >>>>>> > >> > >> jam...@bitquilltech.com >>>>>> > >> > >> >> >> .invalid> >>>>>> > >> > >> >> >> > wrote: >>>>>> > >> > >> >> >> > >>>>>> > >> > >> >> >> >> In the same way that you could write an ODBC driver >>>>>> > >> > >> >> >> >> that >>>>>> > >> takes >>>>>> > >> > in >>>>>> > >> > >> >> text >>>>>> > >> > >> >> >> >> that's not SQL, you could write a Flight SQL server >>>>>> > >> > >> >> >> >> that >>>>>> > >> takes >>>>>> > >> > in >>>>>> > >> > >> >> text >>>>>> > >> > >> >> >> >> that's JSON. >>>>>> > >> > >> >> >> >> Flight SQL doesn't parse the query, so you could >>>>>> > >> > >> >> >> >> create >>>>>> > >> > commands >>>>>> > >> > >> that >>>>>> > >> > >> >> >> are >>>>>> > >> > >> >> >> >> just JSON text. >>>>>> > >> > >> >> >> >> >>>>>> > >> > >> >> >> >> Is that the only bit you need, Gavin? >>>>>> > >> > >> >> >> >> >>>>>> > >> > >> >> >> >> On Thu, Mar 3, 2022 at 4:26 PM Gavin Ray < >>>>>> > >> > ray.gavi...@gmail.com> >>>>>> > >> > >> >> >> wrote: >>>>>> > >> > >> >> >> >> >>>>>> > >> > >> >> >> >> > I am enthusiastic about Substrait and have followed >>>>>> it's >>>>>> > >> > >> progress >>>>>> > >> > >> >> >> eagerly >>>>>> > >> > >> >> >> >> > =D >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > When I presented it as a tentative option, there >>>>>> > >> > >> >> >> >> > were >>>>>> > >> > >> reservations >>>>>> > >> > >> >> >> >> because >>>>>> > >> > >> >> >> >> > of the project/spec being young and the >>>>>> > >> > >> >> >> >> > functionality >>>>>> > still >>>>>> > >> > >> being >>>>>> > >> > >> >> >> >> > fleshed out. >>>>>> > >> > >> >> >> >> > I think if I were having this conversation in say, >>>>>> 8-16 >>>>>> > >> > months, >>>>>> > >> > >> it >>>>>> > >> > >> >> >> would >>>>>> > >> > >> >> >> >> > have been an easy choice, no doubt. 
>>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > On a public mailing list (and I can share more >>>>>> > >> > >> >> >> >> > details >>>>>> > in >>>>>> > >> > >> private >>>>>> > >> > >> >> if >>>>>> > >> > >> >> >> >> you're >>>>>> > >> > >> >> >> >> > curious), the gist of it is this: >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > Some well-defined/backed-by-mature tech solution for >>>>>> > >> > expressing >>>>>> > >> > >> >> data >>>>>> > >> > >> >> >> >> > compute operations between services would be a >>>>>> > >> > >> >> >> >> > useful >>>>>> > thing >>>>>> > >> > to >>>>>> > >> > >> have >>>>>> > >> > >> >> >> >> > (Especially if it's language-agnostic) >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > The goal is for an "implementing service" to have: >>>>>> > >> > >> >> >> >> > - An introspectable schema (IE, "describe yourself >>>>>> > >> > >> >> >> >> > to >>>>>> > me") >>>>>> > >> > >> >> >> >> > - A query/operation execution endpoint (IE: "perform >>>>>> > this >>>>>> > >> > >> operation >>>>>> > >> > >> >> >> on >>>>>> > >> > >> >> >> >> your >>>>>> > >> > >> >> >> >> > data") >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > With FlightSQL this is possible I believe, but it >>>>>> > requires >>>>>> > >> > the >>>>>> > >> > >> >> >> operation >>>>>> > >> > >> >> >> >> to >>>>>> > >> > >> >> >> >> > be expressed as a SQL string which isn't ideal. >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > Working with some programmatic, structured object >>>>>> > >> > >> >> >> >> > that >>>>>> > has >>>>>> > >> > the >>>>>> > >> > >> same >>>>>> > >> > >> >> >> >> > semantics ("Logical Plan", or whatnot) as a SQL >>>>>> > >> > >> >> >> >> > query >>>>>> > would >>>>>> > >> > >> have, >>>>>> > >> > >> >> >> would >>>>>> > >> > >> >> >> >> be >>>>>> > >> > >> >> >> >> > a better experience >>>>>> > >> > >> >> >> >> > (Jacques is on to something here!) 
>>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > This interface between services would be somewhat >>>>>> > >> > >> >> >> >> > the >>>>>> > >> > >> equivalent of >>>>>> > >> > >> >> >> an >>>>>> > >> > >> >> >> >> > "SDK", so it would be nice to have a strongly-typed >>>>>> > library >>>>>> > >> > for >>>>>> > >> > >> >> >> >> expressing >>>>>> > >> > >> >> >> >> > and building-up query/data-compute ops. >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > On Thu, Mar 3, 2022 at 3:17 PM David Li < >>>>>> > >> lidav...@apache.org >>>>>> > >> > > >>>>>> > >> > >> >> wrote: >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> > > You probably want Substrait: https://substrait.io/ >>>>>> > >> > >> >> >> >> > > >>>>>> > >> > >> >> >> >> > > Which is being worked on by several people, >>>>>> including >>>>>> > >> Arrow >>>>>> > >> > >> >> >> community >>>>>> > >> > >> >> >> >> > > members. >>>>>> > >> > >> >> >> >> > > >>>>>> > >> > >> >> >> >> > > It might be interesting to generalize Flight SQL >>>>>> > >> > >> >> >> >> > > to >>>>>> > >> include >>>>>> > >> > >> >> >> support for >>>>>> > >> > >> >> >> >> > > Substrait. I'm curious what your application, if >>>>>> > you're >>>>>> > >> > able >>>>>> > >> > >> to >>>>>> > >> > >> >> >> share >>>>>> > >> > >> >> >> >> > more. >>>>>> > >> > >> >> >> >> > > >>>>>> > >> > >> >> >> >> > > -David >>>>>> > >> > >> >> >> >> > > >>>>>> > >> > >> >> >> >> > > On Thu, Mar 3, 2022, at 18:05, Gavin Ray wrote: >>>>>> > >> > >> >> >> >> > > > Hiya, >>>>>> > >> > >> >> >> >> > > > >>>>>> > >> > >> >> >> >> > > > I am drafting a proposal for a way to enable >>>>>> > services >>>>>> > >> to >>>>>> > >> > >> >> express >>>>>> > >> > >> >> >> data >>>>>> > >> > >> >> >> >> > > > compute operations to each other. 
>>>>>> > >> > >> >> >> >> > > > >>>>>> > >> > >> >> >> >> > > > However I think it'll be difficult to get buy-in >>>>>> if >>>>>> > the >>>>>> > >> > only >>>>>> > >> > >> >> >> >> > > representation >>>>>> > >> > >> >> >> >> > > > for queries is as SQL strings. >>>>>> > >> > >> >> >> >> > > > >>>>>> > >> > >> >> >> >> > > > Is there any kind of lower-level API that can be >>>>>> > used >>>>>> > >> to >>>>>> > >> > >> >> express >>>>>> > >> > >> >> >> >> > > operations? >>>>>> > >> > >> >> >> >> > > > >>>>>> > >> > >> >> >> >> > > > IE instead of "SELECT name FROM user" >>>>>> > >> > >> >> >> >> > > > >>>>>> > >> > >> >> >> >> > > > A structured representation like: >>>>>> > >> > >> >> >> >> > > > { >>>>>> > >> > >> >> >> >> > > > "op": "query", >>>>>> > >> > >> >> >> >> > > > "schema": "user", >>>>>> > >> > >> >> >> >> > > > "project": ["name"] >>>>>> > >> > >> >> >> >> > > > } >>>>>> > >> > >> >> >> >> > > > >>>>>> > >> > >> >> >> >> > > > Or maybe this is a bad idea/doesn't make sense? >>>>>> > >> > >> >> >> >> > > > >>>>>> > >> > >> >> >> >> > > > Thank you =) >>>>>> > >> > >> >> >> >> > > >>>>>> > >> > >> >> >> >> > >>>>>> > >> > >> >> >> >> >>>>>> > >> > >> >> >> >> >>>>>> > >> > >> >> >> >> -- >>>>>> > >> > >> >> >> >> >>>>>> > >> > >> >> >> >> *James Duong* >>>>>> > >> > >> >> >> >> Lead Software Developer >>>>>> > >> > >> >> >> >> Bit Quill Technologies Inc. >>>>>> > >> > >> >> >> >> Direct: +1.604.562.6082 | jam...@bitquilltech.com >>>>>> > >> > >> >> >> >> https://www.bitquilltech.com >>>>>> > >> > >> >> >> >> >>>>>> > >> > >> >> >> >> This email message is for the sole use of the intended >>>>>> > >> > >> recipient(s) >>>>>> > >> > >> >> >> and may >>>>>> > >> > >> >> >> >> contain confidential and privileged information. Any >>>>>> > >> > unauthorized >>>>>> > >> > >> >> >> review, >>>>>> > >> > >> >> >> >> use, disclosure, or distribution is prohibited. 
If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Thank you.
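
For anyone skimming the thread, the capability-check idea discussed above (a server advertises support through something like GetSqlInfo, the client checks it first, and a server that does not understand a Substrait command rejects it as an unknown request type) can be sketched roughly as below. This is only a toy in-memory model under stated assumptions, not the Flight SQL API: `ToyServer`, `SqlCommand`, `SubstraitCommand`, `run`, and the `INFO_SUPPORTS_SUBSTRAIT` ID are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union

# Hypothetical info ID, invented for this sketch; not a real SqlInfo value.
INFO_SUPPORTS_SUBSTRAIT = 10_000


@dataclass(frozen=True)
class SqlCommand:
    """A query expressed as a SQL string."""
    query: str


@dataclass(frozen=True)
class SubstraitCommand:
    """A query expressed as a serialized plan; the bytes are opaque here."""
    plan: bytes


Command = Union[SqlCommand, SubstraitCommand]


class ToyServer:
    """Stand-in for a server; the constructor flag models old vs. new servers."""

    def __init__(self, supports_substrait: bool) -> None:
        self._supports_substrait = supports_substrait

    def get_info(self) -> Dict[int, bool]:
        # Advertise capabilities so clients can check before sending a plan.
        return {INFO_SUPPORTS_SUBSTRAIT: self._supports_substrait}

    def execute(self, command: Command) -> str:
        if isinstance(command, SqlCommand):
            return f"ran SQL: {command.query}"
        if isinstance(command, SubstraitCommand) and self._supports_substrait:
            return f"ran plan of {len(command.plan)} bytes"
        # An older server simply does not recognize the newer command type.
        raise ValueError("unknown request type")


def run(server: ToyServer, sql: str, plan: bytes) -> str:
    """Prefer the structured plan when advertised, otherwise fall back to SQL."""
    if server.get_info().get(INFO_SUPPORTS_SUBSTRAIT, False):
        return server.execute(SubstraitCommand(plan))
    return server.execute(SqlCommand(sql))
```

With this shape, a client that checks the advertised capability first never hits the error path; the `ValueError` only shows up if a client skips the check and sends a plan to a server that predates the new command.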