> 1. How does a server report that it supports each command type? Initial
> thought is a property in GetSqlInfo.

This sounds reasonable.

> What happens to client code written prior to changing the command type
> to be a oneOf field? Same for servers.

It is transparent to older clients (I'm 99% sure the wire protocol doesn't
change). Servers are a little harder. The one saving grace is that I don't
think most servers could handle an empty/not-present SQL string, so they
would probably fail with an error that, while not obvious, would give the
client a clue. Hopefully this would be a non-issue, because clients wishing
to use this feature would check the server's capabilities first.

-Micah

On Fri, Mar 4, 2022 at 1:50 PM James Duong <jam...@bitquilltech.com.invalid>
wrote:

> It sounds like an interesting and useful project to use Substrait as an
> alternative to SQL strings.
>
> Important aspects to spec out are:
> 1. How does a server report that it supports each command type? Initial
>    thought is a property in GetSqlInfo.
> 2. What happens to client code written prior to changing the command type
>    to be a oneOf field? Same for servers.
>    More generally, how should backward compatibility work, and what should
>    happen if a client sends an unsupported command type to a server?
> 3. Should inputs to catalog RPC calls also accept Substrait structures?
>
> On Thu, Mar 3, 2022 at 11:00 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>
> > @James Duong <jam...@bitquilltech.com>
> >
> > You are absolutely right, I realized this and confirmed whether this
> > would be possible with Jacques to double-check.
> > It would amount to what I might call "dollar-store Substrait." It's not
> > elegant or a good solution, but definitely presents a good duct-tape hack
> > and is a crafty idea.
> >
> > I agree with Jacques -- when you think about FlightSQL, what you are
> > attempting with a query isn't necessarily SQL, but a general data-compute
> > operation.
> > SQL just so happens to be a fairly universal way to express them, with an
> > ANSI standard, but FlightSQL doesn't recognize any particular subset of it
> > and for all intents and purposes it doesn't matter what the operation
> > string contains.
> >
> > Substrait would make a fantastic logical next-feature because it's
> > targeted as a specification for expressing relational algebra and
> > data-compute operations.
> > This more-or-less equates to SQL strings (in my mind at least) with a much
> > better toolkit and Dev UX. If there is anything I can do to help move this
> > forward, please let me know because I am extremely motivated to do so.
> >
> > @David Li <git...@lidavidm.me>
> >
> > Also agreed. Substrait is put together by folks much smarter than myself,
> > and if I had to hedge my bets, I'd put money on it being the future of
> > data-compute interop.
> > I would love nothing more than to adopt this technology and push it along.
> >
> >> Your project does sound interesting - basically, it sounds like a tabular
> >> data storage service with query pushdown?
> >
> > Yeah this is more or less the details of it (my personal email, with
> > discretion assumed, is always open)
> >
> > Imagine an environment where a backend wants to advertise some kind of
> > schema/data catalog
> >
> > And then a central service introspects these backends, and dynamically
> > generates an API from the data catalogues/schemas, where requests get
> > proxied to the underlying backend service for each schema to actually be
> > executed
> >
> > In text, the flow would look something like:
> >
> >                                                    <----> Data Provider Backend 0
> > Client <-----> Central Service <---> Generated API <----> Data-Provider Backend 1
> >                                                    <----> Data Provider Backend 2
> >
> > On Thu, Mar 3, 2022 at 5:52 PM David Li <lidav...@apache.org> wrote:
> >
> >> Gavin, thanks for sharing.
> >> I'm not so sure you'll find an alternative to
> >> Substrait, at least one that isn't even more nascent or one that's very
> >> tied to a particular language, so perhaps it might be better to get
> >> involved in Substrait and see if it suits your needs? Convincing a team to
> >> try something new can be hard, though, and it is somewhat of a moving
> >> target - but Flight SQL is in a similar spot, I think, as it's still
> >> getting enhancements.
> >>
> >> Your project does sound interesting - basically, it sounds like a tabular
> >> data storage service with query pushdown?
> >>
> >> On Thu, Mar 3, 2022, at 19:58, Jacques Nadeau wrote:
> >> > James, I agree that you could use JSON but that feels a bit hacky
> >> > (mis-use of the paradigm). Instead, I'd really like to do something
> >> > like David is suggesting: support Substrait as an alternative to a SQL
> >> > string. Something like this:
> >> >
> >> > https://github.com/jacques-n/arrow/commit/e22674fa882e77c2889cf95f69f6e3701db362bc
> >> >
> >> > It would be great if someone wanted to pick this up. It would be a nice
> >> > enhancement to FlightSQL (and provide a structured way to express
> >> > operations).
> >> >
> >> > On Thu, Mar 3, 2022 at 4:56 PM James Duong
> >> > <jam...@bitquilltech.com.invalid> wrote:
> >> >
> >> >> In the same way that you could write an ODBC driver that takes in text
> >> >> that's not SQL, you could write a Flight SQL server that takes in text
> >> >> that's JSON.
> >> >> Flight SQL doesn't parse the query, so you could create commands that
> >> >> are just JSON text.
> >> >>
> >> >> Is that the only bit you need, Gavin?
> >> >>
> >> >> On Thu, Mar 3, 2022 at 4:26 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
> >> >>
> >> >> > I am enthusiastic about Substrait and have followed its progress
> >> >> > eagerly =D
> >> >> >
> >> >> > When I presented it as a tentative option, there were reservations
> >> >> > because of the project/spec being young and the functionality still
> >> >> > being fleshed out.
> >> >> > I think if I were having this conversation in say, 8-16 months, it
> >> >> > would have been an easy choice, no doubt.
> >> >> >
> >> >> > On a public mailing list (and I can share more details in private if
> >> >> > you're curious), the gist of it is this:
> >> >> >
> >> >> > Some well-defined/backed-by-mature tech solution for expressing data
> >> >> > compute operations between services would be a useful thing to have
> >> >> > (Especially if it's language-agnostic)
> >> >> >
> >> >> > The goal is for an "implementing service" to have:
> >> >> > - An introspectable schema (IE, "describe yourself to me")
> >> >> > - A query/operation execution endpoint (IE: "perform this operation
> >> >> >   on your data")
> >> >> >
> >> >> > With FlightSQL this is possible I believe, but it requires the
> >> >> > operation to be expressed as a SQL string which isn't ideal.
> >> >> >
> >> >> > Working with some programmatic, structured object that has the same
> >> >> > semantics ("Logical Plan", or whatnot) as a SQL query would have,
> >> >> > would be a better experience
> >> >> > (Jacques is on to something here!)
> >> >> >
> >> >> > This interface between services would be somewhat the equivalent of
> >> >> > an "SDK", so it would be nice to have a strongly-typed library for
> >> >> > expressing and building-up query/data-compute ops.
> >> >> >
> >> >> > On Thu, Mar 3, 2022 at 3:17 PM David Li <lidav...@apache.org> wrote:
> >> >> >
> >> >> > > You probably want Substrait: https://substrait.io/
> >> >> > >
> >> >> > > Which is being worked on by several people, including Arrow
> >> >> > > community members.
> >> >> > >
> >> >> > > It might be interesting to generalize Flight SQL to include
> >> >> > > support for Substrait. I'm curious what your application is, if
> >> >> > > you're able to share more.
> >> >> > >
> >> >> > > -David
> >> >> > >
> >> >> > > On Thu, Mar 3, 2022, at 18:05, Gavin Ray wrote:
> >> >> > > > Hiya,
> >> >> > > >
> >> >> > > > I am drafting a proposal for a way to enable services to express
> >> >> > > > data compute operations to each other.
> >> >> > > >
> >> >> > > > However I think it'll be difficult to get buy-in if the only
> >> >> > > > representation for queries is as SQL strings.
> >> >> > > >
> >> >> > > > Is there any kind of lower-level API that can be used to express
> >> >> > > > operations?
> >> >> > > >
> >> >> > > > IE instead of "SELECT name FROM user"
> >> >> > > >
> >> >> > > > A structured representation like:
> >> >> > > > {
> >> >> > > >   "op": "query",
> >> >> > > >   "schema": "user",
> >> >> > > >   "project": ["name"]
> >> >> > > > }
> >> >> > > >
> >> >> > > > Or maybe this is a bad idea/doesn't make sense?
> >> >> > > >
> >> >> > > > Thank you =)
> >> >>
> >> >> --
> >> >>
> >> >> *James Duong*
> >> >> Lead Software Developer
> >> >> Bit Quill Technologies Inc.
> >> >> Direct: +1.604.562.6082 | jam...@bitquilltech.com
> >> >> https://www.bitquilltech.com
> >> >>
> >> >> This email message is for the sole use of the intended recipient(s)
> >> >> and may contain confidential and privileged information. Any
> >> >> unauthorized review, use, disclosure, or distribution is prohibited.
> >> >> If you are not the intended recipient, please contact the sender by
> >> >> reply email and destroy all copies of the original message. Thank you.
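
For readers following along, the capability check and fallback discussed at the top of this thread could look roughly like the sketch below. It is plain Python with no Arrow dependency, and it only models the idea: `SUPPORTS_SUBSTRAIT`, `Command`, `LegacyServer`, `SubstraitServer`, and `run` are all hypothetical names made up for illustration, not part of Flight SQL. The thread only proposes "a property in GetSqlInfo", so the key used here is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical capability key; the thread only suggests "a property in GetSqlInfo".
SUPPORTS_SUBSTRAIT = "supports_substrait"


@dataclass(frozen=True)
class Command:
    """Emulates a oneOf: exactly one of `query` or `substrait_plan` is set."""
    query: Optional[str] = None
    substrait_plan: Optional[bytes] = None

    def __post_init__(self):
        # Reject "both set" and "neither set".
        if (self.query is None) == (self.substrait_plan is None):
            raise ValueError("set exactly one of query or substrait_plan")


class LegacyServer:
    """Stands in for a server written before the oneOf: it only reads `query`."""

    def get_sql_info(self):
        return {}  # advertises nothing about Substrait

    def execute(self, command):
        # An absent SQL string is something it cannot handle, so it errors out.
        if not command.query:
            raise ValueError("empty SQL string")
        return "ran SQL: " + command.query


class SubstraitServer(LegacyServer):
    """Stands in for a server that understands both variants."""

    def get_sql_info(self):
        return {SUPPORTS_SUBSTRAIT: True}

    def execute(self, command):
        if command.substrait_plan is not None:
            return "ran plan of %d bytes" % len(command.substrait_plan)
        return super().execute(command)


def run(server, sql, plan=None):
    """Send the plan only if the server advertises support; otherwise send SQL."""
    if plan is not None and server.get_sql_info().get(SUPPORTS_SUBSTRAIT, False):
        return server.execute(Command(substrait_plan=plan))
    return server.execute(Command(query=sql))
```

With this sketch, `run(LegacyServer(), "SELECT name FROM user", b"\x01\x02")` falls back to the SQL string, while the same call against `SubstraitServer()` sends the plan. Skipping the check and sending a plan straight to `LegacyServer` raises the "empty SQL string" error, which is the not-obvious-but-still-a-clue failure described above.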