We could also add, say, CommandSubstraitQuery as a distinct message, and older
servers would just reject it as an unknown request type.
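
As a rough sketch of how that rejection might play out (plain Python, no Arrow dependency): the `Server` class, the handler functions, and the type URL strings are illustrative assumptions, and `CommandSubstraitQuery` is only the name floated above, not an existing message.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Illustrative type URLs; the Substrait one is hypothetical (the proposal above).
PREFIX = "type.googleapis.com/arrow.flight.protocol.sql."
STATEMENT_QUERY = PREFIX + "CommandStatementQuery"
SUBSTRAIT_QUERY = PREFIX + "CommandSubstraitQuery"


@dataclass
class Command:
    """Stand-in for a packed command: a type URL plus opaque payload bytes."""
    type_url: str
    payload: bytes


class UnknownCommandError(Exception):
    """Raised when a server has no handler for the command's type."""


class Server:
    """Toy server that dispatches purely on the command's type URL."""

    def __init__(self, handlers: Dict[str, Callable[[bytes], str]]) -> None:
        self.handlers = handlers

    def get_flight_info(self, cmd: Command) -> str:
        handler = self.handlers.get(cmd.type_url)
        if handler is None:
            # An older server lands here for any message type it predates.
            raise UnknownCommandError(f"unknown request type: {cmd.type_url}")
        return handler(cmd.payload)


def run_sql(payload: bytes) -> str:
    return "sql:" + payload.decode("utf-8")


def run_substrait(payload: bytes) -> str:
    return f"substrait plan ({len(payload)} bytes)"


# The old server predates the new message; the new server handles both.
old_server = Server({STATEMENT_QUERY: run_sql})
new_server = Server({STATEMENT_QUERY: run_sql, SUBSTRAIT_QUERY: run_substrait})
```

With this shape, existing SQL commands are untouched on both servers, and a Substrait command sent to the old server fails with an explicit "unknown request type" error rather than being misread as an empty SQL string.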

-David

On Fri, Mar 4, 2022, at 17:01, Micah Kornfield wrote:
>>
>> 1. How does a server report that it supports each command type? Initial
>> thought is a property in GetSqlInfo.
>
>
> This sounds reasonable.
>
>
>> What happens to client code written prior to changing the command type
>> to be a oneOf field? Same for servers.
>
>
> It is transparent to older clients (I'm 99% sure the wire protocol
> doesn't change).  Servers are a little harder.  The one saving grace is I
> don't think an empty/not-present SQL string would be something most servers
> could handle, so they would probably error with something that, while
> not obvious, would give a clue to the clients (but hopefully this would be a
> non-issue because clients wishing to use this feature would check the
> capabilities first).
>
> -Micah
>
> On Fri, Mar 4, 2022 at 1:50 PM James Duong <jam...@bitquilltech.com.invalid>
> wrote:
>
>> It sounds like an interesting and useful project to use Substrait as an
>> alternative to SQL strings.
>>
>> Important aspects to spec out are:
>> 1. How does a server report that it supports each command type? Initial
>> thought is a property in GetSqlInfo.
>> 2. What happens to client code written prior to changing the command type
>> to be a oneOf field? Same for servers.
>> More generally, how should backward compatibility work, and what should
>> happen if a client sends an unsupported command type to a server?
>> 3. Should inputs to catalog RPC calls also accept Substrait structures?
>>
>> On Thu, Mar 3, 2022 at 11:00 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>>
>> > @James Duong <jam...@bitquilltech.com>
>> >
>> > You are absolutely right, I realized this and confirmed whether this
>> > would be possible with Jacques to double-check.
>> > It would amount to what I might call "dollar-store Substrait." It's not
>> > elegant or a good solution, but definitely presents a good duct-tape hack
>> > and is a crafty idea.
>> >
>> > I agree with Jacques -- when you think about FlightSQL, what you are
>> > attempting with a query isn't necessarily SQL, but a general data-compute
>> > operation.
>> > SQL just so happens to be a fairly universal way to express them, with an
>> > ANSI standard, but FlightSQL doesn't recognize any particular subset of it
>> > and for all intents and purposes it doesn't matter what the operation
>> > string contains.
>> >
>> > Substrait would make a fantastic logical next feature because it's
>> > targeted as a specification for expressing relational algebra and
>> > data-compute operations.
>> > This more or less equates to SQL strings (in my mind at least) with a much
>> > better toolkit and Dev UX. If there is anything I can do to help move this
>> > forward, please let me know because I am extremely motivated to do so.
>> >
>> > @David Li <git...@lidavidm.me>
>> >
>> > Also agreed. Substrait is put together by folks much smarter than myself,
>> > and if I had to hedge my bets, I'd put money on it being the future of
>> > data-compute interop.
>> > I would love nothing more than to adopt this technology and push it along.
>> >
>> > Your project does sound interesting - basically, it sounds like a tabular
>> >> data storage service with query pushdown?
>> >>
>> >
>> > Yeah this is more or less the details of it (my personal email, with
>> > discretion assumed, is always open)
>> >
>> > Imagine an environment where a backend wants to advertise some kind of
>> > schema/data catalog.
>> >
>> > A central service then introspects these backends and dynamically
>> > generates an API from the data catalogues/schemas, where requests get
>> > proxied to the underlying backend service for each schema to actually be
>> > executed.
>> >
>> > In text, the flow would look something like:
>> >
>> >
>> >                                                    <----> Data Provider Backend 0
>> > Client <----> Central Service <----> Generated API <----> Data Provider Backend 1
>> >                                                    <----> Data Provider Backend 2
>> >
>> >
>> >
>> > On Thu, Mar 3, 2022 at 5:52 PM David Li <lidav...@apache.org> wrote:
>> >
>> >> Gavin, thanks for sharing. I'm not so sure you'll find an alternative to
>> >> Substrait, at least one that isn't even more nascent or one that's very
>> >> tied to a particular language, so perhaps it might be better to get
>> >> involved in Substrait and see if it suits your needs? Convincing a team to
>> >> try something new can be hard, though, and it is somewhat of a moving
>> >> target - but Flight SQL is in a similar spot, I think, as it's still
>> >> getting enhancements.
>> >>
>> >> Your project does sound interesting - basically, it sounds like a tabular
>> >> data storage service with query pushdown?
>> >>
>> >> On Thu, Mar 3, 2022, at 19:58, Jacques Nadeau wrote:
>> >> > James, I agree that you could use JSON but that feels a bit hacky
>> >> > (misuse of the paradigm). Instead, I'd really like to do something like
>> >> > what David is suggesting: support Substrait as an alternative to a SQL
>> >> > string. Something like this:
>> >> >
>> >> > https://github.com/jacques-n/arrow/commit/e22674fa882e77c2889cf95f69f6e3701db362bc
>> >> >
>> >> > It would be great if someone wanted to pick this up. It would be a nice
>> >> > enhancement to FlightSQL (and provide a structured way to express
>> >> > operations).
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Mar 3, 2022 at 4:56 PM James Duong <jam...@bitquilltech.com.invalid>
>> >> > wrote:
>> >> >
>> >> >> In the same way that you could write an ODBC driver that takes in text
>> >> >> that's not SQL, you could write a Flight SQL server that takes in text
>> >> >> that's JSON.
>> >> >> Flight SQL doesn't parse the query, so you could create commands that are
>> >> >> just JSON text.
>> >> >>
>> >> >> Is that the only bit you need, Gavin?
>> >> >>
>> >> >> On Thu, Mar 3, 2022 at 4:26 PM Gavin Ray <ray.gavi...@gmail.com> wrote:
>> >> >>
>> >> >> > I am enthusiastic about Substrait and have followed its progress eagerly
>> >> >> > =D
>> >> >> >
>> >> >> > When I presented it as a tentative option, there were reservations because
>> >> >> > of the project/spec being young and the functionality still being
>> >> >> > fleshed out.
>> >> >> > I think if I were having this conversation in, say, 8-16 months, it would
>> >> >> > have been an easy choice, no doubt.
>> >> >> >
>> >> >> > On a public mailing list (and I can share more details in private if you're
>> >> >> > curious), the gist of it is this:
>> >> >> >
>> >> >> > Some well-defined/backed-by-mature-tech solution for expressing data
>> >> >> > compute operations between services would be a useful thing to have
>> >> >> > (especially if it's language-agnostic).
>> >> >> >
>> >> >> > The goal is for an "implementing service" to have:
>> >> >> > - An introspectable schema (i.e., "describe yourself to me")
>> >> >> > - A query/operation execution endpoint (i.e., "perform this operation on
>> >> >> >   your data")
>> >> >> >
>> >> >> > With FlightSQL this is possible, I believe, but it requires the operation to
>> >> >> > be expressed as a SQL string, which isn't ideal.
>> >> >> >
>> >> >> > Working with some programmatic, structured object that has the same
>> >> >> > semantics ("Logical Plan", or whatnot) as a SQL query would be
>> >> >> > a better experience.
>> >> >> > (Jacques is on to something here!)
>> >> >> >
>> >> >> > This interface between services would be somewhat the equivalent of an
>> >> >> > "SDK", so it would be nice to have a strongly-typed library for expressing
>> >> >> > and building up query/data-compute ops.
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Mar 3, 2022 at 3:17 PM David Li <lidav...@apache.org> wrote:
>> >> >> >
>> >> >> > > You probably want Substrait: https://substrait.io/
>> >> >> > >
>> >> >> > > Which is being worked on by several people, including Arrow community
>> >> >> > > members.
>> >> >> > >
>> >> >> > > It might be interesting to generalize Flight SQL to include support for
>> >> >> > > Substrait. I'm curious what your application is, if you're able to share
>> >> >> > > more.
>> >> >> > >
>> >> >> > > -David
>> >> >> > >
>> >> >> > > On Thu, Mar 3, 2022, at 18:05, Gavin Ray wrote:
>> >> >> > > > Hiya,
>> >> >> > > >
>> >> >> > > > I am drafting a proposal for a way to enable services to express data
>> >> >> > > > compute operations to each other.
>> >> >> > > >
>> >> >> > > > However, I think it'll be difficult to get buy-in if the only
>> >> >> > > > representation for queries is as SQL strings.
>> >> >> > > >
>> >> >> > > > Is there any kind of lower-level API that can be used to express
>> >> >> > > > operations?
>> >> >> > > >
>> >> >> > > > IE instead of "SELECT name FROM user"
>> >> >> > > >
>> >> >> > > > A structured representation like:
>> >> >> > > > {
>> >> >> > > >   "op": "query",
>> >> >> > > >   "schema": "user",
>> >> >> > > >   "project": ["name"]
>> >> >> > > > }
>> >> >> > > >
>> >> >> > > > Or maybe this is a bad idea/doesn't make sense?
>> >> >> > > >
>> >> >> > > > Thank you =)
>> >> >> > >
>> >> >> >
>> >> >>
>> >> >>
>> >> >> --
>> >> >>
>> >> >> *James Duong*
>> >> >> Lead Software Developer
>> >> >> Bit Quill Technologies Inc.
>> >> >> Direct: +1.604.562.6082 | jam...@bitquilltech.com
>> >> >> https://www.bitquilltech.com
>> >> >>
>> >> >> This email message is for the sole use of the intended recipient(s) and may
>> >> >> contain confidential and privileged information.  Any unauthorized review,
>> >> >> use, disclosure, or distribution is prohibited.  If you are not the
>> >> >> intended recipient, please contact the sender by reply email and destroy
>> >> >> all copies of the original message.  Thank you.
>> >> >>
>> >>
>> >
>>
