I think we could probably expose the oneof behavior without exposing the
protobuf functions. As for Any... hmm. I guess we could expose it as two
fields: type and data. Then users could use it for whatever they like, but
if people wanted to treat it as an Any, it would work. (Basically, a user
could easily use Any with it, but they could also use any other mechanism.)
At least in Java, the Any concepts are pretty simple/DIY. Are other language
bindings less DIY?
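
To make the two-field idea concrete, here is a rough Python sketch of what
a wrapper with oneof-like behavior might look like without touching protobuf
directly. All names here (ControlMessage, ExchangeMessage, handle) are
hypothetical and not part of Flight; FlightData is stood in for by raw bytes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControlMessage:
    """Hypothetical Any-like payload: a type identifier plus opaque bytes."""
    type: str    # could hold a type URL if the application treats this as Any
    data: bytes  # application-defined serialized payload


@dataclass(frozen=True)
class ExchangeMessage:
    """Hypothetical wrapper with oneof semantics: exactly one field is set."""
    flight_data: Optional[bytes] = None  # stand-in for a serialized FlightData
    control: Optional[ControlMessage] = None

    def __post_init__(self):
        # Reject messages where both fields or neither field is set.
        if (self.flight_data is None) == (self.control is None):
            raise ValueError("exactly one of flight_data or control must be set")


def handle(message: ExchangeMessage) -> str:
    """Dispatch on whichever field is set; control messages default to a no-op."""
    if message.control is not None:
        return "ignored control message of type " + message.control.type
    return "received %d bytes of data" % len(message.flight_data)
```

An application that wants Any semantics could put a type URL in `type` and a
serialized message in `data`; one that does not can use the pair however it
likes.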

I'm *not* hardcore against the empty FlightData + metadata but it just
seemed a bit janky.

Thinking about the control message/wrapper object thing, I wonder if we
should redefine DoPut and DoGet to have the same property if we think it is
a good idea...

On Wed, Oct 16, 2019 at 5:13 PM David Li <li.david...@gmail.com> wrote:

> I was definitely considering having control messages without data, and
> I thought that could be encoded by a FlightData with only app_metadata
> set. I think I understand your position now: FlightData should always
> carry (some) data (with optional metadata)?
>
> That makes sense to me, and is consistent with the documentation on
> FlightData in the Protobuf file. I was worried about having a
> redundant metadata field, but oneof prevents that from happening, and
> overall having a clear separation between data and control messages is
> cleaner.
>
> As for using Protobuf's Any: so far, we've refrained from exposing
> Protobuf by using bytes, would we want to change that now?
>
> Best,
> David
>
> On 10/16/19, Jacques Nadeau <jacq...@apache.org> wrote:
> > Hey David,
> >
> > RE: Async: I was trying to match the pattern we use for DoGet/DoPut for
> > async. Yes, I was thinking more of Java, given gRPC-Java's async-always
> > pattern.
> >
> > On the comment around the FlightData, I think it is overloading the
> > message to use metadata for this. If I want to send a control message
> > independently of the data message, I would have to define something like
> > an empty flight data message that has custom metadata. Why not support a
> > container object with a oneof{FlightData, Any} in it instead so users can
> > add more data as desired. The default impl could be a noop for the Any
> > messages.
> >
> > On Tue, Oct 15, 2019 at 6:50 PM David Li <li.david...@gmail.com> wrote:
> >
> >> Hi Jacques,
> >>
> >> Thanks for the comments.
> >>
> >> - I do agree DoExchange is a better name!
> >> - FlightData already has metadata fields as a result of prior
> >> proposals, so I don't think we need a new message to carry that kind
> >> of information.
> >> - I like the suggestion of an async handler to handle incoming
> >> messages as the fundamental API; it would actually be quite natural to
> >> implement in Flight/Java. I will note that it's not possible in
> >> C++/Python without spawning a thread, though. (In essence, gRPC-Java
> >> is async-always and gRPC-C++ is sync-always.) There are experimental
> >> C++ APIs that would let us do something similar to Java, but those are
> >> only in relatively recent gRPC versions and are still under
> >> development (contrary to the interceptor APIs which have been around
> >> for quite a while).
> >>
> >> Thanks,
> >> David
> >>
> >> On 10/15/19, Jacques Nadeau <jacq...@apache.org> wrote:
> >> > I like it. Added some comments to the doc. It might be worth discussing
> >> > here, depending on your thoughts.
> >> >
> >> > On Tue, Oct 15, 2019 at 7:11 AM David Li <li.david...@gmail.com> wrote:
> >> >
> >> >> Hey Ryan,
> >> >>
> >> >> Thanks for the comments.
> >> >>
> >> >> Concrete example: I've edited the doc to provide a Python strawman.
> >> >>
> >> >> Sync vs async: while I don't touch on it, you could interleave uploads
> >> >> and downloads if you were so inclined. Right now, synchronous APIs
> >> >> make this error-prone, e.g. if both client and server wait for each
> >> >> other due to an application logic bug. (gRPC doesn't give us the
> >> >> ability to have per-read timeouts, only an overall timeout.) As an
> >> >> example of this happening with DoPut, see ARROW-6063:
> >> >> https://issues.apache.org/jira/browse/ARROW-6063
> >> >>
> >> >> This is mostly tangential though; eventually we will want to design
> >> >> asynchronous APIs for Flight as a whole. A bidirectional stream like
> >> >> this (and like DoPut) just makes these pitfalls easier to run into.
> >> >>
> >> >> Using DoPut+DoGet: I discussed this in the proposal, but the main
> >> >> concern is that depending on how you deploy, two separate calls could
> >> >> get routed to different instances. Additionally, gRPC has some
> >> >> reconnection behaviors; if the server goes away in between the two
> >> >> calls, but it then restarts or there is another instance available,
> >> >> the client will happily reconnect to the new server without warning.
> >> >>
> >> >> Thanks,
> >> >> David
> >> >>
> >> >> On 10/15/19, Ryan Murray <rym...@dremio.com> wrote:
> >> >> > Hey David,
> >> >> >
> >> >> > I think this proposal makes a lot of sense. I like it and the
> >> >> > possibility of remote compute via Arrow buffers. One thing that would
> >> >> > help me would be a concrete example of the API in a real-life use
> >> >> > case. Also, what would the client experience be in terms of sync vs
> >> >> > async? Would the client block till the bidirectional call returns,
> >> >> > i.e. c = flight.vector_mult(a, b), or would the client wait to be
> >> >> > signaled that computation was done? If the latter, how is that
> >> >> > different from a DoPut then DoGet? I suppose that this could be
> >> >> > implemented without extending the RPC interface but rather by a
> >> >> > function/util?
> >> >> >
> >> >> >
> >> >> > Best,
> >> >> >
> >> >> > Ryan
> >> >> >
> >> >> > On Sun, Oct 13, 2019 at 9:24 PM David Li <li.david...@gmail.com> wrote:
> >> >> >
> >> >> >> Hi all,
> >> >> >>
> >> >> >> We've been using Flight quite successfully so far, but we have
> >> >> >> identified a new use case on the horizon: being able to both send and
> >> >> >> retrieve Arrow data within a single RPC call. To that end, I've
> >> >> >> written up a proposal for a new RPC method:
> >> >> >>
> >> >> >> https://docs.google.com/document/d/1Hh-3Z0hK5PxyEYFxwVxp77jens3yAgC_cpp0TGW-dcw/edit?usp=sharing
> >> >> >>
> >> >> >> Please let me know if you can't view or comment on the document. I'd
> >> >> >> appreciate any feedback; I think this is a relatively straightforward
> >> >> >> addition - it is essentially "DoPutThenGet".
> >> >> >>
> >> >> >> This is a format change and would require a vote. I've decided to
> >> >> >> table the other format change I had proposed (on DoPut), as it doesn't
> >> >> >> functionally change Flight, just the interpretation of the semantics.
> >> >> >>
> >> >> >> Thanks,
> >> >> >> David
> >> >> >>
> >> >> >
> >> >> >
> >> >> > --
> >> >> >
> >> >> > Ryan Murray  | Principal Consulting Engineer
> >> >> >
> >> >> > +447540852009 | rym...@dremio.com
> >> >> >
> >> >> > <https://www.dremio.com/>
> >> >> > Check out our GitHub <https://www.github.com/dremio>, join our
> >> >> > community
> >> >> > site <https://community.dremio.com/> & Download Dremio
> >> >> > <https://www.dremio.com/download>
> >> >> >
> >> >>
> >> >
> >>
> >
>
