Hey David,

RE: Async: I was trying to match the pattern we use for DoGet/DoPut for async. Yes, I was thinking more in Java terms, given gRPC-Java's async-always pattern.
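
To make the async point a bit more concrete, here is a minimal sketch of the listener-style handler I have in mind. It is written in Python only for brevity and mirrors the onNext/onError/onCompleted shape of gRPC-Java's StreamObserver. Every name in it (ExchangeListener, EchoUpper, drive) is hypothetical and not part of Flight, and the "transport" is just a loop over a list.

```python
class ExchangeListener:
    """Hypothetical callback interface shaped like gRPC-Java's StreamObserver."""

    def on_next(self, message):
        raise NotImplementedError

    def on_error(self, error):
        raise NotImplementedError

    def on_completed(self):
        raise NotImplementedError


class EchoUpper(ExchangeListener):
    """Example handler: replies to each incoming message as soon as it arrives."""

    def __init__(self, send):
        self._send = send  # callable used to write a reply to the peer
        self.completed = False

    def on_next(self, message):
        # No blocking read here: the transport pushes messages to us.
        self._send(message.upper())

    def on_error(self, error):
        self.completed = True

    def on_completed(self):
        self.completed = True


def drive(listener, incoming):
    """Stand-in for the transport: pushes each incoming message to the listener."""
    try:
        for message in incoming:
            listener.on_next(message)
    except Exception as exc:
        listener.on_error(exc)
    else:
        listener.on_completed()


replies = []
handler = EchoUpper(replies.append)
drive(handler, [b"a", b"b"])
```

The point of the shape is that the handler never blocks on a read, so uploads and downloads can interleave without either side waiting on the other.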
On the FlightData comment: I think using metadata for this overloads the message. If I want to send a control message independently of a data message, I would have to define something like an empty FlightData message that carries custom metadata. Why not support a container object with a oneof{FlightData, Any} in it instead, so users can add more data as desired? The default implementation could be a no-op for the Any messages.

On Tue, Oct 15, 2019 at 6:50 PM David Li <li.david...@gmail.com> wrote:

> Hi Jacques,
>
> Thanks for the comments.
>
> - I do agree DoExchange is a better name!
> - FlightData already has metadata fields as a result of prior
> proposals, so I don't think we need a new message to carry that kind
> of information.
> - I like the suggestion of an async handler to handle incoming
> messages as the fundamental API; it would actually be quite natural to
> implement in Flight/Java. I will note that it's not possible in
> C++/Python without spawning a thread, though. (In essence, gRPC-Java
> is async-always and gRPC-C++ is sync-always.) There are experimental
> C++ APIs that would let us do something similar to Java, but those are
> only in relatively recent gRPC versions and are still under
> development (contrary to the interceptor APIs which have been around
> for quite a while).
>
> Thanks,
> David
>
> On 10/15/19, Jacques Nadeau <jacq...@apache.org> wrote:
> > I like it. Added some comments to the doc. Might be worth discussing
> > here depending on your thoughts.
> >
> > On Tue, Oct 15, 2019 at 7:11 AM David Li <li.david...@gmail.com> wrote:
> >
> >> Hey Ryan,
> >>
> >> Thanks for the comments.
> >>
> >> Concrete example: I've edited the doc to provide a Python strawman.
> >>
> >> Sync vs async: while I don't touch on it, you could interleave uploads
> >> and downloads if you were so inclined. Right now, synchronous APIs
> >> make this error-prone, e.g. if both client and server wait for each
> >> other due to an application logic bug.
> >> (gRPC doesn't give us the
> >> ability to have per-read timeouts, only an overall timeout.) As an
> >> example of this happening with DoPut, see ARROW-6063:
> >> https://issues.apache.org/jira/browse/ARROW-6063
> >>
> >> This is mostly tangential though; eventually we will want to design
> >> asynchronous APIs for Flight as a whole. A bidirectional stream like
> >> this (and like DoPut) just makes these pitfalls easier to run into.
> >>
> >> Using DoPut+DoGet: I discussed this in the proposal, but the main
> >> concern is that depending on how you deploy, two separate calls could
> >> get routed to different instances. Additionally, gRPC has some
> >> reconnection behaviors; if the server goes away in between the two
> >> calls, but it then restarts or there is another instance available,
> >> the client will happily reconnect to the new server without warning.
> >>
> >> Thanks,
> >> David
> >>
> >> On 10/15/19, Ryan Murray <rym...@dremio.com> wrote:
> >> > Hey David,
> >> >
> >> > I think this proposal makes a lot of sense. I like it and the
> >> > possibility of remote compute via Arrow buffers. One thing that would
> >> > help me would be a concrete example of the API in a real-life use
> >> > case. Also, what would the client experience be in terms of sync vs
> >> > async? Would the client block until the bidirectional call returns,
> >> > i.e. c = flight.vector_mult(a, b), or would the client wait to be
> >> > signaled that computation was done? If the latter, how is that
> >> > different from a DoPut then DoGet? I suppose that this could be
> >> > implemented without extending the RPC interface but rather by a
> >> > function/util?
> >> >
> >> >
> >> > Best,
> >> >
> >> > Ryan
> >> >
> >> > On Sun, Oct 13, 2019 at 9:24 PM David Li <li.david...@gmail.com> wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> We've been using Flight quite successfully so far, but we have
> >> >> identified a new use case on the horizon: being able to both send and
> >> >> retrieve Arrow data within a single RPC call. To that end, I've
> >> >> written up a proposal for a new RPC method:
> >> >>
> >> >> https://docs.google.com/document/d/1Hh-3Z0hK5PxyEYFxwVxp77jens3yAgC_cpp0TGW-dcw/edit?usp=sharing
> >> >>
> >> >> Please let me know if you can't view or comment on the document. I'd
> >> >> appreciate any feedback; I think this is a relatively straightforward
> >> >> addition - it is essentially "DoPutThenGet".
> >> >>
> >> >> This is a format change and would require a vote. I've decided to
> >> >> table the other format change I had proposed (on DoPut), as it doesn't
> >> >> functionally change Flight, just the interpretation of the semantics.
> >> >>
> >> >> Thanks,
> >> >> David
> >> >>
> >> >
> >> > --
> >> >
> >> > Ryan Murray | Principal Consulting Engineer