I see, thanks. That sounds reasonable, and indeed if you are subscribing to 
lots of data sources (instead of just trying to maximize throughput, as I think 
most people have done before) async may be helpful. There are actually two 
features being described here, though:
 1. The C++ Flight client needs some way to wrap an existing gRPC client, much 
like it can in Java. This may be tricky in C++ (and will require applications 
to carefully manage gRPC versions) and will probably not be possible for Python.
 2. The C++ Flight client should offer async APIs (perhaps callback-based APIs 
or perhaps completion-queue based or something else). 
https://issues.apache.org/jira/browse/ARROW-1009 may be relevant here.
I think gRPC C++ has experimental callback-based async APIs now, but it may be 
hard for us to actually take advantage of them unless we can draw a hard line 
on the minimum required gRPC version. The main question will probably be 
finding someone to do the work.
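
To make the two API styles in point 2 concrete, here is a toy sketch that uses 
only the C++ standard library. It is not gRPC or Flight code: 
ToyCompletionQueue, PendingCall and RunDemo are hypothetical names made up for 
illustration, and only the shape of Next(void** tag, bool* ok) is modeled on 
gRPC's CompletionQueue. It shows a polled queue of (tag, ok) events, with a 
thin callback layer on top that treats each tag as a pointer to a callback.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for a completion queue (NOT the gRPC class): producers post
// (tag, ok) events and a consumer blocks in Next() until one is available.
class ToyCompletionQueue {
 public:
  void Post(void* tag, bool ok) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      events_.emplace(tag, ok);
    }
    cv_.notify_one();
  }

  void Shutdown() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      shutdown_ = true;
    }
    cv_.notify_all();
  }

  // Returns false once the queue has been shut down and fully drained.
  bool Next(void** tag, bool* ok) {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !events_.empty() || shutdown_; });
    if (events_.empty()) return false;
    *tag = events_.front().first;
    *ok = events_.front().second;
    events_.pop();
    return true;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::pair<void*, bool>> events_;
  bool shutdown_ = false;
};

// Callback-style layer: each tag points at one of these.
struct PendingCall {
  std::function<void(bool ok)> on_done;
};

// Hypothetical demo: a "heartbeat" and a "subscription" are in flight at the
// same time, and a single polling loop dispatches both completions.
std::vector<std::string> RunDemo() {
  ToyCompletionQueue cq;
  std::vector<std::string> log;  // only touched by the polling loop below

  PendingCall heartbeat{[&log](bool ok) {
    log.push_back(ok ? "heartbeat ok" : "heartbeat failed");
  }};
  PendingCall subscription{[&log](bool ok) {
    log.push_back(ok ? "subscription ok" : "subscription failed");
  }};

  // Each worker thread stands in for the network completing one call.
  std::thread t1([&cq, &heartbeat] { cq.Post(&heartbeat, true); });
  std::thread t2([&cq, &subscription] { cq.Post(&subscription, true); });
  t1.join();
  t2.join();
  cq.Shutdown();

  // Completion-queue style: poll for events, then dispatch by tag.
  void* tag = nullptr;
  bool ok = false;
  while (cq.Next(&tag, &ok)) {
    static_cast<PendingCall*>(tag)->on_done(ok);
  }
  return log;
}
```

Calling RunDemo() returns one log entry per completed call; the order depends 
on which worker thread posts first. A real design would also have to decide 
who owns the polling thread, which is part of what makes this hard to expose 
from Flight.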

Best,
David

On Thu, Jun 3, 2021, at 12:11, Nate Bauernfeind wrote:
> In addition to Arrow Flight we have other gRPC APIs that work together as a
> whole. For example, the API client establishes a session with the server.
> Then the client tells the server to create a derivative data stream by
> filtering/joining/computing/etc several source streams. The server will
> keep this derivative stream available as long as the session exists. This
> session is kept alive with a heartbeat. So a small part of this system
> requires a regular (but not overwhelming) heartbeat. If this derivative
> stream has any data source that is live (aka regularly updating) then the
> derivative is also live. The API client would likely want to subscribe to
> the derivative stream, but still needs to heartbeat. The subscription
> (communicated via Flight's DoExchange) should be able to last forever.
> Naturally, this API pattern is simply better suited to the async pattern.
> 
> Ideally, we could re-use the same http2 connections, task queue and/or
> thread-pool between the arrow flight api and the other grpc apis (talking
> to the same endpoint).
> 
> The gRPC callback pattern in Java is nice; it's a bit of a shame that gRPC
> hasn't settled on a c++ callback pattern.
> 
> These are the kinds of ideas that it enables:
> - Subscribe to multiple ticking sources simultaneously.
> - Can have multiple gRPC requests outstanding at the same time
> (particularly useful if server is in a remote data center with high RTT).
> - Can communicate with multiple remote hosts simultaneously.
> - In general, enables the client to be event driven.
> 
> Note that our clients tend to be light; they typically listen to derived
> data and then act based on the information without further computation
> locally.
> 
> The gRPC c++ async library does indeed look a bit under-documented. There
> are a few blog posts that highlight some of the surprises, but async
> behavior is a requirement for what we're working on. (for async-cpp-grpc
> surprises see:
> https://www.gresearch.co.uk/article/lessons-learnt-from-writing-asynchronous-streaming-grpc-services-in-c/
> ).
> 
> Nate
> 
> On Wed, Jun 2, 2021 at 8:44 PM David Li <lidav...@apache.org> wrote:
> 
> > Hey Nate,
> >
> > I think there's an open JIRA for something like this. I'd love to have
> > something that plays nicely with asyncio/trio in Python and is hopefully
> > more efficient. (I think it would also let us finally have per-message
> > timeouts instead of only a per-call deadline.) There are some challenges
> > though, e.g. we wouldn't expose gRPC's event loop directly so that we could
> > support other transports, but then that leaves more things to design. I
> > also recall the async C++ APIs being very underdocumented; I get the sense
> > that they aren't actually used except to improve some benchmarks. I'll note
> > for instance gRPC in Python, which offers async support, uses the "core"
> > APIs directly and doesn't use anything C++ offers.
> >
> > But long story short, if you're interested in this I think it would be a
> > useful addition. What sorts of things would it enable for you?
> >
> > -David
> >
> > On Wed, Jun 2, 2021, at 16:20, Nate Bauernfeind wrote:
> > > It seems to me that the c++ arrow flight implementation uses only the
> > > synchronous version of the gRPC API. gRPC supports asynchronous message
> > > delivery in C++ via a CompletionQueue that must be polled. Has there been
> > > > any desire to standardize on a solution for asynchronous use cases,
> > > > perhaps delivered via a provided CompletionQueue?
> > >
> > > For a simple async grpc c++ example you can look here:
> > >
> > > > https://github.com/grpc/grpc/blob/master/examples/cpp/helloworld/greeter_async_client.cc
> > >
> > > Thanks,
> > > Nate