[ 
https://issues.apache.org/jira/browse/FLINK-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420638#comment-15420638
 ] 

Gyula Fora commented on FLINK-4391:
-----------------------------------

We have also done something similar at King for handling heavy db operations in 
the following way:

We had a parallel processing operator that could process records either in the 
main thread (as now) or in a single background thread.
There are dedicated processing methods for each, and the user code decides for 
every record where to process it: either in the main thread, in a blocking way, 
or by sending it to the background thread's queue. 

We also set a size limit on the number of queued elements.
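
To make the idea concrete, here is a rough sketch in plain Java of how such an 
operator could look. This is not our actual code and not Flink API: the class 
DualPathProcessor, its methods, and the way results are collected are all 
made up for illustration. It only shows the per-record routing decision, the 
single background thread, and the bounded queue.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch (not Flink API): one main path, one background path. */
class DualPathProcessor<IN, OUT> implements AutoCloseable {
    private final BlockingQueue<IN> queue;        // bounded queue for background work
    private final Predicate<IN> runInBackground;  // user code decides per record
    private final Function<IN, OUT> mainFn;       // runs in the caller's thread
    private final Function<IN, OUT> backgroundFn; // runs in the background thread
    private final List<OUT> output = new CopyOnWriteArrayList<>();
    private final Thread worker;
    private volatile boolean running = true;

    DualPathProcessor(int maxQueued, Predicate<IN> runInBackground,
                      Function<IN, OUT> mainFn, Function<IN, OUT> backgroundFn) {
        this.queue = new ArrayBlockingQueue<>(maxQueued);
        this.runInBackground = runInBackground;
        this.mainFn = mainFn;
        this.backgroundFn = backgroundFn;
        this.worker = new Thread(this::drain, "background-processor");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Processes inline, or enqueues and blocks while the queue is full. */
    void process(IN record) {
        if (runInBackground.test(record)) {
            try {
                queue.put(record); // size limit reached -> caller waits here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while enqueueing", e);
            }
        } else {
            output.add(mainFn.apply(record));
        }
    }

    /** Background loop: keeps draining until closed and the queue is empty. */
    private void drain() {
        try {
            while (running || !queue.isEmpty()) {
                IN record = queue.poll(50, TimeUnit.MILLISECONDS);
                if (record != null) {
                    output.add(backgroundFn.apply(record));
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops accepting work and waits for the queued records to finish. */
    @Override
    public void close() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<OUT> results() {
        return output;
    }
}
```

In this sketch, results from the two paths interleave in completion order, so 
ordering is only preserved within each path. It also ignores keyed state 
entirely, which is exactly the part that gets tricky in a real operator.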

State access and knowing what keys to set in the backend become a little 
tricky if we want to keep good performance.

> Provide support for asynchronous operations over streams
> --------------------------------------------------------
>
>                 Key: FLINK-4391
>                 URL: https://issues.apache.org/jira/browse/FLINK-4391
>             Project: Flink
>          Issue Type: New Feature
>          Components: DataStream API
>            Reporter: Jamie Grier
>
> Many Flink users need to do asynchronous processing driven by data from a 
> DataStream.  The classic example would be joining against an external 
> database in order to enrich a stream with extra information.
> It would be nice to add general support for this type of operation in the 
> Flink API.  Ideally this could simply take the form of a new operator that 
> manages async operations, keeps so many of them in flight, and then emits 
> results to downstream operators as the async operations complete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
