Hi Ajay, Technically, it will immediately block the thread of MyKeyedProcessFunction subtask scheduled to some slot and basically block processing of the key range assigned to this subtask. Practically, I agree with Rong's answer. Depending on the topology of your inputStream, it can eventually block a lot of stuff. In general, I think, it is not recommended to perform blocking operations in process record functions. You could consider AsyncIO [1] to unblock the task thread.
Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/asyncio.html On Thu, Feb 14, 2019 at 6:03 AM Rong Rong <walter...@gmail.com> wrote: > Hi Ajay, > > Flink handles "backpressure" in a graceful way so that it doesn't get > affected when your processing pipeline is occasionally slowed down. > I think the following articles will help [1,2]. > > In your specific case: the "KeyBy" operation will re-hash data so they can > be reshuffled from all input consumers to all your process operators (in > this case the MyKeyedProcessFunction). If one of the process operator is > backpressured, it will back track all the way to the source. > So, my understanding is that: since there's the reshuffling, if one of the > process function is backpressured, it will potentially affect all the > source operators. > > Thanks, > Rong > > [1] https://www.ververica.com/blog/how-flink-handles-backpressure > [2] > https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/back_pressure.html > > On Wed, Feb 13, 2019 at 8:50 AM Aggarwal, Ajay <ajay.aggar...@netapp.com> > wrote: > >> I was wondering what is the impact if one of the stream operator function >> occasionally takes too long to process the event. Given the following >> simple flink job >> >> >> >> inputStream >> >> .KeyBy (“tenantId”) >> >> .process ( new MyKeyedProcessFunction()) >> >> >> >> , if occasionally MyKeyedProcessFunction takes too long (say ~5-10 >> minutes) to process an incoming element, what is the impact on overall >> pipeline? Is the impact limited to >> >> 1. Specific key for which MyKeyedProcessFunction is currently taking >> too long to process an element, or >> 2. Specific Taskslot, where MyKeyedProcessFunction is currently >> taking too long to process an element, i.e. impacting multiple keys, or >> 3. Entire inputstream ? >> >> >> >> Also what is the built in resiliency in these cases? Is there a concept >> of timeout for each operator function? >> >> >> >> Ajay >> >