Hi Ajay,

Technically, it will immediately block the thread of MyKeyedProcessFunction
subtask scheduled to some slot and basically block processing of the key
range assigned to this subtask.
Practically, I agree with Rong's answer. Depending on the topology of your
inputStream, it can eventually block a lot of stuff.
In general, I think, it is not recommended to perform blocking operations
in process record functions. You could consider AsyncIO [1] to unblock the
task thread.

Best,
Andrey

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/asyncio.html

On Thu, Feb 14, 2019 at 6:03 AM Rong Rong <walter...@gmail.com> wrote:

> Hi Ajay,
>
> Flink handles "backpressure" in a graceful way so that it doesn't get
> affected when your processing pipeline is occasionally slowed down.
> I think the following articles will help [1,2].
>
> In your specific case: the "KeyBy" operation will re-hash data so they can
> be reshuffled from all input consumers to all your process operators (in
> this case the MyKeyedProcessFunction). If one of the process operator is
> backpressured, it will back track all the way to the source.
> So, my understanding is that: since there's the reshuffling, if one of the
> process function is backpressured, it will potentially affect all the
> source operators.
>
> Thanks,
> Rong
>
> [1] https://www.ververica.com/blog/how-flink-handles-backpressure
> [2]
> https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/back_pressure.html
>
> On Wed, Feb 13, 2019 at 8:50 AM Aggarwal, Ajay <ajay.aggar...@netapp.com>
> wrote:
>
>> I was wondering what is the impact if one of the stream operator function
>> occasionally takes too long to process the event.  Given the following
>> simple flink job
>>
>>
>>
>>        inputStream
>>
>>           .KeyBy (“tenantId”)
>>
>>           .process ( new MyKeyedProcessFunction())
>>
>>
>>
>> , if occasionally MyKeyedProcessFunction takes too long (say ~5-10
>> minutes) to process an incoming element, what is the impact on overall
>> pipeline? Is the impact limited to
>>
>>    1. Specific key for which MyKeyedProcessFunction is currently taking
>>    too long to process an element, or
>>    2. Specific Taskslot, where MyKeyedProcessFunction is currently
>>    taking too long to process an element, i.e. impacting multiple keys, or
>>    3. Entire inputstream ?
>>
>>
>>
>> Also what is the built in resiliency in these cases? Is there a concept
>> of timeout for each operator function?
>>
>>
>>
>> Ajay
>>
>

Reply via email to