Backpressure and 99th percentile latency

Felipe Gutierrez Thu, 05 Mar 2020 13:05:23 -0800

Hi,

I am a bit confused about latency tracking in Flink [1]. It says that
latency tracking measures Flink’s network stack, but that application code
latencies can also influence it. For instance, suppose I use
metrics.latency.granularity: operator (the default) and
setLatencyTrackingInterval(10000). I understand that latency is then tracked
every 10 seconds for each physical operator instance. Is that right?
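
For reference, here is a minimal sketch of the setup described above,
assuming the settings live in flink-conf.yaml rather than in code. As far as
I understand, metrics.latency.interval is the configuration-file counterpart
of setLatencyTrackingInterval and is given in milliseconds:

```yaml
# Emit latency markers every 10 seconds (value in milliseconds).
metrics.latency.interval: 10000
# Track latency per operator (the default granularity).
metrics.latency.granularity: operator
```

In my job I set the interval in code instead, via
setLatencyTrackingInterval(10000) on the execution config.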


In my application, I am tracking the latency of all aggregators. Under a
high workload, when I can see backpressure in the Flink UI, the 99th
percentile latencies are 13, 25, 21, and 25 seconds. I then configured my
aggregator to use a larger window. The backpressure goes away, but the 99th
percentile latency stays the same. Why? Are the two unrelated?

In the end, I left the experiment running for more than 2 hours, and only
after about 1.5 hours did the 99th percentile latency drop to milliseconds.
Is that normal? Please see the attached figure.

[1]
https://flink.apache.org/2019/07/23/flink-network-stack-2.html#latency-tracking

Thanks,
Felipe
--
Felipe Gutierrez
skype: felipe.o.gutierrez
https://felipeogutierrez.blogspot.com
