Hi,

I am a bit confused about latency tracking in Flink [1]. The post says that latency tracking measures Flink's network stack, but that application code latencies can also influence it. For instance, I am using metrics.latency.granularity: operator (the default) and setLatencyTrackingInterval(10000). My understanding is that latency is then tracked every 10 seconds for each physical operator instance. Is that right?
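To make the setup concrete, this is roughly what I believe the equivalent flink-conf.yaml entries would be. It is only a sketch: I am assuming that metrics.latency.interval is the configuration counterpart of setLatencyTrackingInterval, with the value given in milliseconds.

```yaml
# Emit latency markers from the sources every 10 seconds (value in ms).
# Assumed to correspond to setLatencyTrackingInterval(10000) in the job code.
metrics.latency.interval: 10000
# Granularity of the latency histograms; "operator" is the default.
metrics.latency.granularity: operator
```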
In my application, I am tracking the latency of all aggregators. Under a high workload, when the Flink UI shows backpressure, the 99th percentile latencies are 13, 25, 21, and 25 seconds. I then configured my aggregator to use a larger window. The backpressure went away, but the 99th percentile latency stayed the same. Why? Are the two unrelated? In the end I left the experiment running for more than 2 hours, and only after about 1.5 hours did the 99th percentile latency drop to milliseconds. Is that normal? Please see the attached figure.

[1] https://flink.apache.org/2019/07/23/flink-network-stack-2.html#latency-tracking

Thanks,
Felipe

--
Felipe Gutierrez
skype: felipe.o.gutierrez
https://felipeogutierrez.blogspot.com