Piotr Nowojski created FLINK-37399:
--------------------------------------
Summary: Watermark alignment can prevent backlogged jobs from
using all available resources
Key: FLINK-37399
URL: https://issues.apache.org/jira/browse/FLINK-37399
Project: Flink
Issue Type: Improvement
Components: API / DataStream
Affects Versions: 1.19.2, 1.18.1, 2.0.0
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski
Imagine the following scenario. Max allowed watermark alignment drift is set to
30s and watermark alignment is synced/announced across subtasks every 1s. This
makes perfect sense during normal record processing.
But when processing messages from a backlog, it creates a problem, de facto
capping how quickly watermarks can progress. For example:
* at {{t0}} we announce that the max allowed watermark is
{{max_w0 = min(reported_watermark) + 30s}}
* the next announcement will happen at {{t1 = t0 + 1s}}
* if any source/partition exceeds {{max_w0}} before the next announcement
happens at {{t1}}, it will be blocked by the watermark alignment
In other words, with the above configuration (30s allowed drift announced every
~1s), we are de facto capping the backlog processing speed to:
*30 "event time" seconds per 1 "real world" second*
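The arithmetic above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not Flink's actual implementation: there is a single source (so {{min(reported_watermark)}} is its own watermark), the limit is announced exactly once per interval, and the source stops instantly when it reaches the limit. The class and method names ({{AlignmentCapSketch}}, {{capEventSecondsPerWallSecond}}, {{simulate}}) are hypothetical.

```java
public class AlignmentCapSketch {

    /**
     * Upper bound on event-time progress per unit of wall-clock time:
     * max allowed drift divided by the announcement interval.
     */
    static double capEventSecondsPerWallSecond(long maxDriftMs, long updateIntervalMs) {
        return (double) maxDriftMs / updateIntervalMs;
    }

    /**
     * Toy simulation of one source reading a backlog.
     *
     * @param maxDriftMs        max allowed watermark drift (event-time ms)
     * @param updateIntervalMs  how often the limit is announced (wall-clock ms)
     * @param readRate          event-time ms the source could read per wall-clock ms
     *                          if it were never blocked (assumed constant)
     * @param wallDurationMs    how long to simulate (wall-clock ms)
     * @return the watermark (event-time ms) reached at the end
     */
    static long simulate(long maxDriftMs, long updateIntervalMs, long readRate, long wallDurationMs) {
        long watermark = 0;
        for (long t = 0; t < wallDurationMs; t += updateIntervalMs) {
            // Announced at time t: min(reported_watermark) + max allowed drift.
            long maxAllowed = watermark + maxDriftMs;
            // Where the source would get to by the next announcement if unblocked.
            long unconstrained = watermark + readRate * updateIntervalMs;
            // The source is blocked once it reaches the announced limit.
            watermark = Math.min(unconstrained, maxAllowed);
        }
        return watermark;
    }

    public static void main(String[] args) {
        long driftMs = 30_000;    // 30s max allowed drift
        long intervalMs = 1_000;  // announced every 1s
        System.out.println("cap (event s per wall s): "
                + capEventSecondsPerWallSecond(driftMs, intervalMs));
        // A source able to read 1000x faster than real time, observed for 10s.
        System.out.println("watermark after 10s (ms): "
                + simulate(driftMs, intervalMs, 1_000, 10_000));
    }
}
```

In this model, a source that could read 1000x faster than real time still advances only 300s of event time in 10s of wall-clock time, i.e. 30 event-time seconds per real second, while a source reading at 10x real time stays below the limit and is never blocked.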
--
This message was sent by Atlassian Jira
(v8.20.10#820010)