Piotr Nowojski created FLINK-37399:
--------------------------------------

             Summary: Watermark alignment can prevent backlogged jobs from using all available resources
                 Key: FLINK-37399
                 URL: https://issues.apache.org/jira/browse/FLINK-37399
             Project: Flink
          Issue Type: Improvement
          Components: API / DataStream
    Affects Versions: 1.19.2, 1.18.1, 2.0.0
            Reporter: Piotr Nowojski
            Assignee: Piotr Nowojski

Imagine the following scenario. The maximum allowed watermark alignment drift is set to 30s, and watermark alignment is synced/announced across subtasks every 1s. This makes perfect sense during normal record processing. But when processing messages from a backlog, it creates a problem, de facto capping how quickly watermarks can progress. For example:
* at {{t0}} we announce that the max allowed watermark is {{max_w0 = min(reported_watermark) + 30s}}
* the next announcement will happen at {{t1 = t0 + 1s}}
* if any source/partition exceeds {{max_w0}} before the next announcement happens at {{t1}}, it will be blocked by the watermark alignment

In other words, with the above configuration (30s allowed drift announced every ~1s), we are de facto capping the backlog processing speed to:

*30 "event time" seconds per each 1 "real world" second*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
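
To make the cap described in this issue concrete, here is a toy simulation. It is only a sketch and not Flink code: the function {{drain_time_s}}, its parameters, and the assumed unthrottled read speed of 1000 event-time seconds per wall-clock second are all hypothetical. It models a single source whose watermark cannot pass the limit announced at the start of each interval.

```python
def drain_time_s(backlog_s, max_drift_s=30.0, announce_interval_s=1.0,
                 unthrottled_rate=1000.0):
    """Wall-clock seconds needed to advance the watermark by `backlog_s`.

    Toy model (an assumption, not Flink's implementation): one source that
    could advance `unthrottled_rate` event-time seconds per wall-clock
    second, but is blocked once it reaches the max allowed watermark
    announced at the start of the current interval.
    """
    watermark = 0.0
    wall_clock = 0.0
    while watermark < backlog_s:
        # Announcement at the start of the interval: min watermark + drift.
        max_allowed = watermark + max_drift_s
        # How far the source could get in this interval if never blocked.
        reachable = watermark + unthrottled_rate * announce_interval_s
        # The source stops at whichever limit it hits first.
        watermark = min(reachable, max_allowed, backlog_s)
        wall_clock += announce_interval_s
    return wall_clock


if __name__ == "__main__":
    day = 24 * 60 * 60  # 24h of event time in the backlog
    print("30s drift, 1s announcements:", drain_time_s(day), "s")
    print("effectively no drift limit: ", drain_time_s(day, max_drift_s=1e9), "s")
```

Under these assumptions, a 24h backlog takes 2880 wall-clock seconds (48 minutes) with the 30s/1s configuration, because the watermark advances at most 30 event-time seconds per announcement. With the drift limit effectively disabled, the same hypothetical source finishes in 87 seconds.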