Greetings,

When playing around with the following simple event-time stream aggregation:

    SELECT
      TUMBLE_START(event_time, INTERVAL '1' DAY) as event_time,
      ...
    FROM input
    GROUP BY TUMBLE(event_time, INTERVAL '1' DAY), symbol

...to my surprise, I found out that the tumbling window operator has no effect on the watermarks of the resulting append stream: the watermarks of the input stream are propagated as-is.

This seems to be documented behavior:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#interaction-of-watermarks-and-windows
but it's still very counter-intuitive to me, and I couldn't find any explanation for it.

My understanding of watermarking is that once the watermark reaches time T, MOST data with event time below T is expected to have arrived already. Events that show up after that are late, and are either discarded or handled as exceptional cases, e.g. via "allowed lateness".

So in my aggregation above, I was expecting the result watermark to be offset by ~1 day from the input and to be emitted only after a tumbling window closes. Instead, with input watermarks propagated as-is, ALL events in the resulting stream end up being late in relation to the current watermark...

Doesn't this behavior break composition, since downstream operators will be discarding all of that late data?

I'd greatly appreciate it if someone could shed light on this design decision.

Thanks,
Sergii
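
To make the concern concrete, here is a minimal sketch in plain Python (not Flink code) of the situation described above. It is a toy model under stated assumptions: the watermark is simply the largest event time seen so far, a window fires once the watermark reaches its end, and each result row is stamped with the window start, as TUMBLE_START does in the query. The names `window_start` and `simulate` are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta, timezone

DAY = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts, size=DAY):
    """Start of the epoch-aligned tumbling window that contains ts."""
    return ts - (ts - EPOCH) % size


def simulate(event_times, size=DAY):
    """Return (row_time, watermark) for every window that fires.

    Assumptions of this toy model (not Flink's implementation):
    - the watermark is the largest event time seen so far;
    - a window fires once the watermark reaches its end;
    - the result row is stamped with the window start;
    - the watermark is passed downstream unchanged;
    - events for an already closed window are dropped.
    """
    open_windows = set()
    watermark = None
    results = []
    for ts in event_times:
        start = window_start(ts, size)
        if watermark is not None and start + size <= watermark:
            continue  # late event: its window has already fired
        open_windows.add(start)
        watermark = ts if watermark is None else max(watermark, ts)
        # Fire every window whose end the watermark has reached.
        for w in sorted(w for w in open_windows if w + size <= watermark):
            open_windows.remove(w)
            results.append((w, watermark))
    return results


if __name__ == "__main__":
    utc = timezone.utc
    events = [
        datetime(2020, 1, 1, 9, 0, tzinfo=utc),
        datetime(2020, 1, 1, 17, 0, tzinfo=utc),
        datetime(2020, 1, 2, 10, 0, tzinfo=utc),
        datetime(2020, 1, 3, 0, 30, tzinfo=utc),
    ]
    for row_time, wm in simulate(events):
        # Every row is at least one window size behind the watermark.
        print(row_time.isoformat(), "behind watermark by", wm - row_time)
```

In this model every emitted row trails the forwarded watermark by at least the window size, which is exactly the lateness I am asking about. Whether a downstream Flink operator actually sees these rows as late depends on which timestamp Flink attaches to them, and the sketch does not model that.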