[ https://issues.apache.org/jira/browse/FLINK-35826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17866220#comment-17866220 ]
xuyang commented on FLINK-35826: -------------------------------- I believe this is a more general problem, that is "how does an operator determine if a piece of data should be considered as late data". Currently, we determine it on an operator level by comparing with the minimum of multiple input watermarks, but in a distributed system, the watermarks from multiple upstream operators or multiple parallelism of one upstream operator might be delayed. One speculative possibility: could the data discarding be unified in WatermarkAssigner? > [SQL] Sliding window may produce unstable calculations when processing > changelog data. > -------------------------------------------------------------------------------------- > > Key: FLINK-35826 > URL: https://issues.apache.org/jira/browse/FLINK-35826 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner > Affects Versions: 1.20.0 > Environment: flink with release-1.20 > Reporter: Yuan Kui > Assignee: xuyang > Priority: Major > Attachments: image-2024-07-12-14-27-58-061.png > > > Calculation results may be unstable when using a sliding window to process > changelog data. Repeat the execution 10 times, the test results are partial > success and partial failure: > !image-2024-07-12-14-27-58-061.png! > See the documentation and code for more details. > [https://docs.google.com/document/d/1JmwSLs4SJvZKe7kqALqVBZ-1F1OyPmiWw8J6Ug6vqW0/edit?usp=sharing] > code: > [[BUG] Reproduce the issue of unstable sliding window calculation results · > yuchengxin/flink@c003e45 > (github.com)|https://github.com/yuchengxin/flink/commit/c003e45082e0d1464111c286ac9c7abb79527492] -- This message was sent by Atlassian Jira (v8.20.10#820010)