[ https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771321#comment-16771321 ]
Rong Rong commented on FLINK-7001:
----------------------------------

Thanks [~jark] for the response. Yes, the Blink implementation of the WindowOperator is very similar to the first POC in the original documentation (currently moved to the Appendix: https://docs.google.com/document/d/1ziVsuW_HQnvJr_4a9yKwx_LEnhVkdlde2Z5l6sx5HlY/edit?ts=5c6a613e#).

We also found that in some scenarios the performance was actually worse, and we could not find a general one-size-fits-all solution for the sliding window operator. Thus we proposed the 2nd POC, which splits the operator into slicing and merging state.

Also, thanks for the great comments in the doc. I will follow up on the doc comments and on the discussions in the mailing list, as well as in the new JIRA ticket https://issues.apache.org/jira/browse/FLINK-11276, which has a broader scope than just addressing the performance issue in sliding windows.

Thanks - Rong

> Improve performance of Sliding Time Window with pane optimization
> -----------------------------------------------------------------
>
>                 Key: FLINK-7001
>                 URL: https://issues.apache.org/jira/browse/FLINK-7001
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataStream API
>            Reporter: Jark Wu
>            Assignee: Jark Wu
>            Priority: Major
>
> Currently, the implementation of time-based sliding windows treats each
> window individually and replicates records to each window. For a window of
> 10 minutes that slides by 1 second, the data is replicated 600-fold
> (10 minutes / 1 second). We can optimize sliding windows by dividing them
> into panes (aligned with the slide), so that we can avoid record duplication
> and leverage the checkpoint.
> I will attach a more detailed design doc to the issue.
> The following issues are similar to this issue: FLINK-5387, FLINK-6990

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
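
For illustration only, the pane idea from the issue description (store each record once in a slide-aligned pane, then merge panes per window) can be sketched in plain Python. This is not Flink code and not the design of either POC mentioned in the comment; `replication_factor`, `pane_start`, and `PaneSum` are hypothetical names, a sum is assumed as the aggregate, and the window size is assumed to be a multiple of the slide.

```python
from collections import defaultdict


def replication_factor(size_ms: int, slide_ms: int) -> int:
    """Copies of each record when every sliding window keeps its own records."""
    return size_ms // slide_ms


def pane_start(timestamp_ms: int, slide_ms: int) -> int:
    """Start of the slide-aligned pane that contains the timestamp."""
    return timestamp_ms - (timestamp_ms % slide_ms)


class PaneSum:
    """Hypothetical pane-based sliding sum: one partial sum per pane."""

    def __init__(self, size_ms: int, slide_ms: int) -> None:
        if size_ms % slide_ms != 0:
            raise ValueError("this sketch assumes size is a multiple of slide")
        self.size_ms = size_ms
        self.slide_ms = slide_ms
        self.panes = defaultdict(int)  # pane start -> partial sum

    def add(self, timestamp_ms: int, value: int) -> None:
        # Each record updates exactly one pane, so nothing is replicated.
        self.panes[pane_start(timestamp_ms, self.slide_ms)] += value

    def window_sum(self, window_start_ms: int) -> int:
        # A window result merges the size / slide panes that it covers.
        starts = range(window_start_ms,
                       window_start_ms + self.size_ms,
                       self.slide_ms)
        return sum(self.panes.get(start, 0) for start in starts)


# The 10-minute window sliding by 1 second from the description.
factor = replication_factor(10 * 60 * 1000, 1000)

# A small 3-second window sliding by 1 second.
agg = PaneSum(size_ms=3000, slide_ms=1000)
agg.add(500, 1)
agg.add(1500, 2)
agg.add(3200, 4)
first_window = agg.window_sum(0)      # covers panes starting at 0, 1000, 2000
second_window = agg.window_sum(1000)  # covers panes starting at 1000, 2000, 3000
```

In this sketch each record touches a single pane, while each window evaluation reads size / slide panes, so the cost moves from the write path to the merge step. That trade-off is one reason a pane layout may not help in every scenario.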