[ https://issues.apache.org/jira/browse/FLINK-18996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703468#comment-17703468 ]
Chalres Tan commented on FLINK-18996:
-------------------------------------

+1 [~zicat]. I was pointed to [this design doc|http://goo.gl/VW5Gpd] and https://issues.apache.org/jira/browse/FLINK-6233. The design doc says: "Considering that, performing cache cleaning too frequently may affect efficiency. We add a default delay to postpone this process, i.e., {_}minCleanUpInterval = (LSize + RSize) / 2{_}."

It seems the minCleanUpInterval exists to limit how often cleanup runs, presumably to save compute. I agree with you that delaying cleanup causes issues for downstream operators, and that the default should be minCleanUpInterval = 0. If we cannot remove minCleanUpInterval or change its default to 0, perhaps we can expose an option that lets users override it.

> Avoid disorder for time interval join
> -------------------------------------
>
>                 Key: FLINK-18996
>                 URL: https://issues.apache.org/jira/browse/FLINK-18996
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: Benchao Li
>            Priority: Major
>              Labels: auto-deprioritized-critical, auto-deprioritized-major
>             Fix For: 1.17.0
>
>
> Currently, the time interval join will produce data with rowtime later than
> the watermark. If we use the rowtime again downstream, e.g. in a window
> aggregation, we'll lose some data.
>
> reported from user-zh:
> [http://apache-flink.147419.n8.nabble.com/Re-flink-interval-join-tc4458.html#none]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)