[ https://issues.apache.org/jira/browse/FLINK-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265213#comment-16265213 ]
ASF GitHub Bot commented on FLINK-8144: --------------------------------------- Github user dianfu commented on a diff in the pull request: https://github.com/apache/flink/pull/5063#discussion_r152955709 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/aggregate/RowTimeUnboundedOver.scala --- @@ -116,7 +116,7 @@ abstract class RowTimeUnboundedOver( // discard late record if (timestamp > curWatermark) { // ensure every key just registers one timer - ctx.timerService.registerEventTimeTimer(curWatermark + 1) + ctx.timerService.registerEventTimeTimer(timestamp) --- End diff -- @fhueske Thanks a lot for your comments. Your concern makes sense to me. I think the current implementation is ok under periodic watermark. But I'm not sure if it's optimal under punctuated watermark. We will perform some performance test for unbounded over under punctuated watermark and share the results. > Optimize the timer logic in RowTimeUnboundedOver > ------------------------------------------------ > > Key: FLINK-8144 > URL: https://issues.apache.org/jira/browse/FLINK-8144 > Project: Flink > Issue Type: Bug > Components: Table API & SQL > Reporter: Dian Fu > Assignee: Dian Fu > Fix For: 1.5.0 > > > Currently the logic of {{RowTimeUnboundedOver}} is as follows: > 1) When element comes, buffer it in MapState and and register a timer at > {{current watermark + 1}} > 2) When event timer triggered, scan the MapState and find the elements below > the current watermark and process it. If there are remaining elements to > process, register a new timer at {{current watermark + 1}}. > Let's assume that watermark comes about 5 seconds later than the event on > average, then we will scan about 5000 times the MapState before actually > processing the events. -- This message was sent by Atlassian JIRA (v6.4.14#64029)