[ https://issues.apache.org/jira/browse/FLINK-21203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276066#comment-17276066 ]
Jark Wu commented on FLINK-21203:
---------------------------------

I think we may not need the configuration. Group Aggregate also has similar logic when state TTL is not enabled. We may need to enable this optimization only when state TTL is disabled, just like the implementation in https://github.com/apache/flink/blob/f3db4220f5c8730e065734cff16237c7743b390f/flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/runtime/operators/aggregate/GroupAggFunction.java#L170

> Don’t collect -U&+U Row When they are equals In the LastRowFunction
> ---------------------------------------------------------------------
>
>                 Key: FLINK-21203
>                 URL: https://issues.apache.org/jira/browse/FLINK-21203
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: wangpeibin
>            Assignee: wangpeibin
>            Priority: Major
>
> In the LastRowFunction, the -U and +U rows are collected even if they are
> the same, which increases the computational load on the next operator.
>
> To avoid this, we can optimize the logic of DeduplicateFunctionHelper. Also,
> a config to enable the optimization will be added.
> With the following SQL:
> {quote}select * from
>  (select
>  *,
>  row_number() over (partition by k order by proctime() desc ) as row_num
>  from a
>  ) t
>  where row_num = 1
> {quote}
> Then input two rows such as:
> {quote}Event("B","1","b"),
>  Event("B","1","b")
> {quote}
> Now the output is:
> {quote}(true,+I[B, 1, b, 1])
>  (false,-U[B, 1, b, 1])
>  (true,+U[B, 1, b, 1])
> {quote}
> After the optimization, the output will be:
> {quote}(true,+I[B, 1, b, 1])
> {quote}
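The rule suggested in the comment above (skip the -U/+U pair only when the row is unchanged and state TTL is disabled) can be sketched in plain Java. This is an illustrative stand-in, not Flink's actual code: the class `LastRowSketch`, the `stateTtlEnabled` flag, and the use of strings for rows and change messages are all assumptions made for the example. A plain `HashMap` stands in for keyed state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of keep-last-row deduplication; not Flink's real classes. */
class LastRowSketch {
    // Assumed flag standing in for "state TTL is enabled" in the job config.
    private final boolean stateTtlEnabled;
    // Last row seen per key; a stand-in for Flink keyed state.
    private final Map<String, String> lastRowByKey = new HashMap<>();

    LastRowSketch(boolean stateTtlEnabled) {
        this.stateTtlEnabled = stateTtlEnabled;
    }

    /** Returns the change messages emitted for one input row. */
    List<String> process(String key, String row) {
        List<String> out = new ArrayList<>();
        // Store the new row and fetch the previous one (null if this is the first).
        String prev = lastRowByKey.put(key, row);
        if (prev == null) {
            out.add("+I[" + row + "]");
            return out;
        }
        // Skip the update pair only when nothing changed AND state TTL is off.
        if (!stateTtlEnabled && prev.equals(row)) {
            return out;
        }
        out.add("-U[" + prev + "]");
        out.add("+U[" + row + "]");
        return out;
    }

    public static void main(String[] args) {
        LastRowSketch dedup = new LastRowSketch(false);
        System.out.println(dedup.process("B", "B, 1, b, 1"));
        System.out.println(dedup.process("B", "B, 1, b, 1"));
    }
}
```

With the flag set to `false`, the second identical row produces no messages, matching the "after the optimization" output in the issue. With the flag set to `true`, the sketch still emits the -U/+U pair. This mirrors the condition in the linked GroupAggFunction, where, as far as I understand it, the messages are kept so that downstream operators do not expire their state too early.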