[ https://issues.apache.org/jira/browse/FLINK-21203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276828#comment-17276828 ]
Jark Wu commented on FLINK-21203:
---------------------------------

We should be cautious about introducing any public API, because once it is released, it is very hard to retract. I would suggest not introducing this option for now, until we have received concrete feedback from users.

> Don’t collect -U&+U Row When they are equals In the LastRowFunction
> ---------------------------------------------------------------------
>
>                 Key: FLINK-21203
>                 URL: https://issues.apache.org/jira/browse/FLINK-21203
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: wangpeibin
>            Assignee: wangpeibin
>            Priority: Major
>
> In the LastRowFunction, the -U and +U rows are collected even if they are
> the same, which increases the computation load on the next operator.
>
> To avoid this, we can optimize the logic of DeduplicateFunctionHelper. Also,
> a config to enable the optimization will be added.
> With the following SQL:
> {quote}select * from
> (select
> *,
> row_number() over (partition by k order by proctime() desc ) as row_num
> from a
> ) t
> where row_num = 1
> {quote}
> Then input 2 rows such as:
> {quote}Event("B","1","b"),
> Event("B","1","b")
> {quote}
> Now the output is:
> {quote}(true,+I[B, 1, b, 1])
> (false,-U[B, 1, b, 1])
> (true,+U[B, 1, b, 1])
> {quote}
> After the optimization, the output will be:
> {quote}(true,+I[B, 1, b, 1])
> {quote}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
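
For illustration only, the behaviour proposed in the issue can be sketched in plain Java as follows. This is not the actual DeduplicateFunctionHelper code: the class name LastRowDedupSketch, the process method, and the HashMap standing in for Flink keyed state are all hypothetical, and the row_num column from the example output is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of last-row deduplication (not Flink's implementation). */
final class LastRowDedupSketch {
    // Last row seen per key; a plain map standing in for Flink keyed state (assumption).
    private final Map<String, List<String>> lastRowByKey = new HashMap<>();

    /** Processes one input row and returns the changelog messages to emit downstream. */
    List<String> process(String key, List<String> row) {
        List<String> out = new ArrayList<>();
        // Always remember the newest row as the last row for this key.
        List<String> previous = lastRowByKey.put(key, row);
        if (previous == null) {
            // First row for this key: emit an insert.
            out.add("+I" + row);
        } else if (!previous.equals(row)) {
            // Row content changed: retract the old row, then emit the new one.
            out.add("-U" + previous);
            out.add("+U" + row);
        }
        // If previous equals row, nothing is emitted: this is the proposed optimization.
        return out;
    }
}
```

Feeding the two identical events from the issue into this sketch yields a single +I message for the first event and an empty result for the second, which mirrors the "after the optimization" output quoted above (minus row_num).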