[ https://issues.apache.org/jira/browse/FLINK-21203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17274388#comment-17274388 ]
wangpeibin commented on FLINK-21203:
------------------------------------

It's my pleasure to work on this. Could you assign it to me? Thanks a lot.

> Don’t collect -U&+U Row When they are equals In the LastRowFunction
> ---------------------------------------------------------------------
>
>                 Key: FLINK-21203
>                 URL: https://issues.apache.org/jira/browse/FLINK-21203
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: wangpeibin
>            Priority: Major
>
> In the LastRowFunction, the -U and +U rows are collected even if they are
> identical, which increases the computational load on the next operator.
>
> To avoid this, we can optimize the logic of DeduplicateFunctionHelper. A
> config option to enable the optimization will also be added.
> With the following SQL:
> {quote}select * from
> (select
> *,
> row_number() over (partition by k order by proctime() desc) as row_num
> from a
> ) t
> where row_num = 1
> {quote}
> Then input 2 rows such as:
> {quote}Event("B","1","b"),
> Event("B","1","b"){quote}
> Currently the output is:
> {quote}(true,+I[B, 1, b, 1])
> (false,-U[B, 1, b, 1])
> (true,+U[B, 1, b, 1])
> {quote}
> After the optimization, the output will be:
> {quote}(true,+I[B, 1, b, 1])
> {quote}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
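
The proposed behaviour can be illustrated with a small, self-contained Python simulation. This is only a sketch, not Flink code: `last_row_changelog`, the `skip_equal_updates` flag, and the dictionary standing in for keyed state are all hypothetical names chosen for illustration, and the `row_num` column from the query is left out for brevity.

```python
from typing import Dict, Hashable, List, Tuple

Row = Tuple[str, ...]
Change = Tuple[str, Row]  # (row kind, row), e.g. ("+I", ("B", "1", "b"))


def last_row_changelog(rows: List[Row], key_index: int = 0,
                       skip_equal_updates: bool = False) -> List[Change]:
    """Simulate keep-last-row deduplication and return the emitted changelog.

    Hypothetical helper: `skip_equal_updates` plays the role of the config
    switch proposed in the issue.
    """
    last_seen: Dict[Hashable, Row] = {}  # stands in for per-key state
    out: List[Change] = []
    for row in rows:
        key = row[key_index]
        previous = last_seen.get(key)
        if previous is None:
            # First row for this key: emit an insert.
            out.append(("+I", row))
        elif skip_equal_updates and previous == row:
            # Nothing changed, so downstream operators need no update.
            continue
        else:
            # Retract the old row, then emit the new one.
            out.append(("-U", previous))
            out.append(("+U", row))
        last_seen[key] = row
    return out


events = [("B", "1", "b"), ("B", "1", "b")]
print(last_row_changelog(events))                           # +I, -U, +U
print(last_row_changelog(events, skip_equal_updates=True))  # +I only
```

With the flag off, the duplicate event produces a -U/+U pair that carries no new information; with it on, only the initial +I is emitted, matching the outputs quoted in the issue.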