[ https://issues.apache.org/jira/browse/FLINK-21203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
wangpeibin updated FLINK-21203:
-------------------------------
    Description:
In LastRowFunction, the -U and +U rows are collected even when they are identical, which increases the processing load on the downstream operator.

To avoid this, we can optimize the logic of DeduplicateFunctionHelper. A config option to enable the optimization will also be added.

With the following SQL:
{quote}select * from
 (select
 *,
 row_number() over (partition by k order by proctime() desc ) as row_num
 from a
 ) t
 where row_num = 1
{quote}
Then input 2 rows such as:
{quote}Event("B","1","b"),
 Event("B","1","b"){quote}
The current output is:
{quote}(true,+I[B, 1, b, 1])
 (false,-U[B, 1, b, 1])
 (true,+U[B, 1, b, 1])
{quote}
After the optimization, the output will be:
{quote}(true,+I[B, 1, b, 1])
{quote}

  was:
In LastRowFunction, the -U and +U rows are collected even when they are identical, which increases the processing load on the downstream operator.

> Don’t collect -U&+U Row When they are equals In the LastRowFunction
> ---------------------------------------------------------------------
>
>                 Key: FLINK-21203
>                 URL: https://issues.apache.org/jira/browse/FLINK-21203
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: wangpeibin
>            Priority: Major
>
> In LastRowFunction, the -U and +U rows are collected even when they are
> identical, which increases the processing load on the downstream operator.
>
> To avoid this, we can optimize the logic of DeduplicateFunctionHelper. A
> config option to enable the optimization will also be added.
> With the following SQL:
> {quote}select * from
>  (select
>  *,
>  row_number() over (partition by k order by proctime() desc ) as row_num
>  from a
>  ) t
>  where row_num = 1
> {quote}
> Then input 2 rows such as:
> {quote}Event("B","1","b"),
>  Event("B","1","b"){quote}
> The current output is:
> {quote}(true,+I[B, 1, b, 1])
>  (false,-U[B, 1, b, 1])
>  (true,+U[B, 1, b, 1])
> {quote}
> After the optimization, the output will be:
> {quote}(true,+I[B, 1, b, 1])
> {quote}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
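
For illustration only, the proposed behaviour can be sketched in plain Java. This is not the actual Flink code: the class LastRowDedupSketch, the skipUnchangedUpdates flag (standing in for the proposed config option), and the use of a HashMap in place of keyed state are all assumptions made for this sketch. Rows are modelled as lists of strings that already include the row_num column.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of keep-last-row deduplication; not Flink's implementation.
final class LastRowDedupSketch {
    // Last row seen per partition key (a stand-in for Flink keyed state).
    private final Map<String, List<String>> lastRowByKey = new HashMap<>();
    // Hypothetical flag modelling the config option proposed in this issue.
    private final boolean skipUnchangedUpdates;

    LastRowDedupSketch(boolean skipUnchangedUpdates) {
        this.skipUnchangedUpdates = skipUnchangedUpdates;
    }

    // Returns the changelog messages emitted for one input row.
    List<String> process(String key, List<String> row) {
        List<String> out = new ArrayList<>();
        // Store the new row and fetch the one it replaces (null if none).
        List<String> previous = lastRowByKey.put(key, row);
        if (previous == null) {
            // First row for this key: emit an insert.
            out.add("+I" + row);
        } else if (!(skipUnchangedUpdates && previous.equals(row))) {
            // Row changed, or the optimization is disabled: retract and update.
            out.add("-U" + previous);
            out.add("+U" + row);
        }
        // Otherwise the row is identical to the previous one: emit nothing.
        return out;
    }
}
```

With the flag enabled, feeding the row [B, 1, b, 1] twice for key "B" yields a single +I message; with the flag disabled, the second call additionally yields the -U/+U pair, matching the two outputs quoted above.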