[ 
https://issues.apache.org/jira/browse/FLINK-21203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangpeibin updated FLINK-21203:
-------------------------------
    Description: 
In the LastRowFunction, the -U and +U rows are collected even when they are
identical, which adds unnecessary computation load on the downstream operator.

To avoid this, we can optimize the logic of DeduplicateFunctionHelper. A
config option to enable the optimization will also be added.

With the following SQL:
{quote}select * from
 (select
 *,
 row_number() over (partition by k order by proctime() desc ) as row_num
 from a
 ) t
 where row_num = 1
{quote}
Then input two rows such as:
{quote}Event("B","1","b"),
 Event("B","1","b")
{quote}
Now the output is:
{quote}(true,+I[B, 1, b, 1])
 (false,-U[B, 1, b, 1])
 (true,+U[B, 1, b, 1])
{quote}
After the optimization, the output will be:
{quote}(true,+I[B, 1, b, 1])
{quote}
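
The proposed behaviour can be illustrated with a minimal, self-contained Java sketch. It is not the Flink implementation: the class LastRowDedupSketch, the flag skipUnchangedUpdates (standing in for the proposed config option), and the choice of "B" as the partition key are all assumptions made for illustration. State is modelled as a plain map and changelog messages as strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for keep-last-row deduplication; not Flink's actual code. */
public class LastRowDedupSketch {
    // Last row seen per key; plays the role of keyed state.
    private final Map<String, List<String>> lastRowByKey = new HashMap<>();
    // Hypothetical switch standing in for the proposed config option.
    private final boolean skipUnchangedUpdates;

    public LastRowDedupSketch(boolean skipUnchangedUpdates) {
        this.skipUnchangedUpdates = skipUnchangedUpdates;
    }

    /** Processes one row and returns the changelog messages it produces. */
    public List<String> process(String key, List<String> row) {
        List<String> out = new ArrayList<>();
        List<String> previous = lastRowByKey.get(key);
        if (previous == null) {
            // First row for this key: plain insert.
            out.add("+I" + row);
        } else if (skipUnchangedUpdates && previous.equals(row)) {
            // Identical content: emit nothing, state stays as it is.
            return out;
        } else {
            // Retract the old row, then emit the new one.
            out.add("-U" + previous);
            out.add("+U" + row);
        }
        lastRowByKey.put(key, row);
        return out;
    }

    public static void main(String[] args) {
        // Row content mirrors the example above, including row_num = 1.
        List<String> row = Arrays.asList("B", "1", "b", "1");
        LastRowDedupSketch dedup = new LastRowDedupSketch(true);
        List<String> out = new ArrayList<>(dedup.process("B", row));
        out.addAll(dedup.process("B", row));
        System.out.println(out); // prints [+I[B, 1, b, 1]]
    }
}
```

Constructing the sketch with the flag set to false reproduces the current three-message output (+I, -U, +U) for the same two input rows.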
 

 


>  Don’t collect -U&+U Row When they are equals In the LastRowFunction 
> ---------------------------------------------------------------------
>
>                 Key: FLINK-21203
>                 URL: https://issues.apache.org/jira/browse/FLINK-21203
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Runtime
>            Reporter: wangpeibin
>            Priority: Major
>
> In the LastRowFunction, the -U and +U rows are collected even when they are
> identical, which adds unnecessary computation load on the downstream operator.
>  
> To avoid this, we can optimize the logic of DeduplicateFunctionHelper. A
> config option to enable the optimization will also be added.
> With the following SQL:
> {quote}select * from
>  (select
>  *,
>  row_number() over (partition by k order by proctime() desc ) as row_num
>  from a
>  ) t
>  where row_num = 1
> {quote}
> Then input two rows such as:
> {quote}Event("B","1","b"),
>  Event("B","1","b")
> {quote}
> Now the output is:
> {quote}(true,+I[B, 1, b, 1])
>  (false,-U[B, 1, b, 1])
>  (true,+U[B, 1, b, 1])
> {quote}
> After the optimization, the output will be:
> {quote}(true,+I[B, 1, b, 1])
> {quote}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
