[jira] [Commented] (FLINK-23555) Improve common subexpression elimination by using local references

Enze Liu (Jira) Mon, 30 Aug 2021 00:58:05 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-23555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406607#comment-17406607
 ]


Enze Liu commented on FLINK-23555:
----------------------------------

The description has been updated. Please check it again. Thanks.

Our approach is to keep a record of the local references and to implement 
{{visitLocalRef}} in {{ExprCodeGenerator}}.

We can submit a PR if needed.
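
The idea of recording local references can be illustrated with a minimal sketch. It is not the proposed patch and not Flink's actual {{ExprCodeGenerator}}: every class and method below ({{LocalRefCseSketch}}, {{Evaluator}}, {{countUdf1Calls}}, and so on) is a hypothetical stand-in, and the sketch interprets a tiny expression tree instead of generating code. It assumes that the shared subexpression has already been extracted into a list of locals and that the functions are deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these classes are Flink or Calcite APIs.
class LocalRefCseSketch {
    interface Expr {}

    // Reads an input field of the current row.
    static final class Field implements Expr {
        final String name;
        Field(String name) { this.name = name; }
    }

    // Calls a named function on its operands.
    static final class Call implements Expr {
        final String fn;
        final List<Expr> args;
        Call(String fn, Expr... args) { this.fn = fn; this.args = Arrays.asList(args); }
    }

    // Points at entry `index` of a shared list of common subexpressions.
    static final class LocalRef implements Expr {
        final int index;
        LocalRef(int index) { this.index = index; }
    }

    static final class Evaluator {
        final List<Expr> locals;                                // shared subexpressions
        final Map<Integer, Object> computed = new HashMap<>();  // record of locals already evaluated for this row
        int udf1Calls = 0;                                      // counts invocations of the expensive function

        Evaluator(List<Expr> locals) { this.locals = locals; }

        Object eval(Expr e, Map<String, Object> row) {
            if (e instanceof Field) return row.get(((Field) e).name);
            if (e instanceof LocalRef) return visitLocalRef((LocalRef) e, row);
            Call c = (Call) e;
            List<Object> args = new ArrayList<>();
            for (Expr a : c.args) args.add(eval(a, row));
            if (c.fn.equals("udf1")) udf1Calls++;
            return c.fn + args;  // stand-in for actually running the function
        }

        // Evaluates the referenced expression on first use and reuses the result afterwards.
        Object visitLocalRef(LocalRef ref, Map<String, Object> row) {
            if (!computed.containsKey(ref.index)) {
                computed.put(ref.index, eval(locals.get(ref.index), row));
            }
            return computed.get(ref.index);
        }
    }

    // Evaluates udf2(udf1(field)) and udf3(udf1(field)) for one row
    // and returns how many times udf1 was invoked.
    static int countUdf1Calls(boolean useLocalRefs) {
        Expr udf1 = new Call("udf1", new Field("field"));
        List<Expr> locals = new ArrayList<>();
        Expr shared = udf1;
        if (useLocalRefs) {
            locals.add(udf1);          // register the common subexpression once
            shared = new LocalRef(0);  // and refer to it by index everywhere else
        }
        Evaluator ev = new Evaluator(locals);
        Map<String, Object> row = new HashMap<>();
        row.put("field", "x");
        ev.eval(new Call("udf2", shared), row);
        ev.eval(new Call("udf3", shared), row);
        return ev.udf1Calls;
    }

    public static void main(String[] args) {
        System.out.println("udf1 calls, inlined:    " + countUdf1Calls(false));
        System.out.println("udf1 calls, local refs: " + countUdf1Calls(true));
    }
}
```

In this sketch the per-row record of computed locals would have to be cleared before the next row is processed. A code generator would more likely emit one local variable per shared expression instead of a map lookup.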

> Improve common subexpression elimination by using local references
> ------------------------------------------------------------------
>
>                 Key: FLINK-23555
>                 URL: https://issues.apache.org/jira/browse/FLINK-23555
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>            Reporter: weibowen
>            Priority: Major
>             Fix For: 1.14.0
>
>         Attachments: performance_after_optimization.png, 
> performance_before_optimization.png, sql.png, udf.png
>
>
> When we write a SQL query like 
> {code:java}
> select udf2(udf1(field)), udf3(udf1(field)) ...{code}
> udf1(field) will be invoked twice. If udf1 performs poorly, this has a huge 
> impact on the whole task, and the impact grows with every additional 
> invocation.
> I hope that no matter how many times udf1(field) is written in the SQL, Flink 
> will take advantage of common subexpression elimination and invoke it only once.
> I did some work on this, and the attachments show the result.
>  
> sql.png shows the SQL logic:
> !sql.png!
> I read the source from Kafka and sink into blackhole. The parallelism is 1.
> The UDF `testcse` does nothing except sleep for 20 milliseconds, while the UDFs 
> `testcse2`, `testcse3` and `testcse4` are the same UDF under different aliases 
> and do nothing at all.
> !udf.png!
> As expected, the performance after optimization is approximately 3 times what 
> it was before, since I wrote `testcse(sid)` 3 times in the SQL.
> before:
> !performance_before_optimization.png!
> after:
> !performance_after_optimization.png!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
