[jira] [Commented] (FLINK-5266) Eagerly project unused fields when selecting aggregation fields

ASF GitHub Bot (JIRA) Wed, 07 Dec 2016 18:48:26 -0800

    [ https://issues.apache.org/jira/browse/FLINK-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730904#comment-15730904 ]


ASF GitHub Bot commented on FLINK-5266:
---------------------------------------

GitHub user KurtYoung opened a pull request:

    https://github.com/apache/flink/pull/2961

    [FLINK-5266] [table] eagerly project unused fields when selecting 
aggregation fields

    This PR is based on #2926; only the second commit is related.
    
    I added a "plan" test directory to hold all the plan-level tests. I also 
did a small refactoring of ProjectionTranslator, since I think it is better for 
each method to do only one thing.
    
    @fhueske As we discussed earlier in the JIRA issue 
(https://issues.apache.org/jira/browse/FLINK-5266) about where the logic should 
be added: I decided to add it where we select fields from a normal or 
grouped table. This logic involves rewriting some field references, so if we 
instead added the needed projection node when converting the LogicalPlan to 
Calcite's RelNode, we would also have to handle the whole rewrite ourselves.
    
    However, if we add the project node up front, we only need to 
extract all the field references used in the select expressions and treat 
them as UnresolvedFieldReferences. The validation phase will then take care of 
the rewrite. I think this is easier and more consistent with the other 
procedures. (I noticed that all the "construct" logic is fairly simple.)
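
    The field-extraction step described above can be sketched roughly as 
follows. This is only an illustration under stated assumptions: the expression 
classes (FieldRef, Sum, Max, Alias) and the helper names (fieldRefs, 
eagerProjection) are hypothetical stand-ins and are not the actual Flink Table 
API classes or the code in this PR. Keeping the grouping keys in the projection 
is likewise an assumption about what a grouped table would need.

```scala
// Hypothetical, simplified expression tree; NOT Flink's real classes.
object EagerProjectionSketch {
  sealed trait Expr
  case class FieldRef(name: String) extends Expr
  case class Sum(child: Expr) extends Expr
  case class Max(child: Expr) extends Expr
  case class Alias(child: Expr, name: String) extends Expr

  // Collect every field name referenced by an expression, in first-seen order.
  def fieldRefs(e: Expr): Seq[String] = e match {
    case FieldRef(n) => Seq(n)
    case Sum(c)      => fieldRefs(c)
    case Max(c)      => fieldRefs(c)
    case Alias(c, _) => fieldRefs(c)
  }

  // Fields the eager projection has to keep: the grouping keys (assumed to be
  // required for a grouped table) plus every field used in the select list.
  def eagerProjection(groupKeys: Seq[String], select: Seq[Expr]): Seq[String] =
    (groupKeys ++ select.flatMap(fieldRefs)).distinct
}
```

    For the query in the issue, table.select('a.sum as 's, 'a.max), calling 
eagerProjection(Nil, Seq(Alias(Sum(FieldRef("a")), "s"), Max(FieldRef("a")))) 
yields only the field a, which is what the project node inserted below the 
aggregation would keep.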

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/KurtYoung/flink flink-5266

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2961.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2961
    
----
commit 374d231d44f84ae385d8f8adb2353685e1214ff6
Author: Kurt Young <ykt...@gmail.com>
Date:   2016-12-08T01:27:55Z

    [FLINK-5226] [table] Use correct DataSetCostFactory and improve DataSetCalc 
costs.

commit 8a3ecf8e6362acd9370b11f08018a0143fc9be18
Author: Kurt Young <ykt...@gmail.com>
Date:   2016-12-08T02:35:43Z

    [FLINK-5266] [table] eagerly project unused fields when selecting 
aggregation fields
    
    Add a "plan" test dir to hold all the plan level unit tests
    
    Small refactory with ProjectionTranslator, keep each method handle one 
single thing

----


> Eagerly project unused fields when selecting aggregation fields
> ---------------------------------------------------------------
>
>                 Key: FLINK-5266
>                 URL: https://issues.apache.org/jira/browse/FLINK-5266
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>            Reporter: Kurt Young
>            Assignee: Kurt Young
>
> When we call a table's {{select}} method and it contains aggregations, 
> the fields are projected only after the aggregation. It would be better to 
> project away unused fields before the aggregation, which also leaves the 
> opportunity to push the projection into the scan.
> For example, the current logical plan of a simple query:
> {code}
> table.select('a.sum as 's, 'a.max)
> {code}
> is
> {code}
> LogicalProject(s=[$0], TMP_2=[$1])
>   LogicalAggregate(group=[{}], TMP_0=[SUM($5)], TMP_1=[MAX($5)])
>     LogicalTableScan(table=[[supplier]])
> {code}
> It would be better to project away unused fields right after the scan, so 
> that the plan looks like this:
> {code}
> LogicalProject(s=[$0], EXPR$1=[$0])
>   LogicalAggregate(group=[{}], EXPR$1=[SUM($0)])
>     LogicalProject(a=[$5])
>       LogicalTableScan(table=[[supplier]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
