Fabian Hueske created FLINK-5226:
------------------------------------

             Summary: Eagerly project unused attributes
                 Key: FLINK-5226
                 URL: https://issues.apache.org/jira/browse/FLINK-5226
             Project: Flink
          Issue Type: Improvement
          Components: Table API & SQL
    Affects Versions: 1.2.0
            Reporter: Fabian Hueske

The optimizer currently does not eagerly remove unused attributes. For example, given a table {{tab5}} with five attributes {{a, b, c, d, e}}, the following query

{code}
SELECT x.a, y.b FROM tab5 AS x, tab5 AS y WHERE x.a = y.a
{code}

would result in the non-optimized plan

{code}
LogicalProject(a=[$0], b=[$6])
  LogicalFilter(condition=[=($0, $5)])
    LogicalJoin(condition=[true], joinType=[inner])
      LogicalTableScan(table=[[tab5]])
      LogicalTableScan(table=[[tab5]])
{code}

and the optimized plan:

{code}
DataSetCalc(select=[a, b0 AS b])
  DataSetJoin(where=[=(a, a0)], join=[a, b, c, d, e, a0, b0, c0, d0, e0], joinType=[InnerJoin])
    DataSetScan(table=[[_DataSetTable_0]])
    DataSetScan(table=[[_DataSetTable_0]])
{code}

This plan is inefficient because the join carries all ten attributes of its two inputs, although only {{x.a}}, {{y.a}}, and {{y.b}} are needed. The seven unused fields ({{x.b, x.c, x.d, x.e, y.c, y.d, y.e}}) should be projected out eagerly, before the join.

Since this is one of the most common optimizations, I would assume that Calcite provides some rules to extract eager projections. If this is the case, the issue can be solved by adding such rules to {{FlinkRuleSets}}.
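
To make the intended result concrete, here is a minimal sketch of the bookkeeping behind eager projection for the example query. It is not Flink or Calcite code; the function {{prune_join_inputs}} and its parameters are hypothetical names chosen for illustration. It assumes that a join input only needs the fields referenced by the final projection or by the join condition.

```python
def prune_join_inputs(left, right, projected, join_keys):
    """Return the fields each join input has to keep.

    left / right are (alias, field_list) pairs; projected and join_keys
    are lists of (alias, field) references. All names are hypothetical.
    """
    # A field is needed if the query output or the join condition uses it.
    needed = set(projected) | set(join_keys)
    left_alias, left_fields = left
    right_alias, right_fields = right
    keep_left = [f for f in left_fields if (left_alias, f) in needed]
    keep_right = [f for f in right_fields if (right_alias, f) in needed]
    return keep_left, keep_right


tab5 = ["a", "b", "c", "d", "e"]

# SELECT x.a, y.b FROM tab5 AS x, tab5 AS y WHERE x.a = y.a
keep_x, keep_y = prune_join_inputs(
    left=("x", tab5),
    right=("y", tab5),
    projected=[("x", "a"), ("y", "b")],
    join_keys=[("x", "a"), ("y", "a")],
)

# Everything not kept can be projected out below the join.
dropped = [f"x.{f}" for f in tab5 if f not in keep_x] + \
          [f"y.{f}" for f in tab5 if f not in keep_y]

print("keep from x:", keep_x)
print("keep from y:", keep_y)
print("project out:", dropped)
```

Under these assumptions the join would only have to carry {{a}} from {{x}} and {{a, b}} from {{y}}, i.e. three attributes instead of ten. An optimizer rule that pushes projections below the join would be expected to produce a plan of that shape.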