Yu Xu created CALCITE-7484:
------------------------------

             Summary: Add AggregateFunctionOfGroupByKeysRule to eliminate 
redundant aggregates over GROUP BY keys
                 Key: CALCITE-7484
                 URL: https://issues.apache.org/jira/browse/CALCITE-7484
             Project: Calcite
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.41.0
            Reporter: Yu Xu
            Assignee: Yu Xu
             Fix For: 1.42.0


Consider a SQL query like:
{code:java}
select sal, max(sal) as sal_max, sum(comm) as comm_sum
from emp
group by sal, deptno; {code}
It should be optimized as follows. Because sal is a GROUP BY key, its value is 
constant within each group, so computing max(sal) is redundant:

 
{code:java}
select sal, sal as sal_max, sum(comm) as comm_sum
from emp
group by sal, deptno; {code}
The current plan is:

 
{code:java}
LogicalProject(SAL=[$0], SAL_MAX=[$2], COMM_SUM=[$3])
  LogicalAggregate(group=[{0, 1}], SAL_MAX=[MAX($0)], COMM_SUM=[SUM($2)])
    LogicalProject(SAL=[$5], DEPTNO=[$7], COMM=[$6])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
It would be better to optimize it to:

 
{code:java}
LogicalProject(SAL=[$0], SAL_MAX=[$2], COMM_SUM=[$3])
  LogicalProject(SAL=[$0], DEPTNO=[$1], SAL0=[$0], COMM_SUM=[$2])
    LogicalAggregate(group=[{0, 1}], COMM_SUM=[SUM($2)])
      LogicalProject(SAL=[$5], DEPTNO=[$7], COMM=[$6])
        LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
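
The rewrite above can be sketched roughly as follows. This is a toy model in plain Java, not Calcite's RelNode API: the names GroupKeyAggSketch, AggCall, Rewrite and PASS_THROUGH are hypothetical and exist only for illustration. It assumes a simple GROUP BY (no grouping sets) and aggregate calls without a FILTER clause, and it treats only MIN, MAX and ANY_VALUE as redundant over a group key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of the proposed rewrite. Hypothetical names, not Calcite's API. */
public class GroupKeyAggSketch {
  /** An aggregate call such as MAX($0): function name plus input column index. */
  record AggCall(String func, int arg) {}

  /** Aggregate calls that still have to be computed, plus the projection on top. */
  record Rewrite(List<AggCall> keptCalls, List<Integer> projects) {}

  // Assumption: within one group a key column is constant, so these
  // functions simply return the key value and need not be computed.
  static final Set<String> PASS_THROUGH = Set.of("MIN", "MAX", "ANY_VALUE");

  static Rewrite rewrite(List<Integer> groupKeys, List<AggCall> calls) {
    List<AggCall> kept = new ArrayList<>();
    List<Integer> projects = new ArrayList<>();
    // The aggregate outputs its group keys first, in order.
    for (int i = 0; i < groupKeys.size(); i++) {
      projects.add(i);
    }
    for (AggCall call : calls) {
      int keyPos = groupKeys.indexOf(call.arg());
      if (keyPos >= 0 && PASS_THROUGH.contains(call.func())) {
        // Redundant call: project the group key column instead.
        projects.add(keyPos);
      } else {
        // Still needed: keep the call and point at its new output position.
        kept.add(call);
        projects.add(groupKeys.size() + kept.size() - 1);
      }
    }
    return new Rewrite(kept, projects);
  }

  public static void main(String[] args) {
    // group=[{0, 1}], SAL_MAX=[MAX($0)], COMM_SUM=[SUM($2)]
    Rewrite r = rewrite(
        List.of(0, 1),
        List.of(new AggCall("MAX", 0), new AggCall("SUM", 2)));
    System.out.println(r.keptCalls().size()); // 1 (only SUM($2) remains)
    System.out.println(r.projects());         // [0, 1, 0, 2]
  }
}
```

The resulting projection [0, 1, 0, 2] corresponds to SAL=[$0], DEPTNO=[$1], SAL0=[$0], COMM_SUM=[$2] in the plan above. SUM and COUNT are deliberately left out of PASS_THROUGH, because their result over a group key depends on the number of rows in the group.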
As far as I know, similar optimizations exist in some mainstream databases.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
