[ 
https://issues.apache.org/jira/browse/CALCITE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17956178#comment-17956178
 ] 

Ian Bertolacci commented on CALCITE-5568:
-----------------------------------------

Is there any plan on fixing this? 
Our runtime requires that all Correlate nodes have been decorrelated, and we 
are encountering this in some trivial cases.

Specifically, we are encountering it when a statically false filter causes the 
RelBuilder to eliminate the left side of the correlation.
For example:
{code:sql}
select (select sum(C1) from Test as TestInner1 where TestInner1.id= Test.id) 
from Touter 
where (select sum(C1) from Test as TestInner2 where TestInner2.id= Test.id) > 0 
and false
{code}
The values node comes from FILTER_SUB_QUERY_TO_CORRELATE, which rebuilds a 
filter having the subquery and the false expression, causing the RelBuilder to 
invoke `empty()` for the construction of that side

I was able to replicate this with a RelBuilder with a number of approaches:
Exactly how it would be constructed normally:
{code:scala}
b.scan("Test")
  .variable { (cor0: RexCorrelVariable) =>
    b.scan("Test")
      .filter(b.call(COps.EQUALS, b.field("ID"), b.dot(cor0, "ID")))
      .aggregate(b.groupKey(), b.sum(b.field("C1")).as("_sum"))
      .rename(ImmutableList.of("_sum"))
      .correlate(JoinRelType.LEFT, cor0.id, b.field(2, 0, "ID"))
      .filter(b.call(COps.GREATER_THAN, b.field("_sum"), b.literal(0)), 
b.literal(false))
  }
  .variable { (cor1: RexCorrelVariable) =>
    b.scan("Test")
      .filter(b.call(COps.EQUALS, b.field("ID"), b.dot(cor1, "ID")))
      .aggregate(b.groupKey(), b.sum(b.field("C1")).as("_sum"))
      .correlate(JoinRelType.LEFT, cor1.id, b.field(2, 0, "ID"))
      .project(b.field("_sum"))
  }
  .build()
{code}

Preemptively eliminating the cor0 side (which is an important fact because it 
shows that this is not because cor0 is somehow left-over after the `empty()` 
call):
{code:scala}
b.values(rowTypeForElidedNode)
  .variable { (cor1: RexCorrelVariable) =>
    b.scan("Test")
      .filter(b.call(COps.EQUALS, b.field("ID"), b.dot(cor1, "ID")))
      .aggregate(b.groupKey(), b.sum(b.field("C1")).as("_sum"))
      .correlate(JoinRelType.LEFT, cor1.id, b.field(2, 0, "ID"))

  }
  .project(b.field("_sum"))
  .build()
{code}

Having a filter which is still statically false, but does not have any 
correlated subquery:
{code:scala}
b.scan("Test")
  .filter(b.literal(false))
  .variable { (cor1: RexCorrelVariable) =>
    b.scan("Test")
      .filter(b.call(COps.EQUALS, b.field("ID"), b.dot(cor1, "ID")))
      .aggregate(b.groupKey(), b.sum(b.field("C1")).as("_sum"))
      .correlate(JoinRelType.LEFT, cor1.id, b.field(2, 0, "ID"))
  }
  .project(b.field("_sum"))
  .build()
{code}

In all of these cases, RelDecorrelator does not decorrelate the resulting 
Correlation on cor1.
For the first and last case, not having a statically false filter allows 
Decorrelation to proceed.

In all these cases, the RelNode looks something like this immediately before  
decorrelation:
{code}
55:Project(_sum=[$8])
└── 54:Correlate(correlation=[$cor0], joinType=[left], requiredColumns=[{0}])
    ├── 50:ValuesWithSpartaTypes(tuples=[[]], Schema=[Long, Decimal(0, 38), 
Long, Text, {Dimension}, Long, Long, Long])
    └── 53:Aggregate(group=[{}], _sum=[SUM($1)])
        └── 52:Filter(condition=[=($0, $cor0.ID)])
            └── 51:TableScan(table=[Test], Schema=[ID:Dimension, C1:Decimal(0, 
38), C2:Dimension, C3:Text, C4:{Dimension}, C5:Timestamp, C6:Dimension, 
C7:Long])
{code}
In the SQL case, there are some extra projections which do some renaming and 
column alignment

> Decorrelate will fail. If the RelNode tree has LogicalValues
> ------------------------------------------------------------
>
>                 Key: CALCITE-5568
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5568
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: libopeng
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Decorrelate will fail. If the RelNode tree has LogicalValues
> case:
> {code:java}
> SELECT *
> FROM emp AS e
> CROSS JOIN (VALUES 1) AS t (x)
> WHERE EXISTS (SELECT * FROM dept WHERE deptno = e.deptno){code}
> after decorrelate 
> (org.apache.calcite.sql2rel.RelDecorrelator#decorrelateQuery(org.apache.calcite.rel.RelNode,
>  org.apache.calcite.tools.RelBuilder))
> {code:java}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], X=[$9])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], X=[$9], $f0=[$10])
>     LogicalFilter(condition=[IS NOT NULL($10)])
>       LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{7}])
>         LogicalJoin(condition=[true], joinType=[inner])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalValues(tuples=[[{ 1 }]])
>         LogicalAggregate(group=[{}], agg#0=[MIN($0)])
>           LogicalProject($f0=[true])
>             LogicalFilter(condition=[=($0, $cor0.DEPTNO)])
>               LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> LogicalCorrelate will not be processed when the RelNode tree has 
> LogicalValues 
> [https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql2rel/RelDecorrelator.java#L491]
>  
> Is this as expected?
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to