[ https://issues.apache.org/jira/browse/CALCITE-6891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhen Chen updated CALCITE-6891:
-------------------------------
    Description: 
Do we need this rule?

Physical plan without this rule:
{code:java}
EnumerableIntersect(all=[false]): rowcount = 4.0, cumulative cost = {40.0 rows, 42.0 cpu, 0.0 io}, id = 37
  EnumerableProject(DEPTNO=[$7]): rowcount = 14.0, cumulative cost = {28.0 rows, 29.0 cpu, 0.0 io}, id = 35
    EnumerableTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 14.0, cumulative cost = {14.0 rows, 15.0 cpu, 0.0 io}, id = 31
  EnumerableProject(DEPTNO=[$0]): rowcount = 4.0, cumulative cost = {8.0 rows, 9.0 cpu, 0.0 io}, id = 36
    EnumerableTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 4.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 34
{code}
Physical plan with this rule:
{code:java}
EnumerableIntersect(all=[false]): rowcount = 4.0, cumulative cost = {40.0 rows, 42.0 cpu, 0.0 io}, id = 39
  EnumerableProject(DEPTNO=[$0]): rowcount = 4.0, cumulative cost = {8.0 rows, 9.0 cpu, 0.0 io}, id = 37
    EnumerableTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 4.0, cumulative cost = {4.0 rows, 5.0 cpu, 0.0 io}, id = 32
  EnumerableProject(DEPTNO=[$7]): rowcount = 14.0, cumulative cost = {28.0 rows, 29.0 cpu, 0.0 io}, id = 38
    EnumerableTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 14.0, cumulative cost = {14.0 rows, 15.0 cpu, 0.0 io}, id = 35
{code}
This rule puts the smaller inputs first, which helps reduce the size of
intermediate results.
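
The reordering idea can be illustrated without any Calcite dependency. The sketch below is not the proposed rule's implementation: the `Input` class and the `reorder` method are hypothetical names made up for illustration, and the row counts are the estimates from the plans above. It only shows the ordering step, assuming the inputs of an Intersect may be permuted freely because intersection is commutative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class IntersectReorderSketch {
    /** Hypothetical stand-in for an Intersect input and its estimated row count. */
    static final class Input {
        final String name;
        final double rowCount;

        Input(String name, double rowCount) {
            this.name = name;
            this.rowCount = rowCount;
        }
    }

    /** Returns a copy of the inputs ordered by ascending estimated row count. */
    static List<Input> reorder(List<Input> inputs) {
        List<Input> sorted = new ArrayList<>(inputs);
        // List.sort is stable, so inputs with equal estimates keep their original order.
        sorted.sort(Comparator.comparingDouble((Input in) -> in.rowCount));
        return sorted;
    }

    public static void main(String[] args) {
        // Row counts taken from the plans above: EMP = 14.0, DEPT = 4.0.
        List<Input> inputs = new ArrayList<>();
        inputs.add(new Input("EMP", 14.0));
        inputs.add(new Input("DEPT", 4.0));
        for (Input in : reorder(inputs)) {
            System.out.println(in.name + " rowcount = " + in.rowCount);
        }
    }
}
```

Note that in the two plans above the cumulative cost of EnumerableIntersect is the same ({40.0 rows, 42.0 cpu, 0.0 io}) in either order, so whether the reordering pays off depends on how the operator is costed and executed.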

The difference is visible in the DAG. I used the Volcano planner in top-down mode.

Without the rule:

!image-2025-03-16-09-38-41-654.png|width=529,height=379!

With the rule:

!image-2025-03-16-09-40-10-496.png|width=528,height=378!

  was:
Do we need this rule?

From:
{code:java}
LogicalIntersect(all=[false])
  LogicalProject(DEPTNO=[$7])
    LogicalFilter(condition=[>($7, 10)])
      LogicalTableScan(table=[[CATALOG, SALES, EMP(big)]])
  LogicalProject(DEPTNO=[$0])
    LogicalFilter(condition=[>($0, 5)])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT(small)]])
{code}
To:
{code:java}
LogicalIntersect(all=[false])
  LogicalProject(DEPTNO=[$0])
    LogicalFilter(condition=[>($0, 5)])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT(small)]])
  LogicalProject(DEPTNO=[$7])
    LogicalFilter(condition=[>($7, 10)])
      LogicalTableScan(table=[[CATALOG, SALES, EMP(big)]])
{code}
This rule puts the smaller inputs first, which helps reduce the size of
intermediate results.

> Implement IntersectReorderRule
> ------------------------------
>
>                 Key: CALCITE-6891
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6891
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Zhen Chen
>            Assignee: Zhen Chen
>            Priority: Major
>         Attachments: image-2025-03-16-09-37-47-916.png, 
> image-2025-03-16-09-38-41-654.png, image-2025-03-16-09-40-10-496.png



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
