[ 
https://issues.apache.org/jira/browse/CALCITE-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhen Chen resolved CALCITE-7029.
--------------------------------
    Fix Version/s: 1.41.0
       Resolution: Fixed

> Support DPhyp to handle various join types
> ------------------------------------------
>
>                 Key: CALCITE-7029
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7029
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.39.0
>            Reporter: Silun Dong
>            Assignee: Silun Dong
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.41.0
>
>
> Add conflict detection algorithm CD-C to DPhyp so that it can handle various 
> join types. 
> DpHyp algorithm is a join reorder algorithm based on dynamic programming. It 
> can enumerate all possibilities of the query graph without duplication, and 
> it can theoretically handle complex join predicates and various types of 
> joins (outer, semi, anti, etc.).
> CALCITE-6846 completed the basic DpHyp algorithm, defined the hypergraph 
> structure and enumeration process. It can enumerate the query graph without 
> duplication and handle complex join predicates, but is limited to inner join. 
> The ability to handle various types of joins requires conflict detection.
> This pr implements the CD-C conflict detection algorithm based on the paper 
> {_}On the correct and complete enumeration of the core search space{_}. The 
> conflict detection algorithm does not change the enumeration process of 
> DpHyp. It calculates the conflict rules for each join operator in the process 
> of constructing the hypergraph from the plan tree, and verifies the 
> applicability of csg-cmp according to the conflict rules when DpHyp 
> enumerates csg-cmp.
>  
> The following is an example of sql and the expected plan:
> sql
> {code:java}
> select emp.empno from 
> emp_address inner join emp on emp_address.empno = emp.empno 
> left join dept on emp.deptno = dept.deptno 
> inner join dept_nested on dept.deptno = dept_nested.deptno {code}
> initial plan
> {code:java}
> LogicalProject(EMPNO=[$3])
>   LogicalJoin(condition=[=($12, $14)], joinType=[inner])
>     LogicalJoin(condition=[=($10, $12)], joinType=[left])
>       LogicalJoin(condition=[=($0, $3)], joinType=[inner])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP_ADDRESS]])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]]) {code}
> build hypergraph
> {code:java}
> LogicalProject(EMPNO=[$3])
>   HyperGraph(edges=[{0}——[INNER, =(vertex(0)_field(0), 
> vertex(1)_field(0))]——{1},{1}——[LEFT, =(vertex(1)_field(7), 
> vertex(2)_field(0))]——{2},{1, 2}——[INNER, =(vertex(2)_field(0), 
> vertex(3)_field(0))]——{3}])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP_ADDRESS]])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]]) {code}
> after dphyp
> {code:java}
> LogicalProject(EMPNO=[$4])
>   LogicalJoin(condition=[=($15, $4)], joinType=[inner])
>     LogicalJoin(condition=[=($13, $0)], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
>       LogicalJoin(condition=[=($7, $9)], joinType=[left])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP_ADDRESS]]) {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to