[
https://issues.apache.org/jira/browse/CALCITE-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhen Chen resolved CALCITE-7029.
--------------------------------
Fix Version/s: 1.41.0
Resolution: Fixed
> Support DPhyp to handle various join types
> ------------------------------------------
>
> Key: CALCITE-7029
> URL: https://issues.apache.org/jira/browse/CALCITE-7029
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.39.0
> Reporter: Silun Dong
> Assignee: Silun Dong
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.41.0
>
>
> Add the CD-C conflict detection algorithm to DPhyp so that it can handle
> various join types.
> DPhyp is a join reordering algorithm based on dynamic programming. It
> enumerates all possibilities of the query graph without duplication, and
> in theory it can handle complex join predicates and various types of
> joins (outer, semi, anti, etc.).
> CALCITE-6846 implemented the basic DPhyp algorithm, defining the hypergraph
> structure and the enumeration process. That implementation enumerates the
> query graph without duplication and handles complex join predicates, but is
> limited to inner joins. Handling other join types requires conflict
> detection.
> This PR implements the CD-C conflict detection algorithm from the paper
> _On the correct and complete enumeration of the core search space_. Conflict
> detection does not change the enumeration process of DPhyp. While the
> hypergraph is constructed from the plan tree, it calculates the conflict
> rules for each join operator. When DPhyp enumerates a csg-cmp pair (a
> connected subgraph and its connected complement), it checks the pair
> against those rules to decide whether the join operator is applicable.
>
> The following is an example SQL query and the expected plans:
> sql
> {code:sql}
> select emp.empno from
> emp_address inner join emp on emp_address.empno = emp.empno
> left join dept on emp.deptno = dept.deptno
> inner join dept_nested on dept.deptno = dept_nested.deptno {code}
> initial plan
> {code:java}
> LogicalProject(EMPNO=[$3])
>   LogicalJoin(condition=[=($12, $14)], joinType=[inner])
>     LogicalJoin(condition=[=($10, $12)], joinType=[left])
>       LogicalJoin(condition=[=($0, $3)], joinType=[inner])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP_ADDRESS]])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]]) {code}
> build hypergraph
> {code:java}
> LogicalProject(EMPNO=[$3])
>   HyperGraph(edges=[{0}——[INNER, =(vertex(0)_field(0), vertex(1)_field(0))]——{1},{1}——[LEFT, =(vertex(1)_field(7), vertex(2)_field(0))]——{2},{1, 2}——[INNER, =(vertex(2)_field(0), vertex(3)_field(0))]——{3}])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP_ADDRESS]])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]]) {code}
> after DPhyp
> {code:java}
> LogicalProject(EMPNO=[$4])
>   LogicalJoin(condition=[=($15, $4)], joinType=[inner])
>     LogicalJoin(condition=[=($13, $0)], joinType=[inner])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])
>       LogicalJoin(condition=[=($7, $9)], joinType=[left])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP_ADDRESS]]) {code}
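
The applicability check described in the issue (conflict rules attached to each join operator, then tested against every csg-cmp pair) can be sketched roughly as follows. This is only an illustration of a CD-C style test as I understand it from the paper, not the code merged for CALCITE-7029: the class and method names (ConflictRuleSketch, JoinEdge, ConflictRule, applicable, bits) are hypothetical, vertex sets are plain bitmasks, and the single rule used in main is an assumed example rather than something derived from a plan tree. The hypergraph in the issue shows the last edge as {1, 2} to {3}, which suggests the real implementation may encode the same restriction in the hyperedge endpoints; the sketch uses an explicit rule to make the check visible.

```java
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch of a CD-C style applicability test.
 * None of these names are Calcite classes; vertex sets are plain bitmasks.
 */
public class ConflictRuleSketch {

  /** Conflict rule T1 -> T2: a plan touching any vertex of T1 must contain all of T2. */
  static final class ConflictRule {
    final long t1;
    final long t2;

    ConflictRule(long t1, long t2) {
      this.t1 = t1;
      this.t2 = t2;
    }
  }

  /** A join operator with the vertices it needs on each side plus its conflict rules. */
  static final class JoinEdge {
    final long leftTes;
    final long rightTes;
    final List<ConflictRule> rules;

    JoinEdge(long leftTes, long rightTes, List<ConflictRule> rules) {
      this.leftTes = leftTes;
      this.rightTes = rightTes;
      this.rules = rules;
    }
  }

  /** Builds a bitmask with one bit set per vertex index. */
  static long bits(int... vertices) {
    long mask = 0L;
    for (int v : vertices) {
      mask |= 1L << v;
    }
    return mask;
  }

  static boolean isSubset(long sub, long sup) {
    return (sub & ~sup) == 0L;
  }

  /** Returns true if the operator may join csg (left input) with cmp (right input). */
  static boolean applicable(JoinEdge edge, long csg, long cmp) {
    if (!isSubset(edge.leftTes, csg) || !isSubset(edge.rightTes, cmp)) {
      return false;
    }
    long all = csg | cmp;
    for (ConflictRule rule : edge.rules) {
      // The rule fires when the plan touches T1; it is violated if T2 is incomplete.
      if ((rule.t1 & all) != 0L && !isSubset(rule.t2, all)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Vertex numbers follow the issue's hypergraph: 1 = EMP, 2 = DEPT, 3 = DEPT_NESTED.
    // Assumed rule for illustration: a plan that uses DEPT must also contain EMP,
    // so the inner join cannot be applied below the left join.
    JoinEdge deptToNested = new JoinEdge(bits(2), bits(3),
        Collections.singletonList(new ConflictRule(bits(2), bits(1))));

    System.out.println(applicable(deptToNested, bits(2), bits(3)));    // false
    System.out.println(applicable(deptToNested, bits(1, 2), bits(3))); // true
  }
}
```

The sketch only tests one orientation of the pair. For a commutative operator such as an inner join, a caller would presumably also try the swapped pair, which matches the final plan above where DEPT_NESTED ends up as the left input.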