[
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815943#comment-16815943
]
Lai Zhou commented on CALCITE-2973:
-----------------------------------
[~julianhyde],[~zabetak],[~hyuan]
I opened a PR to improve EnumerableJoin.
Since EnumerableMergeJoin is never chosen, I changed the summary to "Allow theta
joins that have equi conditions to be executed using a hash join algorithm."
A join rel node is now converted to an EnumerableJoin if it has mixed
equi and non-equi conditions.
See
https://github.com/apache/calcite/blob/2251c82f209612d8ae31e2e7a42acdb2bcb15d55/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableJoinRule.java#L62
EnumerableJoin can now handle a per-row condition: I introduced a
remainCondition that is used to generate the predicate for the join.
See
https://github.com/apache/calcite/blob/2251c82f209612d8ae31e2e7a42acdb2bcb15d55/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableJoin.java#L250
I also introduced a new method to support joins with a predicate; it doesn't
affect the existing join method.
See
https://github.com/apache/calcite/blob/2251c82f209612d8ae31e2e7a42acdb2bcb15d55/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1061
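
To illustrate the idea of a hash join that also applies a remaining per-row
condition, here is a minimal standalone sketch. It is not the code from the PR
and does not use the linq4j API: the class name HashJoinSketch, the method
hashJoin, and its parameter names are all hypothetical, and only inner joins
are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical illustration only; not Calcite's EnumerableDefaults. */
public class HashJoinSketch {

  /**
   * Inner hash join on an equi key, followed by a residual predicate
   * (the non-equi part of the join condition) checked per candidate pair.
   * SQL NULL-key semantics are out of scope for this sketch.
   */
  public static <L, R, K, T> List<T> hashJoin(
      List<L> left,
      List<R> right,
      Function<L, K> leftKey,
      Function<R, K> rightKey,
      BiPredicate<L, R> remaining,
      BiFunction<L, R, T> resultSelector) {
    // Build phase: index the right input by its equi-join key.
    Map<K, List<R>> index = new HashMap<>();
    for (R r : right) {
      index.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: look up each left row, then filter the candidates
    // with the remaining (non-equi) condition.
    List<T> out = new ArrayList<>();
    for (L l : left) {
      List<R> candidates = index.get(leftKey.apply(l));
      if (candidates == null) {
        continue;
      }
      for (R r : candidates) {
        if (remaining.test(l, r)) {
          out.add(resultSelector.apply(l, r));
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Rows are {deptno, amount}; the data is made up for the example.
    List<int[]> emps = Arrays.asList(
        new int[] {10, 100}, new int[] {10, 300}, new int[] {20, 200});
    List<int[]> depts = Arrays.asList(
        new int[] {10, 200}, new int[] {20, 150});
    // ON e.deptno = d.deptno AND e.amount > d.amount
    List<String> rows = hashJoin(
        emps, depts,
        e -> e[0], d -> d[0],
        (e, d) -> e[1] > d[1],
        (e, d) -> e[1] + ":" + d[1]);
    System.out.println(rows); // prints [300:200, 200:150]
  }
}
```

The equi part of the condition only narrows the candidates through the hash
lookup; the non-equi part is then evaluated on each candidate pair, so the
nested-loop comparison is limited to rows that share a key.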
> Allow theta joins that have equi conditions to be executed using a hash join
> algorithm
> --------------------------------------------------------------------------------------
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.19.0
> Reporter: Lai Zhou
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, EnumerableMergeJoinRule only supports inner equi-joins.
> If users run a theta-join query on a large dataset (such as 10000*10000),
> the nested-loop join takes dozens of times longer than the sort-merge
> join.
> So if we can apply the merge-join or hash-join rule to a theta join, it will
> improve performance greatly.