[
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808324#comment-16808324
]
Lai Zhou edited comment on CALCITE-2973 at 4/3/19 3:51 AM:
-----------------------------------------------------------
[~julianhyde], [~zabetak], good idea.
I just created a new rule for my application, to avoid changing
calcite-core.
I'll make a PR later to allow theta joins to be executed using a merge join or
hash join.
I drew a table to describe the relationship between join types and join
operators after the redesign:
|| ||inner||non-inner||
|*only equi condition*|EnumerableJoin|EnumerableJoin|
|*only non-equi condition*|EnumerableJoin|EnumerableThetaJoin|
|*mixed equi and non-equi condition*|EnumerableJoin + EnumerableFilter or EnumerableMergeJoin (changed) + EnumerableFilter|EnumerableThetaJoin or EnumerableMergeJoin (changed) or EnumerableHashJoin (new)|
If a join is non-inner and has both equi and non-equi conditions, we
have 3 choices for planning it.
Currently EnumerableThetaJoin and EnumerableMergeJoin each have a
corresponding rule.
What do you think about introducing a new rule (EnumerableThetaHashJoinRule) to
allow theta joins to be executed using a hash join?
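
To make the hash-join option for the non-inner, mixed-condition case more concrete, here is a minimal sketch in plain Java. It does not use Calcite's APIs: the class name ThetaHashJoinSketch, the int[] row layout and the key-column parameters are assumptions for illustration only, and the proposed EnumerableThetaHashJoinRule may end up looking quite different. The equi part of the condition drives the hash lookup, and the non-equi part is checked on each candidate pair before a left row is declared unmatched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical illustration only; not Calcite code. */
public class ThetaHashJoinSketch {

  /**
   * Left outer join of int[] rows. Rows are matched on one key column
   * per side (the equi condition); nonEqui is the remaining condition.
   * Each result is {leftRow, rightRow}; rightRow is null when no right
   * row satisfies both conditions. Keys are plain ints, so SQL NULL
   * keys are out of scope for this sketch.
   */
  public static List<int[][]> leftOuterJoin(
      List<int[]> left, List<int[]> right,
      int leftKeyCol, int rightKeyCol,
      BiPredicate<int[], int[]> nonEqui) {
    // Build phase: index the right input by its equi-join key.
    Map<Integer, List<int[]>> table = new HashMap<>();
    for (int[] r : right) {
      table.computeIfAbsent(r[rightKeyCol], k -> new ArrayList<>()).add(r);
    }
    // Probe phase: look up candidates by key, then apply the
    // non-equi condition to each candidate.
    List<int[][]> out = new ArrayList<>();
    for (int[] l : left) {
      boolean matched = false;
      List<int[]> candidates =
          table.getOrDefault(l[leftKeyCol], Collections.emptyList());
      for (int[] r : candidates) {
        if (nonEqui.test(l, r)) {
          out.add(new int[][] {l, r});
          matched = true;
        }
      }
      // Outer-join semantics: keep the left row even without a match.
      if (!matched) {
        out.add(new int[][] {l, null});
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<int[]> left = Arrays.asList(
        new int[] {1, 10}, new int[] {2, 20}, new int[] {3, 30});
    List<int[]> right = Arrays.asList(
        new int[] {1, 5}, new int[] {1, 15}, new int[] {2, 25});
    // Condition: left.col0 = right.col0 AND left.col1 > right.col1
    for (int[][] pair : leftOuterJoin(left, right, 0, 0,
        (l, r) -> l[1] > r[1])) {
      System.out.println(
          Arrays.toString(pair[0]) + " | " + Arrays.toString(pair[1]));
    }
  }
}
```

The important detail for outer joins is that the non-equi condition has to be evaluated inside the join and cannot be pushed into a filter on top of it, because a filter would drop the left rows that should instead be padded with nulls.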
> Allow theta joins to be executed using a merge join algorithm
> -------------------------------------------------------------
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.19.0
> Reporter: Lai Zhou
> Priority: Minor
>
> Currently the EnumerableMergeJoinRule only supports inner equi joins.
> If users run a theta-join query on a large dataset (such as 10000*10000),
> the nested-loop join will take dozens of times longer than the sort-merge
> join.
> So if we can apply a merge-join or hash-join rule to a theta join, it will
> improve performance greatly.
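
As a rough illustration of the gap described in the issue, the sketch below counts how often the join condition is evaluated by a nested-loop join and by a sort-merge join that checks the non-equi part only for rows with equal keys. It is plain Java with hypothetical names (MergeVsNestedLoopSketch, int[] rows with the join key in column 0), not Calcite code, and it counts condition evaluations rather than measuring time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical illustration only; not Calcite code. */
public class MergeVsNestedLoopSketch {

  /** Number of condition evaluations since the last reset. */
  static long evaluations;

  /** Inner join that tests the full condition on every pair of rows. */
  static List<int[][]> nestedLoop(List<int[]> left, List<int[]> right,
      BiPredicate<int[], int[]> fullCondition) {
    List<int[][]> out = new ArrayList<>();
    for (int[] l : left) {
      for (int[] r : right) {
        evaluations++;
        if (fullCondition.test(l, r)) {
          out.add(new int[][] {l, r});
        }
      }
    }
    return out;
  }

  /**
   * Inner join on column 0 via sort-merge; the non-equi condition is
   * tested only for pairs whose keys are already equal.
   */
  static List<int[][]> sortMerge(List<int[]> left, List<int[]> right,
      BiPredicate<int[], int[]> nonEqui) {
    List<int[]> ls = new ArrayList<>(left);
    List<int[]> rs = new ArrayList<>(right);
    ls.sort(Comparator.comparingInt(row -> row[0]));
    rs.sort(Comparator.comparingInt(row -> row[0]));
    List<int[][]> out = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < ls.size() && j < rs.size()) {
      int lk = ls.get(i)[0];
      int rk = rs.get(j)[0];
      if (lk < rk) {
        i++;
      } else if (lk > rk) {
        j++;
      } else {
        // Find the run of equal keys on each side, then pair them up.
        int iEnd = i;
        while (iEnd < ls.size() && ls.get(iEnd)[0] == lk) {
          iEnd++;
        }
        int jEnd = j;
        while (jEnd < rs.size() && rs.get(jEnd)[0] == lk) {
          jEnd++;
        }
        for (int a = i; a < iEnd; a++) {
          for (int b = j; b < jEnd; b++) {
            evaluations++;
            if (nonEqui.test(ls.get(a), rs.get(b))) {
              out.add(new int[][] {ls.get(a), rs.get(b)});
            }
          }
        }
        i = iEnd;
        j = jEnd;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<int[]> left = new ArrayList<>();
    List<int[]> right = new ArrayList<>();
    for (int k = 0; k < 1000; k++) {
      left.add(new int[] {k, k});
      right.add(new int[] {k, 500});
    }
    evaluations = 0;
    nestedLoop(left, right, (l, r) -> l[0] == r[0] && l[1] > r[1]);
    System.out.println("nested loop evaluations: " + evaluations);
    evaluations = 0;
    sortMerge(left, right, (l, r) -> l[1] > r[1]);
    System.out.println("sort-merge evaluations:  " + evaluations);
  }
}
```

With unique keys on both sides the nested loop evaluates the condition once per pair of rows, while the merge variant evaluates it once per matching key; heavily duplicated keys narrow that gap.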