[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838592#comment-16838592
 ] 

Ruben Quesada Lopez commented on CALCITE-2973:
----------------------------------------------

[~hhlai1990], I have just checked the PR, and it looks very promising.
I have one small question that I would like to raise. Regarding "partial equi" 
joins, we would have two possibilities:
1. Partial equi non-inner join: Enumerable(Hash)Join with remaining condition
2. Partial equi inner join: Enumerable(Hash)Join (remaining condition null) 
   + EnumerableFilter with remaining condition

For the sake of consistency and code simplicity, I wonder what the advantage 
of option 2 over option 1 is (if any), and whether we should remove option 2 
and handle the inner join case with approach 1 as well.
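
To make the two options concrete, here is a minimal plain-Java sketch. It is 
not Calcite code and does not reflect how the PR is implemented: the names 
Row, Joined, hashJoin and joinThenFilter are hypothetical, and the join is 
simplified to a single integer key with a left outer variant only. It 
illustrates that a filter placed after the join (option 2) matches a join 
that checks the remaining condition while probing (option 1) for inner joins, 
but not for outer joins, which is presumably why option 1 is needed for the 
non-inner case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

final class PartialEquiJoinSketch {
  /** Hypothetical row: an equi-join key plus a value used by the remaining condition. */
  record Row(int key, int value) {}

  /** Hypothetical join output; right is null for an unmatched left row. */
  record Joined(Row left, Row right) {}

  /** Option 1: hash join on key, remaining condition evaluated during the probe. */
  static List<Joined> hashJoin(List<Row> left, List<Row> right,
      BiPredicate<Row, Row> remaining, boolean leftOuter) {
    // Build phase: index the right input by join key.
    Map<Integer, List<Row>> table = new HashMap<>();
    for (Row r : right) {
      table.computeIfAbsent(r.key(), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: only rows with an equal key are tested against the remaining condition.
    List<Joined> out = new ArrayList<>();
    for (Row l : left) {
      boolean matched = false;
      for (Row r : table.getOrDefault(l.key(), List.of())) {
        if (remaining.test(l, r)) {
          out.add(new Joined(l, r));
          matched = true;
        }
      }
      if (leftOuter && !matched) {
        out.add(new Joined(l, null)); // null-extended row for an outer join
      }
    }
    return out;
  }

  /** Option 2: hash join with no remaining condition, then a separate filter. */
  static List<Joined> joinThenFilter(List<Row> left, List<Row> right,
      BiPredicate<Row, Row> remaining, boolean leftOuter) {
    List<Joined> out = new ArrayList<>();
    for (Joined j : hashJoin(left, right, (l, r) -> true, leftOuter)) {
      // As in SQL, a condition over a null-extended row is not true, so the row is dropped.
      if (j.right() != null && remaining.test(j.left(), j.right())) {
        out.add(j);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Row> left = List.of(new Row(1, 10), new Row(2, 20));
    List<Row> right = List.of(new Row(1, 5), new Row(1, 15), new Row(2, 25));
    BiPredicate<Row, Row> remaining = (l, r) -> l.value() > r.value();

    System.out.println("inner, option 1:      " + hashJoin(left, right, remaining, false));
    System.out.println("inner, option 2:      " + joinThenFilter(left, right, remaining, false));
    System.out.println("left outer, option 1: " + hashJoin(left, right, remaining, true));
    System.out.println("left outer, option 2: " + joinThenFilter(left, right, remaining, true));
  }
}
```

In this sketch both inner-join variants return the same single row, while the 
left outer variants differ: option 1 keeps the left row with key 2 as a 
null-extended row, and option 2 loses it.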



> Allow theta joins that have equi conditions to be executed using a hash join 
> algorithm
> --------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.20.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, EnumerableMergeJoinRule only supports inner equi-joins.
> If users run a theta-join query on a large dataset (such as 10000*10000), 
> the nested-loop join process will take dozens of times longer than the 
> sort-merge join process.
> So if we can apply the merge-join or hash-join rule to a theta join, it will 
> improve performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
