[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840269#comment-16840269
 ] 

Ruben Quesada Lopez commented on CALCITE-2973:
----------------------------------------------

[~hhlai1990], I was just running the tests locally and I reached the same 
conclusion as you. I'll try to come up with a solution, but it seems tricky due 
to the "EquiJoin-oriented" design.
Otherwise, I think that your proposed solution of adding a new 
EnumerableThetaHashJoin, while keeping the existing EnumerableThetaJoin (to be 
renamed EnumerableNestedLoopJoin) and EnumerableJoin (to be renamed 
EnumerableHashJoin), is the most straightforward and least harmful one. But 
I'm not sure whether others will agree.
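
To make the idea concrete, here is a minimal sketch of a hash join that also 
applies a residual (non-equi) predicate, next to a plain nested-loop join for 
comparison. It is only an illustration under my own assumptions: the class and 
method names are hypothetical, rows are simplified to int[], and none of this 
is Calcite's actual API or the code in the pull request.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical illustration only; not Calcite code.
final class ThetaHashJoinSketch {

  /**
   * Inner join: hash on the equi key, then filter each candidate pair
   * with the residual (non-equi) part of the join condition.
   */
  static List<int[][]> hashJoin(
      List<int[]> left, List<int[]> right,
      Function<int[], Integer> leftKey, Function<int[], Integer> rightKey,
      BiPredicate<int[], int[]> residual) {
    // Build phase: group the right input by its equi key.
    Map<Integer, List<int[]>> table = new HashMap<>();
    for (int[] r : right) {
      table.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: only rows sharing the key are tested against the residual.
    List<int[][]> out = new ArrayList<>();
    for (int[] l : left) {
      List<int[]> candidates =
          table.getOrDefault(leftKey.apply(l), Collections.<int[]>emptyList());
      for (int[] r : candidates) {
        if (residual.test(l, r)) {
          out.add(new int[][] {l, r});
        }
      }
    }
    return out;
  }

  /** Inner join: every pair of rows is tested against the full condition. */
  static List<int[][]> nestedLoopJoin(
      List<int[]> left, List<int[]> right, BiPredicate<int[], int[]> condition) {
    List<int[][]> out = new ArrayList<>();
    for (int[] l : left) {
      for (int[] r : right) {
        if (condition.test(l, r)) {
          out.add(new int[][] {l, r});
        }
      }
    }
    return out;
  }
}
```

For a condition such as "l.c0 = r.c0 AND l.c1 > r.c1", the hash variant would 
take column 0 as the key on both sides and "l.c1 > r.c1" as the residual, 
while the nested-loop variant evaluates the whole condition for every pair. 
Both should return the same rows; the hash variant just avoids comparing rows 
whose keys differ.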

> Allow theta joins that have equi conditions to be executed using a hash join 
> algorithm
> --------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.20.0
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently, EnumerableMergeJoinRule supports only inner equi-joins.
> If users run a theta-join query on a large dataset (such as 10000*10000), 
> the nested-loop join takes dozens of times longer than a sort-merge 
> join would.
> So if we can apply the merge-join or hash-join rule to a theta join, it will 
> improve performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
