[ 
https://issues.apache.org/jira/browse/IMPALA-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18063610#comment-18063610
 ] 

Joe McDonnell commented on IMPALA-14436:
----------------------------------------

For a query like this:
{noformat}
select count(*) from table_a a right outer join table_b b on (a.foo < b.foo) 
where a.bar < b.bar;{noformat}
The "a.foo < b.foo" inside the on clause is a join predicate. It is used during 
the join to determine which rows match. The "a.bar < b.bar" in the where clause 
is a regular predicate, and it applies to the output of the join. For outer 
joins, the distinction is significant: the rows with nulls introduced by the 
outer join will be eliminated by the where clause, because a comparison against 
null is never true. For inner joins, it doesn't matter: a row has to survive 
both predicates anyway, and there are no nulls introduced.
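
To make the difference concrete, here is a small Python sketch of a toy nested loop join. It is only an illustration, not Impala code: the helper names (lt, nested_loop_join) and the sample rows are made up for this example, and None stands in for SQL NULL.

```python
from typing import Optional

def lt(x, y) -> Optional[bool]:
    """SQL-style '<': the result is NULL (None) if either side is NULL."""
    if x is None or y is None:
        return None
    return x < y

def nested_loop_join(left, right, join_pred, where_pred, right_outer):
    """Toy nested loop join (hypothetical helper, not Impala's implementation).

    join_pred decides which row pairs match; where_pred filters the join output.
    """
    out = []
    for b in right:
        matched = False
        for a in left:
            if join_pred(a, b) is True:
                matched = True
                out.append((a, b))
        if right_outer and not matched:
            # Right outer join: keep the unmatched right row, NULL-extend the left.
            out.append(({"foo": None, "bar": None}, b))
    # The where predicate sees the join output; NULL counts as "not true".
    return [(a, b) for a, b in out if where_pred(a, b) is True]

# Made-up sample data.
table_a = [{"foo": 1, "bar": 10}]
table_b = [{"foo": 2, "bar": 20}, {"foo": 0, "bar": 5}]

foo_lt = lambda a, b: lt(a["foo"], b["foo"])
bar_lt = lambda a, b: lt(a["bar"], b["bar"])
both = lambda a, b: foo_lt(a, b) is True and bar_lt(a, b) is True
always = lambda a, b: True

# Right outer join: where the second predicate lives changes the answer.
outer_split = len(nested_loop_join(table_a, table_b, foo_lt, bar_lt, True))
outer_all_on = len(nested_loop_join(table_a, table_b, both, always, True))

# Inner join: both placements give the same answer.
inner_split = len(nested_loop_join(table_a, table_b, foo_lt, bar_lt, False))
inner_all_where = len(nested_loop_join(table_a, table_b, always, both, False))

print(outer_split, outer_all_on, inner_split, inner_all_where)
```

With this data, the right outer join returns one row when "a.bar < b.bar" is in the where clause and two rows when it is part of the on clause, while the inner join returns one row either way.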

For Impala's planner, when the nested loop join is an inner join, it puts 
every predicate in the regular conjuncts rather than the join conjuncts. For 
other join types, it keeps them separate. I don't think there is an equivalent 
circumstance for hash joins. So, the fix here is to put the join conjuncts in 
the regular conjuncts if the nested loop join is an inner join.
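
As a rough sketch of that rule, in Python rather than the planner's own code: the NestedLoopJoin class, its field names, and the normalize function below are hypothetical stand-ins chosen for illustration, not names from Impala or Calcite.

```python
from dataclasses import dataclass, field

@dataclass
class NestedLoopJoin:
    """Hypothetical stand-in for a nested loop join plan node."""
    join_type: str                                       # e.g. "INNER", "RIGHT_OUTER"
    join_conjuncts: list = field(default_factory=list)   # decide which rows match
    conjuncts: list = field(default_factory=list)        # filter the join output

def normalize(node: NestedLoopJoin) -> NestedLoopJoin:
    """For inner joins, fold the join conjuncts into the regular conjuncts.

    Other join types are returned unchanged, since for them the two kinds
    of predicate are not interchangeable.
    """
    if node.join_type == "INNER":
        return NestedLoopJoin("INNER", [], node.join_conjuncts + node.conjuncts)
    return node

inner = normalize(NestedLoopJoin("INNER", ["a.foo < b.foo"], ["a.bar < b.bar"]))
outer = normalize(NestedLoopJoin("RIGHT_OUTER", ["a.foo < b.foo"], ["a.bar < b.bar"]))

print(inner)
print(outer)
```

After normalize, the inner join node carries both predicates in conjuncts and has no join conjuncts, while the right outer join node still keeps them apart.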

> Calcite Planner: Implement single row join optimization
> -------------------------------------------------------
>
>                 Key: IMPALA-14436
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14436
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Steve Carlin
>            Priority: Major
>
> Impala has special code to handle joins that produce one row on the right 
> side. This needs to be implemented in Calcite


