[
https://issues.apache.org/jira/browse/IMPALA-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18063610#comment-18063610
]
Joe McDonnell commented on IMPALA-14436:
----------------------------------------
For a query like this:
{noformat}
select count(*) from table_a a right outer join table_b b on (a.foo < b.foo)
where a.bar < b.bar;
{noformat}
The "a.foo < b.foo" in the ON clause is a join predicate. It is evaluated
during the join to determine which rows match. The "a.bar < b.bar" in the WHERE
clause is a regular predicate, and it applies to the output of the join. For
outer joins, the distinction is significant: here, the null-extended rows
introduced by the outer join are eliminated by the WHERE clause. For inner
joins, it doesn't matter: a row has to satisfy both predicates anyway, and no
nulls are introduced.
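To make the ON vs. WHERE distinction concrete, here is a minimal Python sketch of a nested loop join. It is only an illustration, not Impala code: the helper names (col, lt, nested_loop_join) and the sample rows are made up for this example, and NULL is modeled as None.

```python
from typing import Optional

Row = Optional[dict]  # None stands for a null-extended (all-NULL) row


def col(row: Row, name: str):
    """Read a column; every column of a null-extended row is NULL (None)."""
    return None if row is None else row[name]


def lt(x, y) -> bool:
    """SQL-style '<' used as a filter: a comparison with NULL is not true."""
    return x is not None and y is not None and x < y


def nested_loop_join(left, right, join_pred, where_pred, right_outer: bool):
    """Toy nested loop join; right_outer=True keeps unmatched right rows."""
    joined = []
    for b in right:
        matched = False
        for a in left:
            if join_pred(a, b):  # join predicate decides which rows match
                matched = True
                joined.append((a, b))
        if right_outer and not matched:
            joined.append((None, b))  # null-extend the unmatched right row
    # The regular (WHERE) predicate applies to the output of the join.
    return [(a, b) for (a, b) in joined if where_pred(a, b)]


# Hypothetical sample data for table_a and table_b.
table_a = [{"foo": 1, "bar": 1}]
table_b = [{"foo": 2, "bar": 5}, {"foo": 0, "bar": 9}]


def on_foo(a, b):
    return lt(col(a, "foo"), col(b, "foo"))


def where_bar(a, b):
    return lt(col(a, "bar"), col(b, "bar"))


def both(a, b):
    return on_foo(a, b) and where_bar(a, b)


def always(a, b):
    return True


# Right outer join: keeping the predicates separate vs. merging them differs.
outer_separate = len(nested_loop_join(table_a, table_b, on_foo, where_bar, True))
outer_merged = len(nested_loop_join(table_a, table_b, both, always, True))

# Inner join: where each predicate is evaluated makes no difference.
inner_separate = len(nested_loop_join(table_a, table_b, on_foo, where_bar, False))
inner_merged = len(nested_loop_join(table_a, table_b, both, always, False))
```

With this data, the right outer join returns 1 row when the predicates are kept separate and 2 rows when the WHERE predicate is folded into the join predicate, because the null-extended row for the unmatched table_b row is no longer filtered out. Both inner join variants return 1 row.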
In Impala's planner, when the nested loop join is an inner join, all of the
predicates go into the regular conjuncts rather than the join conjuncts. For
other join types, the two are kept separate. I don't think there is an
equivalent circumstance around hash joins. So, the fix here is to put the join
conjuncts in the conjuncts if the nested loop join is an inner join.
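The placement rule described above could be sketched roughly as follows. This is a hypothetical Python illustration, not the actual Impala or Calcite planner code: JoinOp and place_nlj_conjuncts are names invented for this example, and conjuncts are represented as plain strings.

```python
from enum import Enum


class JoinOp(Enum):
    """Hypothetical join types, for illustration only."""
    INNER = "inner"
    LEFT_OUTER = "left outer"
    RIGHT_OUTER = "right outer"
    FULL_OUTER = "full outer"


def place_nlj_conjuncts(join_op, on_conjuncts, where_conjuncts):
    """Return (join_conjuncts, conjuncts) for a nested loop join node.

    Assumption: for an inner join, everything goes into the regular
    conjuncts; for any other join type, the two lists stay separate.
    """
    if join_op is JoinOp.INNER:
        return [], list(on_conjuncts) + list(where_conjuncts)
    return list(on_conjuncts), list(where_conjuncts)


inner_placement = place_nlj_conjuncts(
    JoinOp.INNER, ["a.foo < b.foo"], ["a.bar < b.bar"])
outer_placement = place_nlj_conjuncts(
    JoinOp.RIGHT_OUTER, ["a.foo < b.foo"], ["a.bar < b.bar"])
```

For the inner join, inner_placement has an empty join conjunct list and both predicates in the regular conjuncts. For the right outer join, outer_placement keeps "a.foo < b.foo" as a join conjunct and "a.bar < b.bar" as a regular conjunct.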
> Calcite Planner: Implement single row join optimization
> -------------------------------------------------------
>
> Key: IMPALA-14436
> URL: https://issues.apache.org/jira/browse/IMPALA-14436
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Steve Carlin
> Priority: Major
>
> Impala has special code to handle joins that produce one row on the right
> side. This needs to be implemented in Calcite.