[ 
https://issues.apache.org/jira/browse/HIVE-8412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165620#comment-14165620
 ] 

Xuefu Zhang edited comment on HIVE-8412 at 10/9/14 7:43 PM:
------------------------------------------------------------

Many join-related q tests set configurations that select a particular type 
of join, such as hive.auto.convert.sortmerge.join.to.mapjoin. These 
configurations affect the operator tree, and that manipulation is done in the 
semantic analyzer, which runs before task compilation. Because not all join 
types are currently supported in Spark, we would like these joins to fall 
back to the regular reduce-side join. To do that, we need to turn off these 
operator manipulations. The goal is that all join queries should pass 
whether or not a particular optimized join is implemented.

These manipulations can be turned on if needed when a particular join type is 
implemented in Spark.
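
As a rough illustration of the idea above (not the code in 
HIVE-8412.1-spark.patch), the sketch below forces a set of join-optimization 
flags to "false" before analysis so that every join falls back to the 
reduce-side join. The function name, the `implemented` parameter, and the 
exact list of flags are assumptions made for this example; only 
hive.auto.convert.sortmerge.join.to.mapjoin is named in the comment itself, 
and the other properties are existing Hive settings chosen for illustration.

```python
# Hypothetical sketch: which flags the actual patch disables is not stated
# in the comment, so this list is an assumption for illustration only.
JOIN_OPTIMIZATION_FLAGS = (
    "hive.auto.convert.join",
    "hive.auto.convert.sortmerge.join",
    "hive.auto.convert.sortmerge.join.to.mapjoin",
    "hive.optimize.bucketmapjoin",
    "hive.optimize.bucketmapjoin.sortedmerge",
    "hive.optimize.skewjoin",
)


def force_reduce_side_join(conf, implemented=()):
    """Return a copy of conf with join-optimization flags turned off.

    Flags listed in `implemented` are left untouched, mirroring the idea
    that a manipulation can be turned back on once the corresponding join
    type is supported.
    """
    result = dict(conf)  # never mutate the caller's configuration
    for flag in JOIN_OPTIMIZATION_FLAGS:
        if flag not in implemented:
            result[flag] = "false"
    return result


if __name__ == "__main__":
    test_conf = {
        "hive.auto.convert.sortmerge.join.to.mapjoin": "true",
        "hive.exec.parallel": "true",
    }
    for key, value in sorted(force_reduce_side_join(test_conf).items()):
        print(key, "=", value)
```

Passing a flag in `implemented` keeps whatever value the test set for it, 
which is one way the per-join-type re-enabling described above could be 
expressed.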


was (Author: xuefuz):
Many join-related q tests set configurations that select a particular type 
of join, such as hive.auto.convert.sortmerge.join.to.mapjoin. These 
configurations affect the operator tree, and that manipulation is done in the 
semantic analyzer, which runs before task compilation. Because not all join 
types are currently supported in Spark, we would like these joins to fall 
back to the regular reduce-side join. To do that, we need to turn off these 
operator manipulations.

> Make reduce side join work for all join queries [Spark Branch]
> --------------------------------------------------------------
>
>                 Key: HIVE-8412
>                 URL: https://issues.apache.org/jira/browse/HIVE-8412
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>         Attachments: HIVE-8412.1-spark.patch
>
>
> Regardless of the join-related optimizations such as map join, bucket 
> join, skewed join, etc., reduce-side join is the fallback. That is, if a 
> join query isn't handled by any of the optimizations, it should still work 
> with reduce-side join (though possibly in a less optimal fashion).
> It was found that this isn't the case at the moment. For instance, 
> auto_sortmerge_join_1.q failed to execute on Spark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
