[jira] [Updated] (HIVE-8638) Implement bucket map join optimization [Spark Branch]

Jimmy Xiang (JIRA) Mon, 08 Dec 2014 14:54:07 -0800

     [ https://issues.apache.org/jira/browse/HIVE-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jimmy Xiang updated HIVE-8638:
------------------------------
    Attachment: HIVE-8638.5-spark.patch

Added patch v5, which adds some comments and fixes the golden file for 
auto_sortmerge_join_11.q. The golden file changes because we now do the 
bucket map join optimization whenever hive.optimize.bucketmapjoin is set; 
originally, the optimization was done only if a mapjoin hint was set. This 
test would probably be better named as a bucket mapjoin test.

> Implement bucket map join optimization [Spark Branch]
> -----------------------------------------------------
>
>                 Key: HIVE-8638
>                 URL: https://issues.apache.org/jira/browse/HIVE-8638
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Na Yang
>            Assignee: Jimmy Xiang
>             Fix For: spark-branch
>
>         Attachments: HIVE-8638.4-spark.patch, HIVE-8638.5-spark.patch
>
>
> In the hive-on-mr implementation, bucket map join optimization depends on 
> the map join hint. In the hive-on-tez implementation, by contrast, a join 
> can be automatically converted to a bucket map join if certain conditions 
> are met, such as:
> 1. the optimization flag hive.convert.join.bucket.mapjoin.tez is ON
> 2. all join tables are bucketed, and each small table's bucket number is 
> divisible by the big table's bucket number
> 3. bucket columns == join columns
> In the hive-on-spark implementation, it would be ideal to support automatic 
> bucket map join conversion: when all the required criteria are met, a join 
> is automatically converted to a bucket map join.
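
The three conditions quoted above can be sketched as a small eligibility check. This is only an illustration, not code from the patch or from Hive: the Table class, the function name, and its parameters are hypothetical, the divisibility rule is taken literally from the wording of the description, and join columns are assumed to have the same names in every table.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Table:
    """Hypothetical table descriptor, not a Hive class."""
    name: str
    num_buckets: int            # 0 means the table is not bucketed
    bucket_cols: Sequence[str]  # columns the table is bucketed on


def can_convert_to_bucket_map_join(flag_on: bool,
                                   big: Table,
                                   small_tables: Sequence[Table],
                                   join_cols: Sequence[str]) -> bool:
    """Return True if all three conditions from the description hold."""
    # Condition 1: the optimization flag must be on.
    if not flag_on:
        return False

    tables = [big, *small_tables]

    # Condition 2a: every join table must be bucketed.
    if any(t.num_buckets <= 0 for t in tables):
        return False

    # Condition 2b (as worded in the description): each small table's
    # bucket number is divisible by the big table's bucket number.
    if any(s.num_buckets % big.num_buckets != 0 for s in small_tables):
        return False

    # Condition 3: bucket columns == join columns
    # (assumption: identical column names across tables).
    return all(list(t.bucket_cols) == list(join_cols) for t in tables)
```

For example, with a big table of 2 buckets and a small table of 4 buckets, both bucketed on the join column, the check passes; with a small table of 3 buckets it fails on the divisibility rule.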



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
