[ 
https://issues.apache.org/jira/browse/HIVE-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457947#comment-13457947
 ] 

Nadeem Moidu commented on HIVE-3086:
------------------------------------

Yes, in the current implementation both tables are scanned twice. This can be 
avoided if the table scan operator is not replicated and instead has multiple 
children, but that optimization is not part of this patch.
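
To illustrate the difference, here is a minimal Python sketch. It is not Hive's operator code: the function names, the in-memory tables, and the `skewed` key set are all hypothetical, and the sketch assumes the plan splits the join into a skewed-key branch and a remaining-key branch. The first variant gives each branch its own scan of every table; the second scans each table once and feeds both branches from that single pass.

```python
from collections import defaultdict


def scan(table, scans, name):
    """Yield rows of an in-memory table, counting each pass over it."""
    scans[name] += 1
    yield from table


def hash_join(left_rows, right_rows):
    """Join two iterables of (key, value) rows on the key."""
    index = defaultdict(list)
    for key, value in right_rows:
        index[key].append(value)
    return [(k, lv, rv) for k, lv in left_rows for rv in index.get(k, ())]


def join_with_replicated_scans(a, b, skewed, scans):
    """Each branch has its own scan, so every table is read twice."""
    hot = hash_join(
        (r for r in scan(a, scans, "a") if r[0] in skewed),
        (r for r in scan(b, scans, "b") if r[0] in skewed),
    )
    cold = hash_join(
        (r for r in scan(a, scans, "a") if r[0] not in skewed),
        (r for r in scan(b, scans, "b") if r[0] not in skewed),
    )
    return hot + cold


def split(rows, skewed):
    """Route rows from one scan to two children: skewed and remaining keys."""
    hot, cold = [], []
    for row in rows:
        (hot if row[0] in skewed else cold).append(row)
    return hot, cold


def join_with_shared_scans(a, b, skewed, scans):
    """One scan per table feeds both branches."""
    a_hot, a_cold = split(scan(a, scans, "a"), skewed)
    b_hot, b_cold = split(scan(b, scans, "b"), skewed)
    return hash_join(a_hot, b_hot) + hash_join(a_cold, b_cold)
```

Both variants return the same joined rows; only the number of passes over each input differs.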
                
> Skewed Join Optimization
> ------------------------
>
>                 Key: HIVE-3086
>                 URL: https://issues.apache.org/jira/browse/HIVE-3086
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Nadeem Moidu
>            Assignee: Namit Jain
>             Fix For: 0.10.0
>
>         Attachments: hive.3086.1.patch, hive.3086.2.patch, hive.3086.3.patch, 
> hive.3086.4.patch, hive.3086.5.patch, hive.3086.6.patch
>
>
> During a join operation, if one of the join keys is heavily skewed, the reducer 
> that handles that key can become the bottleneck. The following feature will 
> address it:
> https://cwiki.apache.org/confluence/display/Hive/Skewed+Join+Optimization
