[ 
https://issues.apache.org/jira/browse/HIVE-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201399#comment-14201399
 ] 

Szehon Ho commented on HIVE-8700:
---------------------------------

If we can push that down to SparkMapJoinResolver, that would be great (it 
depends on whether there are still parents on mapJoinOp at that point, which is 
what it replaces). If not, we might have to do this here as well.
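
To make the parent check concrete, here is a rough sketch of the replacement on a toy operator DAG. Everything in it is an assumption for illustration: Op, ReduceSink, HashTableSink, MapJoin and replaceSmallTableSinks are hypothetical stand-ins, not Hive's real operator classes or the code in the patch. The sketch also keeps the new sink linked to the join, which the real plan may not do.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these are NOT Hive's operator classes.
final class SmallTableSinkSketch {

    /** Minimal stand-in for an operator node in a plan DAG. */
    static class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();

        Op(String name) { this.name = name; }

        /** Wires parent -> child in both directions. */
        static void connect(Op parent, Op child) {
            parent.children.add(child);
            child.parents.add(parent);
        }
    }

    static class ReduceSink extends Op { ReduceSink() { super("RS"); } }

    static class HashTableSink extends Op { HashTableSink() { super("HTS"); } }

    static class MapJoin extends Op {
        final int bigTablePos; // index of the big-table parent, left untouched
        MapJoin(int bigTablePos) { super("MAPJOIN"); this.bigTablePos = bigTablePos; }
    }

    /**
     * Swaps each small-table ReduceSink parent of the join for a HashTableSink,
     * preserving parent order. Returns the number of sinks replaced, which is 0
     * when the join has no parents left to replace.
     */
    static int replaceSmallTableSinks(MapJoin join) {
        int replaced = 0;
        for (int pos = 0; pos < join.parents.size(); pos++) {
            Op parent = join.parents.get(pos);
            if (pos == join.bigTablePos || !(parent instanceof ReduceSink)) {
                continue;
            }
            HashTableSink hts = new HashTableSink();
            // Re-point every grandparent's child link from the old sink to the new one.
            for (Op grandParent : parent.parents) {
                grandParent.children.set(grandParent.children.indexOf(parent), hts);
                hts.parents.add(grandParent);
            }
            hts.children.add(join);
            join.parents.set(pos, hts);
            replaced++;
        }
        return replaced;
    }
}
```

If the join has already lost its parents by the time the resolver runs, the method returns 0 and nothing is rewritten, which is the case where the replacement would have to happen earlier.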

> Replace ReduceSink to HashTableSink (or equi.) for small tables [Spark Branch]
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-8700
>                 URL: https://issues.apache.org/jira/browse/HIVE-8700
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Suhas Satish
>         Attachments: HIVE-8700-spark.patch, HIVE-8700.2-spark.patch, 
> HIVE-8700.3-spark.patch, HIVE-8700.patch
>
>
> With HIVE-8616 enabled, the new plan has ReduceSinkOperator for the small 
> tables. For example, the following represents the operator plan for the small 
> table dec1 derived from the query {code}explain select /*+ MAPJOIN(dec)*/ * from 
> dec join dec1 on dec.value=dec1.d;{code}
> {code}
>         Map 2 
>             Map Operator Tree:
>                 TableScan
>                   alias: dec1
>                   Statistics: Num rows: 0 Data size: 107 Basic stats: PARTIAL Column stats: NONE
>                   Filter Operator
>                     predicate: d is not null (type: boolean)
>                     Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                     Reduce Output Operator
>                       key expressions: d (type: decimal(5,2))
>                       sort order: +
>                       Map-reduce partition columns: d (type: decimal(5,2))
>                       Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
>                       value expressions: i (type: int)
> {code}
> With the new design for broadcasting small tables, we need to replace the 
> ReduceSinkOperator with HashTableSinkOperator or an equivalent in the new plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
