[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

Rui Li (JIRA) Wed, 07 Jan 2015 04:36:13 -0800

     [ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Rui Li updated HIVE-9251:
-------------------------
    Attachment: HIVE-9251.3-spark.patch

Addressed RB comments and updated golden files. Some notes about the reducer 
count changes:
* Most queries changed from 3 to 2. We use the total number of cores to set the 
number of reducers. Previously we counted the driver as an executor, so we 
counted one more executor than we really have. Actually neither count is 
correct: we in fact have 4 cores (2 executors, each with 2 cores), but we 
can't get the cores-per-executor info.
* Some queries changed from 3 to 1. That's because {{hive.exec.reducers.max}} 
is set to 1 but we previously didn't respect it.
* Some queries need deterministic results and that's tracked by HIVE-9290.
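
The reducer count changes above can be sketched roughly as follows. This is 
only an illustration of the arithmetic, not the actual SetSparkReducerParallelism 
code: the function name and its parameters are hypothetical, 999 is just an 
arbitrary large cap, and the sketch assumes one core per executor because the 
cores-per-executor info is not available.

```python
def estimate_reducers(executors, max_reducers, count_driver=False, respect_max=True):
    """Hypothetical helper: derive a reducer count from the executor count.

    Assumes one core per executor, since cores per executor is unknown.
    """
    # The old behaviour counted the driver as one extra executor.
    slots = executors + (1 if count_driver else 0)
    # The new behaviour respects the configured cap (hive.exec.reducers.max).
    if respect_max:
        slots = min(slots, max_reducers)
    # Always use at least one reducer.
    return max(1, slots)


# Old behaviour with 2 executors: driver counted, cap ignored.
old = estimate_reducers(2, max_reducers=1, count_driver=True, respect_max=False)
# New behaviour with a large cap.
new = estimate_reducers(2, max_reducers=999)
# New behaviour when the cap is set to 1.
capped = estimate_reducers(2, max_reducers=1)
print(old, new, capped)  # 3 2 1
```

With the real core count (2 executors x 2 cores) the uncapped result would be 4 
rather than 2, which is why neither the old nor the new count is really correct.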

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-9251
>                 URL: https://issues.apache.org/jira/browse/HIVE-9251
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, 
> HIVE-9251.3-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> netty-based shuffle limits the maximum frame size to 2G.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
