[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rui Li updated HIVE-9251:
-------------------------
    Attachment: HIVE-9251.3-spark.patch

Addressed RB comments and updated golden files.
Some notes about the reducer count changes:
* Most queries changed from 3 to 2. We use the total number of cores to set the number of reducers here. Previously we counted the driver as an executor, so we had one more executor than we really have. Actually, neither count is correct: we in fact have 4 cores (2 executors, each with 2 cores), but we can't get the cores-per-executor info.
* Some queries changed from 3 to 1. That's because {{hive.exec.reducers.max}} is set to 1, which we previously didn't respect.
* Some queries need deterministic results; that's tracked by HIVE-9290.

> SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-9251
>                 URL: https://issues.apache.org/jira/browse/HIVE-9251
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch,
>                      HIVE-9251.3-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's
> netty-based shuffle limits the maximum frame size to 2 GB.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
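
The reducer counts in the notes above (3, 2, 4 and 1) can be reproduced with a small model. This is only an illustrative sketch, not the code in the patch: the function `estimate_reducers` and its parameters are hypothetical, and it assumes one reducer per counted core with an optional upper bound standing in for {{hive.exec.reducers.max}}. It ignores any data-size based estimate.

```python
def estimate_reducers(executor_count, cores_per_executor=1, max_reducers=None):
    """Hypothetical model: one reducer per counted core, optionally capped.

    cores_per_executor defaults to 1 to mimic the situation described in the
    notes, where the cores-per-executor info is not available.
    max_reducers stands in for hive.exec.reducers.max (assumption).
    """
    total_cores = executor_count * cores_per_executor
    reducers = max(1, total_cores)  # never go below one reducer
    if max_reducers is not None and max_reducers > 0:
        reducers = min(reducers, max_reducers)  # respect the configured cap
    return reducers


# Driver miscounted as an executor: 2 real executors + 1.
old_count = estimate_reducers(executor_count=3)
# Only the 2 real executors are counted.
new_count = estimate_reducers(executor_count=2)
# What the count would be if cores per executor (2) were known.
ideal_count = estimate_reducers(executor_count=2, cores_per_executor=2)
# Cap of 1 applied on top of the 2-executor count.
capped_count = estimate_reducers(executor_count=2, max_reducers=1)

print(old_count, new_count, ideal_count, capped_count)
```

Under these assumptions the four cases give 3, 2, 4 and 1 reducers, matching the changes seen in the golden files and the 4-core figure mentioned in the first note.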