[ https://issues.apache.org/jira/browse/HIVE-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650174#comment-14650174 ]

Lefty Leverenz commented on HIVE-11363:
---------------------------------------

Doc note:  This changes the descriptions of two configuration parameters 
(*hive.prewarm.enabled* and *hive.prewarm.numcontainers*) in HiveConf.java, 
removing the words "for Tez" -- the parameters are currently documented in the 
Tez section of Configuration Properties.

Question:  Should they be kept in the Tez section and also added to the Spark 
section?  (Alternatively, they could go in the general section with 
cross-references in the Tez and Spark sections.)

* [Configuration Properties -- Tez -- hive.prewarm.enabled | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.prewarm.enabled]
* [Configuration Properties -- Tez -- hive.prewarm.numcontainers | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.prewarm.numcontainers]
* [Configuration Properties -- Spark | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Spark]

They should also be documented in Hive on Spark: Getting Started.

* [Hive on Spark: Getting Started -- Configuring Hive | 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive]

However, when the Spark branch was merged to master (7/30/2015), the commit for 
this issue seems to have used an earlier patch rather than patch 5 -- it creates 
two new parameters (*hive.spark.prewarm.containers* & 
*hive.spark.prewarm.num.containers*).  That needs to be sorted out.  See 
HIVE-10166 and commit 537114b964c71b7a5cd00c9938eadc6d0cf76536.
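For reference, configuring prewarming with the parameter names described in 
HiveConf on master would look something like the following hive-site.xml 
fragment (the values shown are illustrative only, not from this patch):

```xml
<!-- Illustrative hive-site.xml sketch; parameter names are the HiveConf
     names discussed above, and the values are example choices only -->
<property>
  <name>hive.prewarm.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- How many containers/executors to wait for before running the job -->
  <name>hive.prewarm.numcontainers</name>
  <value>10</value>
</property>
```

The same settings can also be issued per session with SET commands.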

> Prewarm Hive on Spark containers [Spark Branch]
> -----------------------------------------------
>
>                 Key: HIVE-11363
>                 URL: https://issues.apache.org/jira/browse/HIVE-11363
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 1.1.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>              Labels: TODOC-SPARK
>             Fix For: spark-branch
>
>         Attachments: HIVE-11363.1-spark.patch, HIVE-11363.2-spark.patch, 
> HIVE-11363.3-spark.patch, HIVE-11363.4-spark.patch, HIVE-11363.5-spark.patch
>
>
> When a Hive job is launched by Oozie, a Hive session is created and the job 
> script is executed. The session is closed when the Hive job completes. Thus, a 
> Hive session is not shared among Hive jobs, either within an Oozie workflow or 
> across workflows. Since the parallelism of a Hive job executed on Spark depends 
> on the available executors, such Hive jobs suffer the executor ramp-up 
> overhead. The idea here is to wait a bit so that enough executors can come up 
> before the job is executed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
