Brock Noland created HIVE-7768:
----------------------------------

             Summary: Research growing/shrinking our Spark Application
                 Key: HIVE-7768
                 URL: https://issues.apache.org/jira/browse/HIVE-7768
             Project: Hive
          Issue Type: Sub-task
            Reporter: Brock Noland


Similar to Tez, it's likely our "SparkContext" (SC) is going to be long-lived
and process many queries. Some queries will be large and some small.
Additionally, the SC might be idle for long periods of time.

In this JIRA we will research the following:

* How Spark decides the number of slaves for a given RDD today
* Given an SC, when you create a new RDD based on a much larger input dataset,
does the SC adjust its number of slaves?
* How Tez increases/decreases the size of the running YARN application (set of 
slaves)
* How Tez handles the scenario where it has a running set of slaves in YARN,
requests more resources for a query, and fails to get them
* How Tez decides to time out idle slaves

This research will guide the requirements we'll need from YARN.



