[
https://issues.apache.org/jira/browse/HIVE-7436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068253#comment-14068253
]
Chengxiang Li commented on HIVE-7436:
-------------------------------------
[~xuefuz] Thanks for the comments. For the first question, default
master/appname values should be added in case spark-defaults.conf is missing;
I'll update the patch later.
{quote}
Second question: would user be able to set or change the spark configuration
via hive's set command? I guess not, but I'd like to hear your thought.
{quote}
Here are some thoughts about this:
# Spark configuration is set at the application level, which means users can
not reset Spark configuration dynamically while a Spark application is running
(a Spark application's lifecycle is roughly the same as the lifecycle of its
SparkContext instance); see the sketch at the end of this comment.
# Changing Spark configuration via Hive's set command would mean that the
Spark jobs representing different Hive queries must be submitted through
different Spark applications.
# Currently the Hive driver runs queries in the same Spark application
(singleton SparkClient => singleton SparkContext).
So this question mostly depends on another one: should the Hive driver submit
queries in a single, long-lived Spark application, or create a separate Spark
application for each query?
# For a singleton Spark application: little submission cost, but cluster
resources are fixed for the whole Hive driver lifecycle.
# For a separate Spark application per query: more submission cost (config
loading, dependency transfer, cluster resource allocation), but resources can
be requested dynamically for each query.
Shark uses a singleton Spark application, so it is not resource efficient: it
cannot dynamically adjust assigned resources as required. What do you think
about this?
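A minimal sketch of point 1 above (not from the patch; the master/app name
values are placeholders): configuration is captured when the SparkContext is
created, so later changes do not affect the running application.
{code:java}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ConfAtStartup {
  public static void main(String[] args) {
    // Configuration is captured when the SparkContext is constructed.
    SparkConf conf = new SparkConf()
        .setMaster("local")              // placeholder master
        .setAppName("Hive on Spark");    // placeholder application name
    JavaSparkContext sc = new JavaSparkContext(conf);

    // Modifying the SparkConf afterwards does not reconfigure the running
    // context (the SparkContext keeps its own copy), which is why a Hive
    // "set" command cannot change Spark configuration for a long-lived
    // application.
    conf.set("spark.executor.memory", "4g");

    sc.stop();
  }
}
{code}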
> Load Spark configuration into Hive driver
> -----------------------------------------
>
> Key: HIVE-7436
> URL: https://issues.apache.org/jira/browse/HIVE-7436
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Chengxiang Li
> Assignee: Chengxiang Li
> Attachments: HIVE-7436-Spark.1.patch
>
>
> Load Spark configuration into the Hive driver. There are 3 ways to set up
> Spark configuration:
> # Properties in the Spark configuration file (spark-defaults.conf).
> # Java system properties.
> # System environment variables.
> Spark supports configuration through system environment variables only for
> compatibility with previous scripts; we won't support that in Hive on Spark.
> Hive on Spark loads defaults from Java properties, then loads properties from
> the configuration file and overrides any existing properties.
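> A rough sketch of that override order (the loadSparkProperties helper and the
> file path argument are illustrative, not actual patch code):
> {code:java}
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.util.Properties;
>
> public class SparkConfLoader {
>   // Defaults come from JVM system properties (-Dspark.*); values from
>   // spark-defaults.conf then override any property present in both.
>   public static Properties loadSparkProperties(String confFile) throws IOException {
>     Properties props = new Properties();
>     for (String name : System.getProperties().stringPropertyNames()) {
>       if (name.startsWith("spark.")) {
>         props.setProperty(name, System.getProperty(name));
>       }
>     }
>     Properties fileProps = new Properties();
>     try (FileInputStream in = new FileInputStream(confFile)) {
>       // spark-defaults.conf uses whitespace-separated key/value pairs,
>       // which java.util.Properties accepts.
>       fileProps.load(in);
>     }
>     props.putAll(fileProps);   // configuration file overrides java properties
>     return props;
>   }
> }
> {code}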
> Configuration steps:
> # Create spark-defaults.conf and place it in the /etc/spark/conf
> configuration directory (an illustrative example follows these steps).
> Please refer to [http://spark.apache.org/docs/latest/configuration.html]
> for the configuration of spark-defaults.conf.
> # Set the $SPARK_CONF_DIR environment variable to the directory containing
> spark-defaults.conf:
> export SPARK_CONF_DIR=/etc/spark/conf
> # Add $SPARK_CONF_DIR to the $HADOOP_CLASSPATH environment variable:
> export HADOOP_CLASSPATH=$SPARK_CONF_DIR:$HADOOP_CLASSPATH
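> An illustrative spark-defaults.conf (property names are from the Spark
> configuration docs; the values are placeholders only):
> {code}
> spark.master            local
> spark.app.name          Hive on Spark
> spark.executor.memory   1g
> spark.serializer        org.apache.spark.serializer.KryoSerializer
> {code}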
> NO PRECOMMIT TESTS. This is for spark-branch only.
--
This message was sent by Atlassian JIRA
(v6.2#6252)