[ https://issues.apache.org/jira/browse/HIVE-7436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068253#comment-14068253 ]
Chengxiang Li commented on HIVE-7436:
-------------------------------------

[~xuefuz] Thanks for the comments. For the first question, default master/appname values should be added in case spark-defaults.conf is missing; I'll update the patch later.
{quote}
Second question: would user be able to set or change the spark configuration via hive's set command? I guess not, but I'd like to hear your thought.
{quote}
Here are some thoughts about this:
# Spark configuration is set at the application level, which means a user cannot reset Spark configuration dynamically while a Spark application is running. (A Spark application's lifecycle is roughly the same as the lifecycle of its SparkContext instance.)
# Changing Spark configuration via Hive's set command would therefore mean that the Spark jobs representing different Hive queries must be submitted as different Spark applications.
# Currently the Hive driver runs all queries in the same Spark application (singleton SparkClient => singleton SparkContext). So this question mostly depends on another one: should the Hive driver submit queries in a singleton Spark application, or create a separate Spark application for each query?
# Singleton Spark application: little submission cost, but cluster resources are fixed for the whole Hive driver lifecycle.
# Separate Spark application per query: more submission cost (config loading, dependency transfer, cluster resource allocation), but resources can be allocated dynamically for each query.
Shark uses a singleton Spark application, so it is not resource efficient, as it cannot dynamically adjust the assigned resources as required. What do you think about this? (See the sketches after the quoted issue description below.)

> Load Spark configuration into Hive driver
> -----------------------------------------
>
>                 Key: HIVE-7436
>                 URL: https://issues.apache.org/jira/browse/HIVE-7436
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>         Attachments: HIVE-7436-Spark.1.patch
>
>
> Load Spark configuration into the Hive driver. There are 3 ways to set up Spark configuration:
> # Configure properties in the Spark configuration file (spark-defaults.conf).
> # Java properties.
> # System environment variables.
> Spark supports configuration through the system environment only for compatibility with previous scripts; we won't support it in Hive on Spark. Hive on Spark loads defaults from Java properties, then loads properties from the configuration file, overriding existing properties.
> Configuration steps:
> # Create spark-defaults.conf and place it in the /etc/spark/conf configuration directory.
> Please refer to [http://spark.apache.org/docs/latest/configuration.html] for the configuration of spark-defaults.conf.
> # Create the $SPARK_CONF_DIR environment variable and set it to the location of spark-defaults.conf.
> export SPARK_CONF_DIR=/etc/spark/conf
> # Add $SPARK_CONF_DIR to the $HADOOP_CLASSPATH environment variable.
> export HADOOP_CLASSPATH=$SPARK_CONF_DIR:$HADOOP_CLASSPATH
> NO PRECOMMIT TESTS. This is for spark-branch only.
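As an illustration of the loading order described in the issue description (defaults from Java system properties, then spark-defaults.conf overriding them, plus the fallback master/appname mentioned in the comment when the file is missing), here is a minimal Java sketch. The class name, the classpath lookup, and the fallback values ("local", "Hive on Spark") are illustrative assumptions, not taken from the attached patch.
{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.Properties;

import org.apache.spark.SparkConf;

public class SparkConfLoaderSketch {

  public static SparkConf loadSparkConf() throws IOException {
    Properties props = new Properties();

    // Step 1: defaults from Java system properties (only spark.* keys matter here).
    for (Map.Entry<Object, Object> e : System.getProperties().entrySet()) {
      String key = e.getKey().toString();
      if (key.startsWith("spark.")) {
        props.setProperty(key, e.getValue().toString());
      }
    }

    // Step 2: spark-defaults.conf found on the classpath ($SPARK_CONF_DIR was added
    // to HADOOP_CLASSPATH), overriding the properties from step 1.
    try (InputStream in = SparkConfLoaderSketch.class.getClassLoader()
        .getResourceAsStream("spark-defaults.conf")) {
      if (in != null) {
        props.load(in);
      }
    }

    // Step 3: fallback master/appname in case spark-defaults.conf is missing
    // (illustrative values, not the ones the patch will use).
    if (!props.containsKey("spark.master")) {
      props.setProperty("spark.master", "local");
    }
    if (!props.containsKey("spark.app.name")) {
      props.setProperty("spark.app.name", "Hive on Spark");
    }

    // false = do not re-read system properties; they were already merged above.
    SparkConf conf = new SparkConf(false);
    for (String name : props.stringPropertyNames()) {
      conf.set(name, props.getProperty(name));
    }
    return conf;
  }
}
{code}
java.util.Properties accepts whitespace-separated "key value" pairs like those in spark-defaults.conf, so a plain Properties.load() is enough for this sketch.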
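And a minimal sketch of the singleton-application model discussed in the comment (singleton SparkClient => singleton SparkContext): one client per Hive driver holding a single JavaSparkContext, so the SparkConf is fixed once and every query reuses the same application and the same fixed resources. Class names are hypothetical; this is not the actual SparkClient implementation.
{code:java}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkClientSketch {

  private static volatile SparkClientSketch instance;

  private final JavaSparkContext sc;

  private SparkClientSketch(SparkConf conf) {
    // Creating the context starts the Spark application; from this point on
    // the configuration is fixed for the lifetime of the Hive driver.
    this.sc = new JavaSparkContext(conf);
  }

  // One Spark application per Hive driver: every query reuses this instance.
  public static SparkClientSketch getInstance(SparkConf conf) {
    if (instance == null) {
      synchronized (SparkClientSketch.class) {
        if (instance == null) {
          instance = new SparkClientSketch(conf);
        }
      }
    }
    return instance;
  }

  public JavaSparkContext getSparkContext() {
    return sc;
  }

  // Stopping the context ends the Spark application and releases its resources.
  public void close() {
    sc.stop();
  }
}
{code}
The per-query alternative would instead build a fresh SparkConf and SparkContext for each query, paying the submission cost each time but allowing resources to be sized per query.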