I have about 20 environment variables to pass to my Spark workers. Even though they're in the init scripts on the Linux box, the workers don't see these variables.
Does Spark do something to shield itself from what may be defined in the environment? The information I've found on passing env vars into workers seems dated and/or unclear.

Here: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-pass-config-variables-to-workers-tt5780.html

    SparkConf conf = new SparkConf();
    conf.set("spark.myapp.myproperty", "propertyValue");

OR set them in spark-defaults.conf, as in:

    spark.config.one value
    spark.config.two value2

In another posting, http://apache-spark-user-list.1001560.n3.nabble.com/How-to-set-environment-variable-for-a-spark-job-tt3180.html:

    conf.setExecutorEnv("ORACLE_HOME", myOraHome)
    conf.setExecutorEnv("SPARK_JAVA_OPTS", "-Djava.library.path=/my/custom/path")

The configuration guide describes "spark.executorEnv.[EnvironmentVariableName]" as follows: "Add the environment variable specified by EnvironmentVariableName to the Executor process. The user can specify multiple of these to set multiple environment variables." Then there are mentions of SPARK_JAVA_OPTS, which seems to be deprecated (?).

What is the easiest/cleanest approach here? Ideally, I'd rather not burden my driver program with explicit knowledge of all the env vars that are needed on the worker side. I'd also like to avoid jamming them into spark-defaults.conf, since they're already set in the system init scripts -- why duplicate them?

I suppose one approach would be to namespace all my vars with a well-known prefix, then cycle through the environment in the driver and stuff all these variables into the Spark context. If I'm doing that, would I want to

    conf.set("spark.myapp.myproperty", "propertyValue");

(and is the "spark." prefix necessary, or was that just part of the example?) or would I want to

    conf.setExecutorEnv("MYPREFIX_MY_VAR_1", "some-value");

Thanks.
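Concretely, the prefix-cycling idea I have in mind would look something like the sketch below. MYPREFIX_ is just a placeholder naming convention, and the SparkConf wiring is left in comments so the snippet compiles and runs without Spark on the classpath:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the prefix-cycling idea: pick out every driver-side env var
// that starts with a well-known prefix and forward it to the executors.
// MYPREFIX_ is a placeholder; the SparkConf lines are commented out so
// this compiles without Spark on the classpath.
public class ForwardEnvVars {
    static final String PREFIX = "MYPREFIX_";

    // Collect only the namespaced variables from an environment map.
    static Map<String, String> prefixedVars(Map<String, String> env) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> toForward = prefixedVars(System.getenv());
        // SparkConf conf = new SparkConf().setAppName("myapp");
        // for (Map.Entry<String, String> e : toForward.entrySet()) {
        //     // executors would then see each of these as an env var
        //     conf.setExecutorEnv(e.getKey(), e.getValue());
        // }
        toForward.forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

That way the driver only knows the prefix, not the individual variable names.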
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/What-is-a-best-practice-for-passing-environment-variables-to-Spark-workers-tp23751.html Sent from the Apache Spark User List mailing list archive at Nabble.com.