I have about 20 environment variables to pass to my Spark workers. Even
though they're in the init scripts on the Linux box, the workers don't see
these variables.

Does Spark do something to shield itself from what may be defined in the
environment?

I've seen multiple pieces of advice on how to pass env vars into workers, but
they seem dated and/or unclear.

Here:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-pass-config-variables-to-workers-tt5780.html

SparkConf conf = new SparkConf();
conf.set("spark.myapp.myproperty", "propertyValue");

or set them in spark-defaults.conf, as in:

spark.config.one    value
spark.config.two    value2

In another posting,
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-set-environment-variable-for-a-spark-job-tt3180.html:
conf.setExecutorEnv("ORACLE_HOME", myOraHome)
conf.setExecutorEnv("SPARK_JAVA_OPTS", "-Djava.library.path=/my/custom/path")

The configuration guide talks about
"spark.executorEnv.[EnvironmentVariableName] -- Add the environment variable
specified by EnvironmentVariableName to the Executor process. The user can
specify multiple of these to set multiple environment variables."
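If I understand the guide correctly, that same spark.executorEnv.* route can
also be used at submit time, without touching the driver code at all. A rough
sketch of what I mean (the paths, variable names, and class name here are all
made up):

```shell
# Hypothetical example: forward two env vars to the executor processes
# via --conf spark.executorEnv.<NAME>=<value> at submit time.
spark-submit \
  --conf spark.executorEnv.ORACLE_HOME=/opt/oracle \
  --conf spark.executorEnv.MYPREFIX_MY_VAR_1=some-value \
  --class com.example.MyApp \
  myapp.jar
```

Though that still means enumerating every variable somewhere, which is what
I'm trying to avoid.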

Then there are mentions of SPARK_JAVA_OPTS, which appears to be deprecated (in
favor of spark.executor.extraJavaOptions, if I'm reading things right?).

What is the easiest/cleanest approach here?  Ideally, I'd rather not burden
my driver program with explicit knowledge of all the env vars that are needed
on the worker side.  I'd also like to avoid jamming them into
spark-defaults.conf, since they're already set in the system init scripts;
why duplicate them?

I suppose one approach would be to namespace all my vars with a well-known
prefix, then cycle through the environment in the driver and stuff the
matching variables into the Spark config.  If I'm doing that, would I want to

conf.set("spark.myapp.myproperty", "propertyValue");

and is the "spark." prefix necessary, or was that just part of the example?

Or would I want to

conf.setExecutorEnv("MYPREFIX_MY_VAR_1", "some-value");
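To make the prefix idea concrete, here's a minimal sketch of the
scan-and-forward loop I have in mind. The MYPREFIX_ prefix and the helper
name are my own invention, and the setExecutorEnv calls are shown in a
comment since they need a live SparkConf:

```java
import java.util.HashMap;
import java.util.Map;

public class ForwardEnv {
    // Return the subset of env whose keys start with the given prefix.
    static Map<String, String> withPrefix(Map<String, String> env, String prefix) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // In the driver, you'd scan the real environment and forward each match:
        //   for (Map.Entry<String, String> e :
        //           withPrefix(System.getenv(), "MYPREFIX_").entrySet()) {
        //       conf.setExecutorEnv(e.getKey(), e.getValue());  // conf is a SparkConf
        //   }
        // Demonstrated here with a fake environment map:
        Map<String, String> fake = new HashMap<>();
        fake.put("MYPREFIX_MY_VAR_1", "some-value");
        fake.put("PATH", "/usr/bin");
        System.out.println(withPrefix(fake, "MYPREFIX_"));
        // prints {MYPREFIX_MY_VAR_1=some-value}
    }
}
```

That keeps the driver ignorant of the individual variable names; it only
needs to know the prefix convention.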

Thanks.

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/What-is-a-best-practice-for-passing-environment-variables-to-Spark-workers-tp23751.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
