We were previously using SPARK_JAVA_OPTS to set Java system properties via
-D. We used it for properties that varied per deployment environment but
needed to be available in both the Spark shell and the workers.
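For context, the old setup was roughly this (a sketch; foo.bar.baz is just
the placeholder property used in the examples below):

$ cat conf/spark-env.sh
SPARK_JAVA_OPTS="-Dfoo.bar.baz=23"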
On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been deprecated in
favor of spark-defaults.conf and command-line arguments to spark-submit or
spark-shell.
However, setting spark.driver.extraJavaOptions and
spark.executor.extraJavaOptions in spark-defaults.conf is not a replacement
for SPARK_JAVA_OPTS:
$ cat conf/spark-defaults.conf
spark.driver.extraJavaOptions=-Dfoo.bar.baz=23
$ ./bin/spark-shell
scala> System.getProperty("foo.bar.baz")
res0: String = null
$ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"
scala> System.getProperty("foo.bar.baz")
res0: String = 23
Looking through the shell scripts for spark-submit and spark-class, I can
see why this is: parsing spark-defaults.conf from bash could be brittle. But
from an ergonomic point of view, it's a step back to go from a
set-it-and-forget-it configuration in spark-env.sh to having to pass
command-line arguments on every invocation.
I can work around this with an ad-hoc script that wraps spark-shell and
passes the appropriate arguments, but I wanted to raise the issue to see
whether anyone else has run into it, or has a direction for a more general
solution (beyond parsing Java properties files from bash). A rough sketch of
the kind of wrapper I mean is below.
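This is only a sketch under some assumptions: the properties file path and
name are hypothetical, and it only handles plain key=value lines (no
whitespace-separated pairs, continuations, or escaping).

$ cat spark-shell-wrapped
#!/usr/bin/env bash
# Read simple key=value pairs from a properties file and pass them to
# spark-shell as -D system properties via --driver-java-options.
# Assumes SPARK_HOME points at the Spark installation.

# Hypothetical location for the per-environment properties.
PROPS_FILE="${SPARK_HOME}/conf/java-opts.properties"

JAVA_OPTS=""
while IFS='=' read -r key value; do
  # Skip blank lines and comments.
  [[ -z "$key" || "$key" == \#* ]] && continue
  JAVA_OPTS="$JAVA_OPTS -D${key}=${value}"
done < "$PROPS_FILE"

# Forward any extra arguments to spark-shell unchanged.
exec "${SPARK_HOME}/bin/spark-shell" --driver-java-options "$JAVA_OPTS" "$@"

This only covers the driver side of spark-shell; anything the executors need
would still have to go through spark.executor.extraJavaOptions.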