Hi Spark users,

In past Spark releases I always had to add jars in multiple places when using the spark-shell, and I'm looking to cut down on that duplication. The --jars option looks like it does what I want, but it doesn't work. I did a quick experiment on the latest branch-1.0 and found this:
*# 0) jar not added anywhere*

./bin/spark-shell --master spark://aash-mbp.local:7077
spark> import org.joda.time.DateTime
[fails -- expected because the .jar isn't anywhere]

*# 1) just --jars*

./bin/spark-shell --master spark://aash-mbp.local:7077 --jars /tmp/joda-time-2.3.jar
spark> import org.joda.time.DateTime
[fails -- but might work on non-standalone clusters?]

*# 2) using --jars and sc.addJar()*

./bin/spark-shell --master spark://aash-mbp.local:7077 --jars /tmp/joda-time-2.3.jar
spark> sc.addJar("/tmp/joda-time-2.3.jar")
spark> import org.joda.time.DateTime
[fails -- shouldn't sc.addJar() make imports possible?]

*# 3) just --driver-class-path*

./bin/spark-shell --master spark://aash-mbp.local:7077 --driver-class-path /tmp/joda-time-2.3.jar
spark> import org.joda.time.DateTime
spark> new DateTime()
res0: org.joda.time.DateTime = 2014-05-29T11:10:56.745-07:00
spark> sc.parallelize(1 to 10).map(k => new DateTime()).collect
[fails -- expected because jar wasn't ever sent to executors, only driver]

*# 4) using --driver-class-path and sc.addJar()*

./bin/spark-shell --master spark://aash-mbp.local:7077 --driver-class-path /tmp/joda-time-2.3.jar
spark> import org.joda.time.DateTime
spark> sc.addJar("/tmp/joda-time-2.3.jar")
spark> new DateTime()
res0: org.joda.time.DateTime = 2014-05-29T11:10:56.745-07:00
spark> sc.parallelize(1 to 10).map(k => new DateTime()).collect
[success!]

Looking at the documentation for --jars, it looks like --jars doesn't work with standalone in cluster deployment mode. Here are the relevant doc entries:

  --jars JARS            A comma-separated list of local jars to include on the
                         driver classpath and that SparkContext.addJar will work
                         with. Doesn't work on standalone with 'cluster' deploy mode.
  --driver-class-path    Extra class path entries to pass to the driver. Note that
                         jars added with --jars are automatically included in the
                         classpath.
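
In case it helps anyone narrow down which side (driver or executors) is missing the jar in each case above, here is a rough sketch of a check that could be pasted into the shell. The helper name classVisible is made up for illustration and is not part of Spark. It only tests whether a class can be loaded at runtime through the thread's context class loader, so it says nothing about whether the REPL compiler can resolve an import, and I'm assuming the context class loader is the relevant one on the executors.

```scala
import scala.util.Try

// Hypothetical helper (not a Spark API): returns true if `name` can be
// loaded via the current thread's context class loader, without
// initializing the class.
def classVisible(name: String): Boolean =
  Try(Class.forName(name, false, Thread.currentThread().getContextClassLoader)).isSuccess

// Driver-side check:
//   classVisible("org.joda.time.DateTime")
//
// Executor-side check (needs the shell's `sc`; yields one Boolean per element):
//   sc.parallelize(1 to 10).map(_ => classVisible("org.joda.time.DateTime")).collect()
```

The two usage lines are left as comments so the snippet can be pasted without a running cluster.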
For the --jars comment about not working with standalone, is this something that can be fixed to make the "1) just --jars" path above work? Or is there some larger architectural reason that --jars can't work with standalone mode?

Appreciate it!
Andrew