Hello, I am prototyping a change in the behavior of the spark.jars conf for my use case. spark.jars is used to specify a comma-separated list of jars to include on the driver and executor classpaths.
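For reference, here is a minimal, hypothetical example of setting this conf programmatically (the jar paths are made up; the same value can equivalently be passed via --conf spark.jars=... on spark-submit):

    import org.apache.spark.sql.SparkSession

    // spark.jars takes a comma-separated list of jars to include on the
    // driver and executor classpaths; these paths are hypothetical.
    val spark = SparkSession.builder()
      .appName("spark-jars-example")
      .config("spark.jars", "/somepath/sample-jar-2.0.0.jar,/somepath/new-jar-1.0.0.jar")
      .getOrCreate()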
*Current behavior:* The spark.jars conf value is not read until after the JVM has already started and the system classloader has already loaded, so the jars added via this conf get "appended" to the Spark classpath. This means Spark looks for a class in its default classpath first, and only afterwards in the paths specified by spark.jars.

*Proposed prototype:* I am proposing a new behavior in which spark.jars takes precedence over the Spark default classpath in terms of how jars are discovered. This can be achieved through the spark.{driver,executor}.extraClassPath conf. That conf modifies the actual launch command of the driver (or executors), so its entries are "prepended" to the classpath and thus take precedence over the default classpath. Could the behavior of spark.jars be changed by merging the conf value of spark.jars into the conf value of spark.{driver,executor}.extraClassPath during argument parsing in SparkSubmitArguments.scala <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L151>, so that we achieve the precedence order: jars specified in spark.jars > spark.{driver,executor}.extraClassPath > Spark default classpath (left-to-right precedence order)?

*Pseudo sample code:* In loadEnvironmentArguments() <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L151>; note that the jar list is prepended rather than appended, so that spark.jars wins over any existing extraClassPath entries, and that the comma-separated jar list is converted to the path-separator format that extraClassPath expects:

    if (jars != null) {
      // spark.jars is comma-separated; classpath entries use File.pathSeparator.
      val jarsAsClassPath = jars.split(",").mkString(java.io.File.pathSeparator)
      driverExtraClassPath = if (driverExtraClassPath != null) {
        jarsAsClassPath + java.io.File.pathSeparator + driverExtraClassPath
      } else {
        jarsAsClassPath
      }
    }

*As an example*, consider these jars:

- sample-jar-1.0.0.jar, present in Spark's default classpath
- sample-jar-2.0.0.jar, present on all nodes of the cluster at path /<somepath>/
- new-jar-1.0.0.jar, present on all nodes of the cluster at path /<somepath>/ (and not in the Spark default classpath)

and two scenarios, in which two Spark jobs are submitted with the following jars conf values (see the attached screenshot):
<http://apache-spark-developers-list.1001551.n3.nabble.com/file/t3705/Capture.png>

What are your thoughts on this? Could this have any undesired side effects? Or has this already been explored, and are there known issues with this approach?

Thanks,
Nupur
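P.S. For checking the resulting precedence experimentally, here is a minimal sketch of my own (not part of the prototype) that can be pasted into spark-shell; com.example.Sample is a hypothetical class assumed to be present in more than one of the jars above:

    // The URL of the loaded .class resource reveals which jar on the
    // classpath actually served the class (prints null if it is absent).
    val resource = "com/example/Sample.class" // hypothetical class
    val url = getClass.getClassLoader.getResource(resource)
    println(s"$resource loaded from: $url")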