Github user sryza commented on a diff in the pull request:

https://github.com/apache/spark/pull/119#discussion_r10507644

--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -130,6 +130,16 @@ class SparkContext(

   val isLocal = (master == "local" || master.startsWith("local["))

+  // Create a classLoader for use by the driver so that jars added via addJar are available to the
+  // driver. Do this before all other initialization so that any thread pools created for this
+  // SparkContext uses the class loader.
+  // Note that this is config-enabled as classloaders can introduce subtle side effects
+  private[spark] val classLoader = if (conf.getBoolean("spark.driver.loadAddedJars", false)) {
+    val loader = new SparkURLClassLoader(Array.empty[URL], this.getClass.getClassLoader)
+    Thread.currentThread.setContextClassLoader(loader)
--- End diff --

Ah, ok, I understand now. In that case, to make things simpler, would it make sense to not load the jars into the current thread and only load them for the SparkContext/executors? Classloader stuff can be confusing to deal with, and keeping it as isolated as possible could make things easier for users. This would also line up a little more with how the MR distributed cache works: jars that get added to it don't become accessible to driver code.
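
To make the isolation idea concrete, here is a rough sketch of one way it could look. It is not code from this pull request: `ScopedLoader` and `withContextClassLoader` are hypothetical names, and a plain `java.net.URLClassLoader` stands in for the PR's `SparkURLClassLoader`. The loader is installed as the context class loader only while a given block runs and is restored afterwards, so the calling thread is left as it was.

```scala
import java.net.{URL, URLClassLoader}

// Hypothetical helper, not part of Spark: runs `body` with `loader` installed
// as the thread's context class loader, then restores the previous loader.
object ScopedLoader {
  def withContextClassLoader[T](loader: ClassLoader)(body: => T): T = {
    val thread = Thread.currentThread
    val previous = thread.getContextClassLoader
    thread.setContextClassLoader(loader)
    try body
    finally thread.setContextClassLoader(previous)
  }
}

object ScopedLoaderDemo {
  def main(args: Array[String]): Unit = {
    val before = Thread.currentThread.getContextClassLoader

    // Stand-in for the loader that would hold jars added via addJar.
    val loader = new URLClassLoader(Array.empty[URL], getClass.getClassLoader)

    // Only code inside this block sees the new loader.
    val seenInside = ScopedLoader.withContextClassLoader(loader) {
      Thread.currentThread.getContextClassLoader
    }

    assert(seenInside eq loader)
    // The calling thread's context class loader is unchanged afterwards.
    assert(Thread.currentThread.getContextClassLoader eq before)
  }
}
```

Under this assumption, SparkContext would apply the loader only to the threads it creates itself, and user driver code would keep whatever context class loader it started with.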