Hmm... this seems to be particular to logging (KafkaRDD.scala:89 in my tree is a log statement). I'd expect KafkaRDD to be loaded from the system class loader - or are you repackaging it in your app?
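
If it helps narrow this down, a small diagnostic along these lines should show which loader defined a given class and which jar it came from. This is only a sketch: LoaderCheck and describe are names made up for illustration, not anything in Spark, and main assumes the classes you name are visible from the thread's context class loader.

```java
import java.security.CodeSource;

// Diagnostic sketch; LoaderCheck and describe are hypothetical names, not part of Spark.
final class LoaderCheck {

    // Reports the defining class loader of cls and the location its bytes were read from.
    static String describe(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader();
        // getClassLoader() returns null for classes defined by the bootstrap loader.
        String loaderName = (loader == null) ? "bootstrap" : loader.getClass().getName();
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();
        return cls.getName() + " <- " + loaderName + " (" + location + ")";
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Resolve each class name through the context class loader, which is the
        // loader SparkSubmit replaces when userClassPathFirst is enabled.
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        for (String name : args) {
            // initialize=false: look the class up without running its static initializers.
            Class<?> cls = Class.forName(name, false, context);
            System.out.println(describe(cls));
        }
    }
}
```

Calling LoaderCheck.describe(org.slf4j.Logger.class) from inside a task and again from code that sits next to KafkaRDD would show whether the two sides are seeing org.slf4j.Logger from different loaders.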

I'd have to investigate more to come up with an accurate explanation here... but it seems that the initialization of the logging system, which happens after SparkSubmit runs and sets the context class loader to be an instance of ChildFirstURLClassLoader, is causing things to blow up. I'll see if I can spend some cycles coming up with a proper explanation (and hopefully a fix or workaround).

For now, you could probably avoid this by not repackaging the logging dependencies in your app.

On Wed, May 20, 2015 at 5:03 AM, Sean Owen <so...@cloudera.com> wrote:
> (Marcelo you might have some insight on this one)
>
> Warning: this may just be because I'm doing something non-standard --
> trying embed Spark in a Java app and feed it all the classpath it
> needs manually. But this was surprising enough I wanted to ask.
>
> I have an app that includes among other things SLF4J. I have set
> spark.{driver,executor}.userClassPathFirst to true. If I run it and
> let it start a Spark job, it quickly fails with:
>
> 2015-05-20 04:35:01,747 WARN TaskSetManager:71 Lost task 0.0 in stage
> 0.0 (TID 0, x.cloudera.com): java.lang.LinkageError: loader constraint
> violation: loader (instance of
> org/apache/spark/util/ChildFirstURLClassLoader) previously initiated
> loading for a different type with name "org/slf4j/Logger"
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at org.apache.spark.util.ChildFirstURLClassLoader.liftedTree1$1(MutableURLClassLoader.scala:74)
> at org.apache.spark.util.ChildFirstURLClassLoader.loadClass(MutableURLClassLoader.scala:73)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at org.apache.spark.streaming.kafka.KafkaRDD.compute(KafkaRDD.scala:89)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
> ...
>
> I can see that this class was loaded from my app JAR:
>
> [Loaded org.slf4j.Logger from
> file:/home/sowen/oryx-batch-2.0.0-SNAPSHOT.jar]
>
> I'm assuming it's also loaded in some Spark classloader.
> Tracing the code, I don't see that it ever gets to consulting any
> other classloader; this happens during its own child-first attempt to
> load the class.
>
> This didn't happen in 1.2, FWIW, when the implementation was
> different, but that's only to say it was different, not correct.
>
> Anyone have thoughts on what this indicates? something to be expected
> or surprising?
>
> I think that disabling userClassPathFirst gets rid of this of course,
> although that may cause other issues later.
>

--
Marcelo