Since upgrading to Spark 1.4, I'm getting a scala.reflect.internal.MissingRequirementError when creating a DataFrame from an RDD. The error references a case class in the application (the RDD's type parameter), which I have verified is present. Items of note:

1) This is running on AWS EMR (YARN). I do not get this error running locally (standalone).
2) Reverting to Spark 1.3.1 makes the problem go away.
3) The jar file containing the referenced class (the app assembly jar) is not listed in the classpath expansion dumped in the error message.
I have seen SPARK-5281, and am guessing that it is the root cause, especially since the code added there shows up in the stack trace. That said, my grasp of Scala reflection isn't strong enough to make sense of the change and say for sure. It certainly looks, though, as if in this scenario the current thread's context classloader may not be what we think it is (given #3 above). Any ideas? (I've appended a diagnostic sketch and a possible workaround at the end of this message.)

App code:

    def registerTable[A <: Product : TypeTag](name: String, rdd: RDD[A])(implicit hc: HiveContext) = {
      val df = hc.createDataFrame(rdd)
      df.registerTempTable(name)
    }

Stack trace:

    scala.reflect.internal.MissingRequirementError: class com....MyClass in JavaMirror with sun.misc.Launcher$AppClassLoader@d16e5d6 of type class sun.misc.Launcher$AppClassLoader with classpath [lots and lots of paths and jars, but not the app assembly jar] not found
      at scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
      at scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
      at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:48)
      at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
      at scala.reflect.internal.Mirrors$RootsBase.staticModuleOrClass(Mirrors.scala:72)
      at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:119)
      at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:21)
      at com.ipcoop.spark.sql.SqlEnv$$typecreator1$1.apply(SqlEnv.scala:87)
      at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231)
      at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231)
      at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:71)
      at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:59)
      at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:28)
      at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:410)
      ... app code ...

P.S. It looks as though I am not the only one facing this issue. A colleague ran into it independently, and it has also been reported here: https://www.mail-archive.com/user@spark.apache.org/msg30302.html
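For what it's worth, here is a minimal diagnostic sketch I plan to run just before the createDataFrame call, to compare the thread context classloader with the loader that defined the application classes. The helper name dumpClassLoaders and the literal "com.example.MyClass" are placeholders, not the real app code:

    // Diagnostic sketch: compare the thread context classloader with the
    // loader that defined the application classes, and try to load the case
    // class through each. "com.example.MyClass" stands in for the real
    // class name.
    def dumpClassLoaders(): Unit = {
      val contextLoader = Thread.currentThread().getContextClassLoader
      val appLoader     = getClass.getClassLoader
      println(s"context classloader: $contextLoader")
      println(s"app classloader:     $appLoader")

      def tryLoad(cl: ClassLoader): String =
        try { Class.forName("com.example.MyClass", false, cl); "OK" }
        catch { case _: ClassNotFoundException => "NOT FOUND" }

      println(s"context loader sees the class: ${tryLoad(contextLoader)}")
      println(s"app loader sees the class:     ${tryLoad(appLoader)}")
    }

If the context loader reports NOT FOUND while the app loader reports OK, that would support the theory above.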
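And here is the workaround I'm considering if that turns out to be the case: pin the context classloader to the application's loader for the duration of the call, then restore it. This is a sketch only, untested on EMR; withAppClassLoader is my own hypothetical helper, not a Spark API:

    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.hive.HiveContext
    import scala.reflect.runtime.universe.TypeTag

    // Possible workaround (untested): temporarily set the thread context
    // classloader to the loader that defined the app classes, so the
    // reflection mirror used inside createDataFrame can resolve the case
    // class; restore the original loader in all cases.
    def withAppClassLoader[T](body: => T): T = {
      val thread = Thread.currentThread()
      val saved  = thread.getContextClassLoader
      thread.setContextClassLoader(getClass.getClassLoader)
      try body finally thread.setContextClassLoader(saved)
    }

    def registerTable[A <: Product : TypeTag](name: String, rdd: RDD[A])
                     (implicit hc: HiveContext) =
      withAppClassLoader {
        val df = hc.createDataFrame(rdd)
        df.registerTempTable(name)
      }

If anyone can confirm whether this is safe to do on YARN, or knows of a proper fix, I'd appreciate it.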