I also think the code is not robust enough. First, Spark works with hadoop1,
so why does the code try hadoop2 first? Also, the following code only handles
ClassNotFoundException. It should handle all exceptions.
private def firstAvailableClass(first: String, second: String): Class[_] = {
  try { Class.forName(first) }
  catch { case e: ClassNotFoundException => Class.forName(second) }
}
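
To make the point concrete, here is a rough sketch of a more forgiving fallback. It is not a tested patch against Spark: the object name ClassFallback is made up for illustration, and catching LinkageError (the parent of NoClassDefFoundError and IncompatibleClassChangeError) is my assumption about what "handle more than ClassNotFoundException" should mean here.

```scala
object ClassFallback {
  // Try `first`; if it is missing or fails to link, fall back to `second`.
  // LinkageError is the superclass of NoClassDefFoundError and
  // IncompatibleClassChangeError, neither of which is a ClassNotFoundException.
  def firstAvailableClass(first: String, second: String): Class[_] =
    try {
      Class.forName(first)
    } catch {
      case _: ClassNotFoundException | _: LinkageError =>
        Class.forName(second)
    }
}
```

Note that this would only change which error surfaces. If the classpath really contains classes from two different Hadoop versions, the second Class.forName can fail in the same way.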
From: [email protected]
To: [email protected]
CC: [email protected]
Subject: RE: spark 1.1.0 save data to hdfs failed
Date: Fri, 23 Jan 2015 06:43:00 -0800
I looked into the source code of SparkHadoopMapReduceUtil.scala. I think it is
broken in the following code:
def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID): TaskAttemptContext = {
  val klass = firstAvailableClass(
    "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl", // hadoop2, hadoop2-yarn
    "org.apache.hadoop.mapreduce.TaskAttemptContext")          // hadoop1
  val ctor = klass.getDeclaredConstructor(classOf[Configuration], classOf[TaskAttemptID])
  ctor.newInstance(conf, attemptId).asInstanceOf[TaskAttemptContext]
}
In other words, the problem is related to how this code chooses between
hadoop2, hadoop2-yarn, and hadoop1. Any suggestions on how to resolve it?
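
For reference, a small check like the one below could show which Hadoop classes the driver actually sees. This is only a sketch: the ClasspathCheck object and its describe method are hypothetical helpers, not part of Spark or Hadoop. As far as I know, org.apache.hadoop.mapreduce.TaskAttemptContext is a concrete class in hadoop1 and an interface in hadoop2, so the "class" or "interface" part of the output hints at which version was picked up.

```scala
object ClasspathCheck {
  // Report whether a class is loadable, whether it is a class or an interface,
  // and which jar or directory it was loaded from.
  def describe(className: String): String =
    try {
      val k = Class.forName(className)
      // getCodeSource is null for classes loaded by the bootstrap loader.
      val source = Option(k.getProtectionDomain.getCodeSource)
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
        .getOrElse("<bootstrap or unknown>")
      val kind = if (k.isInterface) "interface" else "class"
      s"$className: $kind, loaded from $source"
    } catch {
      case e @ (_: ClassNotFoundException | _: LinkageError) =>
        s"$className: not loadable (${e.getClass.getName})"
    }
}
```

Calling ClasspathCheck.describe with the two class names used in newTaskAttemptContext, from inside the job's main method, would print where each one comes from.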
Thanks.
> From: [email protected]
> Date: Fri, 23 Jan 2015 14:01:45 +0000
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: [email protected]
> CC: [email protected]
>
> These are all definitely symptoms of mixing incompatible versions of
> libraries.
>
> I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
> not the only way Hadoop deps get into your app. See my suggestion
> about investigating the dependency tree.
>
> On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow <[email protected]> wrote:
> > Thanks. But I think I already marked all the Spark and Hadoop deps as
> > provided. Why is the cluster's version not used?
> >
> > Anyway, as I mentioned in the previous message, after changing
> > hadoop-client to version 1.2.1 in my maven deps, I got past that
> > exception and hit another one, shown below. Any suggestions on this?
> >
> > =================================
> >
> > Exception in thread "main" java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> > at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> > Caused by: java.lang.IncompatibleClassChangeError: Implementing class
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> > at java.lang.Class.forName0(Native Method)
> > at java.lang.Class.forName(Class.java:191)
> > at org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
> > at org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
> > at org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
> > at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
> > at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> > at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
> > at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
> >
> > ... 6 more
> >