Hi,
I'm not sure whose issue this is, but if I run make-distribution.sh with HDP
2.4.0.2.1.3.0-563 as the Hadoop version (set by editing make-distribution.sh),
I get the exception below. If I build with a slightly older HDP version
(2.4.0.2.1.2.0-402) instead, everything works fine with the generated
assembly. Both 1.0.0 and 1.0.1 work fine in that setup.
Should I file a JIRA, or is this a known issue?
Thanks,
Ron
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to
stage failure: Task 0.0:0 failed 1 times, most recent failure: Exception
failure in TID 0 on host localhost: java.lang.IncompatibleClassChangeError:
Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was
expected
org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:111)
org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:99)
org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:61)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
org.apache.spark.scheduler.Task.run(Task.scala:51)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
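
An IncompatibleClassChangeError of this form usually means that some code was
compiled against a Hadoop version where
org.apache.hadoop.mapreduce.TaskAttemptContext is a class (as in Hadoop 1.x)
but is running against one where it is an interface (as in Hadoop 2.x). One
way to narrow this down is to check how that name resolves on the classpath
of each assembly. The sketch below is only an illustration: ClassKindCheck
and kindOf are hypothetical names, not part of Spark, Avro, or Hadoop, and it
assumes the assembly jar is on the classpath when it is run.

```java
// Hypothetical helper (not part of Spark, Avro, or Hadoop): reports whether a
// class name resolves to a class or an interface on the current classpath.
final class ClassKindCheck {

    // Returns "interface", "class", or "missing" for the given class name.
    static String kindOf(String className) {
        try {
            // initialize=false: load the type without running its static initializers.
            Class<?> c = Class.forName(
                    className, false, ClassKindCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            // The name is not visible on this classpath at all.
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Defaults to the type named in the exception above.
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.TaskAttemptContext";
        System.out.println(name + " -> " + kindOf(name));
    }
}
```

Running this once against each generated assembly shows what the runtime
side looks like; it does not show which version the calling code
(AvroKeyInputFormat in the trace above) was compiled against, so that still
has to be checked separately.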