Thanks. I can reach out to Cloudera, although the same commands seem to
work via spark-shell (see below), so the issue appears to be unique to Zeppelin.
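
For reference, the spark.yarn.archive setting mentioned in my original mail
below is of roughly this shape (the HDFS path here is only a placeholder for
wherever the Spark 2 jars were uploaded; it can go in spark-defaults.conf or
in the Zeppelin Spark interpreter properties):

    spark.yarn.archive  hdfs:///some/path/spark2-jars.zip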

Spark context available as 'sc' (master = yarn, app id = application_1472496315722_481416).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.0.cloudera1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60)
Type in expressions to have them evaluated.
Type :help for more information.


scala> val taxonomy = sc.textFile("/user/user1/data/")
taxonomy: org.apache.spark.rdd.RDD[String] = /user/user1/data/ MapPartitionsRDD[1] at textFile at <console>:24

scala> .map(l => l.split("\t"))
res0: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[2] at map at <console>:27

scala> taxonomy.first
res1: String = 43 B&B 459 Sheets & Pillow 45 Sheets 1 Sheets
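
For anyone reproducing this: the failing Zeppelin paragraph (quoted further
down) chains the map into the same value, so it's the action that ships the
inline lambda to the executors. Roughly, with the same placeholder path:

    val taxonomy = sc.textFile("/user/user1/data/")
                     .map(l => l.split("\t"))  // the REPL compiles this lambda to a synthetic class (the $anonfun$1 in the trace)
    taxonomy.first  // sends the closure to the executors, which must load that class via ExecutorClassLoader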

On Mon, Mar 6, 2017 at 6:48 PM, moon soo Lee <m...@apache.org> wrote:

> Hi Rob,
>
> Thanks for sharing the problem.
> FYI, https://issues.apache.org/jira/browse/ZEPPELIN-1735 is tracking the
> problem.
>
> If we can get help from the Cloudera forum, that would be great.
>
> Thanks,
> moon
>
> On Tue, Mar 7, 2017 at 10:08 AM Jeff Zhang <zjf...@gmail.com> wrote:
>
>>
>> It seems like a CDH-specific issue; you might be better off asking on the Cloudera forum.
>>
>>
>> On Tue, Mar 7, 2017 at 9:02 AM, Rob Anderson <rockclimbings...@gmail.com> wrote:
>>
>> Hey Everyone,
>>
>> We're running Zeppelin 0.7.0. We've just cut over to Spark 2, using
>> Scala 2.11 via the CDH parcel (SPARK2-2.0.0.cloudera1-1.cdh5.7.0.p0.113931).
>>
>> Running a simple job throws a "Caused by: java.lang.ClassNotFoundException:
>> $anonfun$1". It appears that at execution time on the YARN hosts, the
>> native CDH Spark 1.5 jars are loaded before the new Spark 2 jars. I've
>> tried using spark.yarn.archive to point at the Spark 2 jars in HDFS, as
>> well as other Spark options, none of which has made a difference.
>>
>>
>> Any suggestions you can offer are appreciated.
>>
>> Thanks,
>>
>> Rob
>>
>> ------------------------
>>
>>
>> %spark
>> val taxonomy = sc.textFile("/user/user1/data/")
>>                  .map(l => l.split("\t"))
>>
>> %spark
>> taxonomy.first
>>
>>
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 7, data08.hadoop.prod.ostk.com, executor 2): java.lang.ClassNotFoundException: $anonfun$1
>> at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:82)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:348)
>> at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
>> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>> at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
>> at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>> at org.apache.spark.scheduler.Task.run(Task.scala:86)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.ClassNotFoundException: $anonfun$1
>> at java.lang.ClassLoader.findClass(ClassLoader.java:530)
>> at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
>> at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:77)
>> ... 30 more
>> Driver stacktrace:
>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
>> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>> at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
>> at scala.Option.foreach(Option.scala:257)
>> at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1669)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1624)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1613)
>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>> at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1893)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1906)
>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1919)
>> at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1318)
>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>> at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
>> at org.apache.spark.rdd.RDD.take(RDD.scala:1292)
>> at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1332)
>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>> at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
>> at org.apache.spark.rdd.RDD.first(RDD.scala:1331)
>> ... 37 elided
>> Caused by: java.lang.ClassNotFoundException: $anonfun$1
>> at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:82)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:348)
>> at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
>> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>> at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
>> at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>> at org.apache.spark.scheduler.Task.run(Task.scala:86)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> ... 1 more
>> Caused by: java.lang.ClassNotFoundException: $anonfun$1
>> at java.lang.ClassLoader.findClass(ClassLoader.java:530)
>> at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
>> at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:77)
>> ... 30 more
>>
>>
