Gerard,

thanks very much for your investigation! After hours of trial and error, I am kind of happy to hear it is not just a broken setup on my side that's causing the error.
Could you explain briefly how you created that simple jar file?

Thanks,
Tobias

On Wed, May 21, 2014 at 9:47 PM, Gerard Maas <gerard.m...@gmail.com> wrote:
> Hi Tobias,
>
> I was curious about this issue and tried to run your example on my local
> Mesos. I was able to reproduce your issue using your current config:
>
> [error] (run-main-0) org.apache.spark.SparkException: Job aborted: Task 1.0:4 failed 4 times (most recent failure: Exception failure: java.lang.ClassNotFoundException: spark.SparkExamplesMinimal$$anonfun$2)
> org.apache.spark.SparkException: Job aborted: Task 1.0:4 failed 4 times (most recent failure: Exception failure: java.lang.ClassNotFoundException: spark.SparkExamplesMinimal$$anonfun$2)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
>
> Creating a simple jar from the job and providing it through the
> configuration seems to solve it:
>
> val conf = new SparkConf()
>   .setMaster("mesos://<my_ip>:5050/")
>   .setJars(Seq("/sparkexample/target/scala-2.10/sparkexample_2.10-0.1.jar"))
>   .setAppName("SparkExamplesMinimal")
>
> Resulting in:
> 14/05/21 12:03:45 INFO scheduler.DAGScheduler: Completed ResultTask(1, 1)
> 14/05/21 12:03:45 INFO scheduler.DAGScheduler: Stage 1 (count at SparkExamplesMinimal.scala:50) finished in 1.120 s
> 14/05/21 12:03:45 INFO spark.SparkContext: Job finished: count at SparkExamplesMinimal.scala:50, took 1.177091435 s
> count: 1000000
>
> Why the closure serialization does not work with Mesos is beyond my current
> knowledge.
> Would be great to hear from the experts (cross-posting to dev for that)
>
> -kr, Gerard.
>
> On Wed, May 21, 2014 at 11:51 AM, Tobias Pfeiffer <t...@preferred.jp> wrote:
>>
>> Hi,
>>
>> I have set up a cluster with Mesos (backed by Zookeeper) with three
>> master and three slave instances. I set up Spark (git HEAD) for use
>> with Mesos according to this manual:
>> http://people.apache.org/~pwendell/catalyst-docs/running-on-mesos.html
>>
>> Using the spark-shell, I can connect to this cluster and do simple RDD
>> operations, but the same code in a Scala class and executed via sbt
>> run-main works only partially. (That is, count() works, count() after
>> flatMap() does not.)
>>
>> Here is my code: https://gist.github.com/tgpfeiffer/7d20a4d59ee6e0088f91
>> The file SparkExamplesScript.scala, when pasted into spark-shell,
>> outputs the correct count() for the parallelized list comprehension,
>> as well as for the flatMapped RDD.
>>
>> The file SparkExamplesMinimal.scala contains exactly the same code,
>> and also the MASTER configuration and the Spark Executor are the same.
>> However, while the count() for the parallelized list is displayed
>> correctly, I receive the following error when asking for the count()
>> of the flatMapped RDD:
>>
>> -----------------
>>
>> 14/05/21 09:47:49 INFO scheduler.DAGScheduler: Submitting Stage 1 (FlatMappedRDD[1] at flatMap at SparkExamplesMinimal.scala:34), which has no missing parents
>> 14/05/21 09:47:49 INFO scheduler.DAGScheduler: Submitting 8 missing tasks from Stage 1 (FlatMappedRDD[1] at flatMap at SparkExamplesMinimal.scala:34)
>> 14/05/21 09:47:49 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 8 tasks
>> 14/05/21 09:47:49 INFO scheduler.TaskSetManager: Starting task 1.0:0 as TID 8 on executor 20140520-102159-2154735808-5050-1108-1: mesos9-1 (PROCESS_LOCAL)
>> 14/05/21 09:47:49 INFO scheduler.TaskSetManager: Serialized task 1.0:0 as 1779147 bytes in 37 ms
>> 14/05/21 09:47:49 WARN scheduler.TaskSetManager: Lost TID 8 (task 1.0:0)
>> 14/05/21 09:47:49 WARN scheduler.TaskSetManager: Loss was due to java.lang.ClassNotFoundException
>> java.lang.ClassNotFoundException: spark.SparkExamplesMinimal$$anonfun$2
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>     at java.lang.Class.forName0(Native Method)
>>     at java.lang.Class.forName(Class.java:270)
>>     at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:60)
>>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
>>     at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61)
>>     at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141)
>>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
>>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> -----------------
>>
>> Can anyone explain to me where this comes from or how I might further
>> track the problem down?
>>
>> Thanks,
>> Tobias
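
For reference, the jar path in the quoted workaround (sparkexample_2.10-0.1.jar under target/scala-2.10) has the shape that sbt's `package` task typically produces. The build definition below is only a sketch of what could yield such a file: the project name, version, and Scala version are inferred from that path, and the Spark dependency version is hypothetical. It is not necessarily how the jar in this thread was built.

```scala
// build.sbt: a minimal sketch. Every value is an assumption inferred from the
// jar path quoted in the thread, not the actual build definition used there.

// Name and version determine the jar file name:
//   <name>_<scala binary version>-<version>.jar
name := "sparkexample"

version := "0.1"

// Any 2.10.x release gives the "scala-2.10" directory and the "_2.10" suffix.
scalaVersion := "2.10.4"

// Hypothetical Spark version; use the one that matches the cluster.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
```

With a build like this, running `sbt package` in the project root should write target/scala-2.10/sparkexample_2.10-0.1.jar. That jar contains only the project's own compiled classes, which includes the compiler-generated closure classes such as spark.SparkExamplesMinimal$$anonfun$2 that the executors could not find.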