Hi,

I have set up a Mesos cluster (backed by ZooKeeper) with three master and three slave instances. I set up Spark (git HEAD) for use with Mesos according to this manual: http://people.apache.org/~pwendell/catalyst-docs/running-on-mesos.html
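For reference, the relevant part of my setup looks roughly like this (a sketch following the linked manual; the hostnames, library path, and executor URI are placeholders, not my actual values):

```shell
# spark-env.sh (paths and hosts are placeholders)
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
export SPARK_EXECUTOR_URI=hdfs://namenode/spark/spark-assembly.tar.gz

# With multiple Mesos masters, the master URL points at the ZooKeeper ensemble:
MASTER=mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos ./bin/spark-shell
```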
Using the spark-shell, I can connect to this cluster and do simple RDD operations, but the same code in a Scala class, executed via sbt run-main, works only partially. (That is, count() works, but count() after flatMap() does not.)

Here is my code: https://gist.github.com/tgpfeiffer/7d20a4d59ee6e0088f91

The file SparkExamplesScript.scala, when pasted into spark-shell, outputs the correct count() both for the parallelized list comprehension and for the flatMapped RDD. The file SparkExamplesMinimal.scala contains exactly the same code, and the MASTER configuration and the Spark executor are also the same. However, while the count() for the parallelized list is displayed correctly, I receive the following error when asking for the count() of the flatMapped RDD:

-----------------
14/05/21 09:47:49 INFO scheduler.DAGScheduler: Submitting Stage 1 (FlatMappedRDD[1] at flatMap at SparkExamplesMinimal.scala:34), which has no missing parents
14/05/21 09:47:49 INFO scheduler.DAGScheduler: Submitting 8 missing tasks from Stage 1 (FlatMappedRDD[1] at flatMap at SparkExamplesMinimal.scala:34)
14/05/21 09:47:49 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 8 tasks
14/05/21 09:47:49 INFO scheduler.TaskSetManager: Starting task 1.0:0 as TID 8 on executor 20140520-102159-2154735808-5050-1108-1: mesos9-1 (PROCESS_LOCAL)
14/05/21 09:47:49 INFO scheduler.TaskSetManager: Serialized task 1.0:0 as 1779147 bytes in 37 ms
14/05/21 09:47:49 WARN scheduler.TaskSetManager: Lost TID 8 (task 1.0:0)
14/05/21 09:47:49 WARN scheduler.TaskSetManager: Loss was due to java.lang.ClassNotFoundException
java.lang.ClassNotFoundException: spark.SparkExamplesMinimal$$anonfun$2
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:60)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61)
    at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
-----------------

Can anyone explain to me where this comes from, or how I might track the problem down further?

Thanks,
Tobias
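P.S.: My understanding of the class name in the trace is that the missing class is the closure I pass to flatMap: the Scala compiler emits each anonymous function as its own synthetic class (named like $$anonfun$N under Scala 2.10/2.11; newer versions use LambdaMetafactory-style names), and that class file apparently never reaches the executors. A standalone check of the naming, no Spark involved (ClosureClassDemo is a made-up name for illustration):

```scala
// Standalone demo: a Scala lambda compiles to its own synthetic class whose
// name embeds the enclosing object -- the same kind of class the executor
// fails to load in the trace above.
object ClosureClassDemo {
  val f: Int => Int = x => x + 1

  def main(args: Array[String]): Unit = {
    // Prints a name like "ClosureClassDemo$$anonfun$1" (Scala 2.10/2.11)
    // or "ClosureClassDemo$$$Lambda$..." (Scala 2.12+); either way the
    // enclosing object's name is part of it.
    println(f.getClass.getName)
  }
}
```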