Hello folks,

We have a strange issue with a Spark standalone cluster: a simple test application cannot load external classes at runtime. Here are the details.
The application is located here: https://github.com/prantik/spark-example

It uses Spark's Twitter streaming integration (spark-streaming-twitter) and twitter4j to consume the Twitter stream. We build the application jar with sbt and run it against the cluster. We have verified that the jar contains the relevant classes:

    $ jar tf /opt/SimpleProject-assembly-1.1.jar | grep twitter4j/Status
    twitter4j/Status.class

    $ jar tf /opt/SimpleProject-assembly-1.1.jar | grep twitter/TwitterRec
    org/apache/spark/streaming/twitter/TwitterReceiver$$anon$1.class
    org/apache/spark/streaming/twitter/TwitterReceiver$$anonfun$onStart$1.class
    org/apache/spark/streaming/twitter/TwitterReceiver$$anonfun$onStop$1.class
    org/apache/spark/streaming/twitter/TwitterReceiver.class

We have also made sure that the slaves have these libraries on their classpaths. Running "ps auxww | grep spark" on a slave shows:

    /usr/bin/java -cp /opt/spark/external/twitter/target/spark-streaming-twitter_2.10-0.9.0-incubating.jar:/opt/spark/tools/target/spark-tools_2.10-0.9.0-incubating.jar:/opt/spark/assembly/target/scala-2.10/spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar

However, we hit the following error when the application runs on the slaves:

    14/03/13 21:20:42 INFO Executor: Running task ID 74
    14/03/13 21:20:42 ERROR Executor: Exception in task ID 74
    java.lang.ClassNotFoundException: twitter4j.Status

We don't know how to address the class not being found. Any ideas?

Regards,
Paul
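For context, the driver sets up the streaming job roughly along these lines. This is a simplified sketch, not our exact code (the real code is in the repo linked above); the master URL, app name, and jar path are stand-ins, and we include the SparkConf.setJars call since that is how a standalone-mode driver normally ships the application jar to executors:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.twitter.TwitterUtils

    object SimpleApp {
      def main(args: Array[String]) {
        val conf = new SparkConf()
          .setMaster("spark://master:7077")   // placeholder master URL
          .setAppName("SimpleProject")
          // Ship the assembly jar to the executors; twitter4j classes
          // would have to be loaded from here if the worker JVMs do not
          // already have them on their classpath.
          .setJars(Seq("/opt/SimpleProject-assembly-1.1.jar"))

        val ssc = new StreamingContext(conf, Seconds(2))

        // None => twitter4j picks up OAuth credentials from its
        // default configuration (e.g. twitter4j.properties).
        val stream = TwitterUtils.createStream(ssc, None)
        stream.map(_.getText).print()

        ssc.start()
        ssc.awaitTermination()
      }
    }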