Hi Guys,
I'm having an issue in standalone mode (Spark 1.1, Hadoop 2.4, Windows Server
2008).
A very simple program runs fine in local mode but fails in standalone mode.
Here is the error:
14/11/20 17:01:53 INFO DAGScheduler: Failed to run count at SimpleApp.scala:22
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, UK-RND-PN02.actixhost.eu): java.lang.ClassNotFoundException: SimpleApp$$anonfun$1
        java.net.URLClassLoader$1.run(URLClassLoader.java:202)
I have added the jar to the SparkConf() to be on the safe side, and the jar
appears in the classpath printed to standard output (copied after the code):
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import java.net.URLClassLoader

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "S:\\spark-1.1.0-bin-hadoop2.4\\README.md"
    val conf = new SparkConf()
      //.setJars(Seq("s:\\spark\\simple\\target\\scala-2.10\\simple-project_2.10-1.0.jar"))
      .setMaster("spark://UK-RND-PN02.actixhost.eu:7077")
      //.setMaster("local[4]")
      .setAppName("Simple Application")
    val sc = new SparkContext(conf)

    // Print the URLs of the system class loader.
    // Note: main() runs in the driver JVM, so this lists the driver's classpath.
    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Executor classpath is:" + url.getFile))

    // Count the lines containing "a" and "b"; the filter closures run on the workers.
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    sc.stop()
  }
}
Simple-project is in the executor classpath list:
14/11/20 17:01:48 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Executor classpath is:/S:/spark/simple/
Executor classpath is:/S:/spark/simple/target/scala-2.10/simple-project_2.10-1.0.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/conf/
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/spark-assembly-1.1.0-hadoop2.4.0.jar
Executor classpath is:/S:/spark/simple/
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.1.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-core-3.2.2.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.1.jar
Executor classpath is:/S:/spark/simple/
Would you have any idea how I could investigate further?
Thanks!
Benoit.
PS: I could attach a debugger to the Worker where the ClassNotFoundException
happens, but it is a bit painful.
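A minimal sketch of one possible check, in plain Scala with no Spark
dependency: list the URLs a class loader exposes and test whether a class name
resolves through it. ClasspathProbe, classpathUrls and canLoad are hypothetical
names used for illustration only; they are not part of Spark. The probe only
inspects the JVM it runs in, so to say anything about a worker it would have to
run inside a task there, and it assumes the probe class itself is loadable on
that worker.

```scala
import java.net.URLClassLoader

// Hypothetical helper for illustration; not part of Spark.
object ClasspathProbe {

  // URLs visible to the given loader, or an empty list if the loader
  // is not a URLClassLoader (as is the case on some newer JVMs).
  def classpathUrls(cl: ClassLoader): Seq[String] = cl match {
    case u: URLClassLoader => u.getURLs.map(_.getFile).toSeq
    case _                 => Seq.empty
  }

  // True if `name` resolves through `cl`. The class is looked up
  // without being initialized, so no static initializers run.
  def canLoad(name: String, cl: ClassLoader): Boolean =
    try {
      Class.forName(name, false, cl)
      true
    } catch {
      case _: ClassNotFoundException => false
    }
}
```

For example, ClasspathProbe.canLoad("SimpleApp$$anonfun$1",
Thread.currentThread.getContextClassLoader) reports whether the closure class
from the stack trace is visible to the current thread's context class loader.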