Spark devs,

I was looking into a question asked on the user list, where a ClassNotFoundException was thrown when running a job on Mesos (a curious issue with serialization on Mesos; more details at [1]).
When trying to run that simple example on my Mesos installation, I faced another issue: I got an error that SPARK_HOME was not set. I found that curious, because a local Spark installation should not be required to run a job on Mesos. All that's needed is the executor package, i.e. the assembly.tar.gz at a reachable location (HDFS/S3/HTTP).

I went looking into the code, and indeed there's a check on SPARK_HOME [2] regardless of the presence of the assembly, even though SPARK_HOME is actually only used when the assembly is not provided (a kind of best-effort recovery strategy).

Current flow:

  if (!SPARK_HOME) {
    fail("No SPARK_HOME")
  } else if (assembly) {
    use assembly
  } else {
    try use SPARK_HOME to build spark_executor
  }

Should be:

  sparkExecutor =
    if (assembly) {
      assembly
    } else if (SPARK_HOME) {
      try use SPARK_HOME to build spark_executor
    } else {
      fail("No executor found. Please provide spark.executor.uri (preferred) or spark.home")
    }

What do you think? A rough Scala sketch of the proposed flow follows after the references.

-kr, Gerard.

[1] http://apache-spark-user-list.1001560.n3.nabble.com/ClassNotFoundException-with-Spark-Mesos-spark-shell-works-fine-td6165.html
[2] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L89
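P.S. For concreteness, here is a minimal Scala sketch of the resolution order I'm proposing. It is illustrative only: the helper name executorCommand and the exact command strings/paths are my own simplification, not the actual MesosSchedulerBackend code.

  import java.io.File
  import org.apache.spark.{SparkConf, SparkException}

  // Illustrative sketch: prefer the assembly URI, fall back to a local
  // installation, and fail only when neither is available.
  def executorCommand(conf: SparkConf): String =
    conf.getOption("spark.executor.uri") match {
      case Some(uri) =>
        // Preferred: Mesos fetches and unpacks the assembly, so no local
        // Spark installation is needed on the slave.
        val basename = uri.split('/').last.split('.').head
        s"cd $basename*; ./sbin/spark-executor"
      case None =>
        conf.getOption("spark.home") match {
          case Some(home) =>
            // Best-effort fallback: build the executor path from a local
            // installation.
            new File(home, "sbin/spark-executor").getCanonicalPath
          case None =>
            throw new SparkException(
              "No executor found. Please provide spark.executor.uri (preferred) or spark.home")
        }
    }

With this ordering, the SPARK_HOME check only fires when it is actually needed, so assembly-only setups stop failing spuriously.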