ack
On Thu, May 22, 2014 at 9:26 PM, Andrew Ash <and...@andrewash.com> wrote:
> Fixing the immediate issue of requiring SPARK_HOME to be set when it's not
> actually used is a separate ticket in my mind from a larger cleanup of what
> SPARK_HOME means across the cluster.
>
> I think you should file a new ticket for just this particular issue.
>
>
> On Thu, May 22, 2014 at 11:03 AM, Gerard Maas <gerard.m...@gmail.com> wrote:
>
>> Sure. Should I create a Jira as well?
>>
>> I saw there's already a broader ticket regarding the ambiguous use of
>> SPARK_HOME [1] (cc: Patrick as owner of that ticket)
>>
>> I don't know if it would be more relevant to remove the use of SPARK_HOME
>> when using Mesos and have the assembly as the only way forward, or whether
>> that's too radical a change that might break some existing systems.
>>
>> From a real-world ops perspective, the assembly should be the way to go.
>> I don't see installing and configuring Spark distros on a Mesos master as
>> a way to have the Mesos executor in place.
>>
>> -kr, Gerard.
>>
>> [1] https://issues.apache.org/jira/browse/SPARK-1110
>>
>>
>> On Thu, May 22, 2014 at 6:19 AM, Andrew Ash <and...@andrewash.com> wrote:
>>
>>> Hi Gerard,
>>>
>>> I agree that your second option seems preferred. You shouldn't have to
>>> specify a SPARK_HOME if the executor is going to use the
>>> spark.executor.uri instead. Can you send in a pull request that includes
>>> your proposed changes?
>>>
>>> Andrew
>>>
>>>
>>> On Wed, May 21, 2014 at 10:19 AM, Gerard Maas <gerard.m...@gmail.com>
>>> wrote:
>>>
>>> > Spark devs,
>>> >
>>> > I was looking into a question asked on the user list where a
>>> > ClassNotFoundException was thrown when running a job on Mesos. Curious
>>> > issue with serialization on Mesos: more details here [1].
>>> >
>>> > When trying to run that simple example on my Mesos installation, I
>>> > faced another issue: I got an error that "SPARK_HOME" was not set. I
>>> > found that curious b/c a local Spark installation should not be
>>> > required to run a job on Mesos. All that's needed is the executor
>>> > package, i.e. the assembly.tar.gz at a reachable location
>>> > (HDFS/S3/HTTP).
>>> >
>>> > I went looking into the code and indeed there's a check on SPARK_HOME
>>> > [2] regardless of the presence of the assembly, but SPARK_HOME is
>>> > actually only used if the assembly is not provided (which is a kind of
>>> > best-effort recovery strategy).
>>> >
>>> > Current flow:
>>> >
>>> >   if (!SPARK_HOME) fail("No SPARK_HOME")
>>> >   else if (assembly) { use assembly }
>>> >   else { try to use SPARK_HOME to build spark_executor }
>>> >
>>> > Should be:
>>> >
>>> >   sparkExecutor = if (assembly) { assembly }
>>> >                   else if (SPARK_HOME) { try to use SPARK_HOME to build spark_executor }
>>> >                   else { fail("No executor found. Please provide spark.executor.uri (preferred) or spark.home") }
>>> >
>>> > What do you think?
>>> >
>>> > -kr, Gerard.
>>> >
>>> >
>>> > [1]
>>> > http://apache-spark-user-list.1001560.n3.nabble.com/ClassNotFoundException-with-Spark-Mesos-spark-shell-works-fine-td6165.html
>>> >
>>> > [2]
>>> > https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L89
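
For reference, a rough Scala sketch of the reordered check described above. The names and the helper below are illustrative only (not the actual MesosSchedulerBackend code); it just shows the proposed fallback order: prefer spark.executor.uri, fall back to spark.home / SPARK_HOME, and only fail when neither is set.

  object ExecutorResolution {

    // The two places a Mesos executor could come from in this discussion.
    sealed trait ExecutorSource
    case class AssemblyUri(uri: String) extends ExecutorSource      // spark.executor.uri (HDFS/S3/HTTP)
    case class LocalSparkHome(path: String) extends ExecutorSource  // spark.home / SPARK_HOME fallback

    // Proposed order: assembly first, SPARK_HOME second, fail only when neither is available.
    def resolveExecutor(executorUri: Option[String],
                        sparkHome: Option[String]): ExecutorSource =
      (executorUri, sparkHome) match {
        case (Some(uri), _)     => AssemblyUri(uri)
        case (None, Some(home)) => LocalSparkHome(home)
        case (None, None) => throw new IllegalStateException(
          "No executor found. Please provide spark.executor.uri (preferred) or spark.home")
      }

    def main(args: Array[String]): Unit = {
      // Assembly provided, SPARK_HOME unset: resolves without the spurious failure.
      println(resolveExecutor(Some("hdfs:///spark/spark-assembly.tar.gz"), None))
    }
  }

Whether the actual fix just reorders the existing checks in MesosSchedulerBackend or factors them out like this is of course a detail for the pull request.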