ack
On Thu, May 22, 2014 at 9:26 PM, Andrew Ash <and...@andrewash.com> wrote:
> Fixing the immediate issue of requiring SPARK_HOME to be set when it's not
> actually used is a separate ticket in my mind from a larger cleanup of what
> SPARK_HOME means across the cluster.
>
> I think you should file a new ticket for just this particular issue.
>
>
> On Thu, May 22, 2014 at 11:03 AM, Gerard Maas <gerard.m...@gmail.com> wrote:
>
>> Sure. Should I create a Jira as well?
>>
>> I saw there's already a broader ticket regarding the ambiguous use of
>> SPARK_HOME [1] (cc: Patrick as owner of that ticket)
>>
>> I don't know if it would be more relevant to remove the use of SPARK_HOME
>> when using Mesos and have the assembly as the only way forward, or whether
>> that's too radical a change that might break some existing systems.
>>
>> From a real-world ops perspective, the assembly should be the way to go.
>> I don't see installing and configuring Spark distros on a Mesos master as
>> a way to have the Mesos executor in place.
>>
>> -kr, Gerard.
>>
>> [1] https://issues.apache.org/jira/browse/SPARK-1110
>>
>>
>> On Thu, May 22, 2014 at 6:19 AM, Andrew Ash <and...@andrewash.com> wrote:
>>
>>> Hi Gerard,
>>>
>>> I agree that your second option seems preferred. You shouldn't have to
>>> specify a SPARK_HOME if the executor is going to use the
>>> spark.executor.uri instead. Can you send in a pull request that includes
>>> your proposed changes?
>>>
>>> Andrew
>>>
>>>
>>> On Wed, May 21, 2014 at 10:19 AM, Gerard Maas <gerard.m...@gmail.com>
>>> wrote:
>>>
>>> > Spark devs,
>>> >
>>> > I was looking into a question asked on the user list where a
>>> > ClassNotFoundException was thrown when running a job on Mesos. Curious
>>> > issue with serialization on Mesos: more details here [1].
>>> >
>>> > When trying to run that simple example on my Mesos installation, I
>>> > faced another issue: I got an error that "SPARK_HOME" was not set. I
>>> > found that curious b/c a local Spark installation should not be
>>> > required to run a job on Mesos. All that's needed is the executor
>>> > package, i.e. the assembly.tar.gz at a reachable location
>>> > (HDFS/S3/HTTP).
>>> >
>>> > I went looking into the code and indeed there's a check on SPARK_HOME
>>> > [2] regardless of the presence of the assembly, but SPARK_HOME is
>>> > actually only used if the assembly is not provided (which is a kind of
>>> > best-effort recovery strategy).
>>> >
>>> > Current flow:
>>> >
>>> >   if (!SPARK_HOME) fail("No SPARK_HOME")
>>> >   else if (assembly) { use assembly }
>>> >   else { try to use SPARK_HOME to build spark_executor }
>>> >
>>> > Should be:
>>> >
>>> >   sparkExecutor = if (assembly) { assembly }
>>> >                   else if (SPARK_HOME) { try to use SPARK_HOME to build spark_executor }
>>> >                   else { fail("No executor found. Please provide spark.executor.uri (preferred) or spark.home") }
>>> >
>>> > What do you think?
>>> >
>>> > -kr, Gerard.
>>> >
>>> >
>>> > [1]
>>> > http://apache-spark-user-list.1001560.n3.nabble.com/ClassNotFoundException-with-Spark-Mesos-spark-shell-works-fine-td6165.html
>>> >
>>> > [2]
>>> > https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L89
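
For reference, a rough Scala sketch of the reordered check described above. The names and the helper below are illustrative only (not the actual MesosSchedulerBackend code); it just shows the proposed fallback order: prefer spark.executor.uri, fall back to spark.home / SPARK_HOME, and only fail when neither is set.

  object ExecutorResolution {

    // The two places a Mesos executor could come from in this discussion.
    sealed trait ExecutorSource
    case class AssemblyUri(uri: String) extends ExecutorSource      // spark.executor.uri (HDFS/S3/HTTP)
    case class LocalSparkHome(path: String) extends ExecutorSource  // spark.home / SPARK_HOME fallback

    // Proposed order: assembly first, SPARK_HOME second, fail only when neither is available.
    def resolveExecutor(executorUri: Option[String],
                        sparkHome: Option[String]): ExecutorSource =
      (executorUri, sparkHome) match {
        case (Some(uri), _)     => AssemblyUri(uri)
        case (None, Some(home)) => LocalSparkHome(home)
        case (None, None) => throw new IllegalStateException(
          "No executor found. Please provide spark.executor.uri (preferred) or spark.home")
      }

    def main(args: Array[String]): Unit = {
      // Assembly provided, SPARK_HOME unset: resolves without the spurious failure.
      println(resolveExecutor(Some("hdfs:///spark/spark-assembly.tar.gz"), None))
    }
  }

Whether the actual fix just reorders the existing checks in MesosSchedulerBackend or factors them out like this is of course a detail for the pull request.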