Hi Deepak/Moon,

After looking at the stack trace of the error and the code of org.apache.zeppelin.spark.SparkInterpreter.java, I think this is surely a bug in the Spark interpreter code.
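To make the contrast concrete (the spark-submit invocation is the one I ran successfully, quoted later in this thread; the comment about the interpreter is my reading of the stack trace, not the exact Zeppelin source):

```shell
# Works: spark-submit ships the driver to the YARN ApplicationMaster,
# so the SparkContext gets created on the cluster.
spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 1 --driver-memory 512m \
  --executor-memory 512m --executor-cores 1 \
  lib/spark-examples.jar 10

# Fails: constructing a SparkContext in the local JVM with
# master=yarn-cluster, which is what
# SparkInterpreter.createSparkContext() appears to do, is exactly what
# SparkContext.scala:378 rejects with "Please use spark-submit".
```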
The SparkInterpreter code always calls the constructor of org.apache.spark.SparkContext to create a new SparkContext whenever the SparkInterpreter class is loaded by org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer, hence this error. I'm not sure whether the check for yarn-cluster was newly added to SparkContext. Attaching the complete stack trace here for ease of reference.

Regards,
Sourav

org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
    at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
    at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
    at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

On Mon, Oct 5, 2015 at 12:57 PM, Sourav Mazumder
< [email protected]> wrote:

> I could execute the following without any issue:
>
> spark-submit --class org.apache.spark.examples.SparkPi --master
> yarn-cluster --num-executors 1 --driver-memory 512m --executor-memory 512m
> --executor-cores 1 lib/spark-examples.jar 10
>
> Regards,
> Sourav
>
> On Mon, Oct 5, 2015 at 12:04 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:
>
>> Did you try a test job with yarn-cluster (outside Zeppelin)?
>>
>> On Mon, Oct 5, 2015 at 11:48 AM, Sourav Mazumder <[email protected]> wrote:
>>
>>> Yes, I have them set up appropriately.
>>>
>>> Where I am lost is that I can see the interpreter is running spark-submit,
>>> but at some point it switches to creating a SparkContext.
>>>
>>> Maybe, as you rightly mentioned, it is not able to run the driver on the
>>> YARN cluster because of some permission issue. But I'm not able to figure
>>> out what that issue or required configuration is.
>>>
>>> Regards,
>>> Sourav
>>>
>>> On Mon, Oct 5, 2015 at 11:38 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:
>>>
>>>> Do you have these settings configured in zeppelin-env.sh?
>>>>
>>>> export JAVA_HOME=/usr/src/jdk1.7.0_79/
>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>
>>>> Most likely you do, as you're able to run with yarn-client.
>>>>
>>>> It looks like the issue is not being able to run the driver program on
>>>> the cluster.
>>>>
>>>> On Mon, Oct 5, 2015 at 11:13 AM, Sourav Mazumder <[email protected]> wrote:
>>>>
>>>>> Yes. Spark is installed on the machine where Zeppelin is running.
>>>>>
>>>>> The location of spark.yarn.jar is very similar to what you have. I'm
>>>>> using IOP as the distribution, and the directory naming convention
>>>>> specific to IOP is different from HDP.
>>>>>
>>>>> And yes, the setup works perfectly fine when I use master yarn-client
>>>>> with the same settings for SPARK_HOME, HADOOP_CONF_DIR and HADOOP_CLIENT.
>>>>>
>>>>> Regards,
>>>>> Sourav
>>>>>
>>>>> On Mon, Oct 5, 2015 at 10:25 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:
>>>>>
>>>>>> Is Spark installed on your Zeppelin machine?
>>>>>>
>>>>>> I would try these:
>>>>>>
>>>>>> master: yarn-client
>>>>>> spark.home: the Spark installation home directory on your Zeppelin
>>>>>> server.
>>>>>>
>>>>>> Looking at spark.yarn.jar, I see Spark is installed at
>>>>>> /usr/iop/current/spark-thriftserver/. But why is it thriftserver
>>>>>> (I do not know what that is)?
>>>>>>
>>>>>> I have Spark installed (unzipped) on the Zeppelin machine at
>>>>>> /usr/hdp/2.3.1.0-2574/spark/spark/ (it can be any location) and have
>>>>>> spark.yarn.jar set to
>>>>>> /usr/hdp/2.3.1.0-2574/spark/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar.
>>>>>>
>>>>>> On Mon, Oct 5, 2015 at 10:20 AM, Sourav Mazumder <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Deepu,
>>>>>>>
>>>>>>> Here you go.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sourav
>>>>>>>
>>>>>>> *Properties*
>>>>>>>
>>>>>>> name                           value
>>>>>>> args
>>>>>>> master                         yarn-cluster
>>>>>>> spark.app.name                 Zeppelin
>>>>>>> spark.cores.max
>>>>>>> spark.executor.memory          512m
>>>>>>> spark.home
>>>>>>> spark.yarn.jar                 /usr/iop/current/spark-thriftserver/lib/spark-assembly.jar
>>>>>>> zeppelin.dep.localrepo         local-repo
>>>>>>> zeppelin.pyspark.python        python
>>>>>>> zeppelin.spark.concurrentSQL   false
>>>>>>> zeppelin.spark.maxResult       1000
>>>>>>> zeppelin.spark.useHiveContext  true
>>>>>>>
>>>>>>> On Mon, Oct 5, 2015 at 10:05 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:
>>>>>>>
>>>>>>>> Can you share a screenshot of your Spark interpreter on the Zeppelin
>>>>>>>> web interface?
>>>>>>>>
>>>>>>>> I have the exact same deployment structure and it runs fine with the
>>>>>>>> right set of configurations.
>>>>>>>>
>>>>>>>> On Mon, Oct 5, 2015 at 7:56 AM, Sourav Mazumder <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Moon,
>>>>>>>>>
>>>>>>>>> I'm using a 0.6 SNAPSHOT which I built from the latest GitHub source.
>>>>>>>>>
>>>>>>>>> I tried setting SPARK_HOME in zeppelin-env.sh. By putting in some
>>>>>>>>> debug statements I could also see that control goes to the
>>>>>>>>> appropriate if/else block in interpreter.sh.
>>>>>>>>>
>>>>>>>>> But I get the same error, as follows:
>>>>>>>>>
>>>>>>>>> org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
>>>>>>>>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
>>>>>>>>>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>>>>>>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>>>>>>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
>>>>>>>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>>>>>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>>>>>>     at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>
>>>>>>>>> Let me know if you need any other details to figure out what is
>>>>>>>>> going on.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Sourav
>>>>>>>>>
>>>>>>>>> On Wed, Sep 30, 2015 at 1:53 AM, moon soo Lee <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Which version of Zeppelin are you using?
>>>>>>>>>>
>>>>>>>>>> The master branch uses the spark-submit command when SPARK_HOME is
>>>>>>>>>> defined in conf/zeppelin-env.sh.
>>>>>>>>>>
>>>>>>>>>> If you're not on the master branch, I recommend trying it with
>>>>>>>>>> SPARK_HOME defined.
>>>>>>>>>>
>>>>>>>>>> Hope this helps,
>>>>>>>>>> moon
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 23, 2015 at 10:21 PM Sourav Mazumder <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> When I try to run the Spark interpreter in yarn-cluster mode from
>>>>>>>>>>> a remote machine, I always get the error saying to use spark-submit
>>>>>>>>>>> rather than SparkContext.
>>>>>>>>>>>
>>>>>>>>>>> My Zeppelin process runs on a separate machine, remote to the
>>>>>>>>>>> YARN cluster.
>>>>>>>>>>>
>>>>>>>>>>> Any idea why this error occurs?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Sourav
>>>>>>>>
>>>>>>>> --
>>>>>>>> Deepak
>>>>>>
>>>>>> --
>>>>>> Deepak
>>>>
>>>> --
>>>> Deepak
>>
>> --
>> Deepak
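For reference, the configuration discussed in this thread amounts to the following conf/zeppelin-env.sh sketch. The JAVA_HOME and HADOOP_CONF_DIR paths are the examples quoted above; the SPARK_HOME value is an assumption inferred from the spark.yarn.jar location in the interpreter settings, so adjust all three for your own distribution. Note that Sourav still hit the error on his 0.6 SNAPSHOT even with SPARK_HOME set, which is why this was reported as an interpreter bug.

```shell
# conf/zeppelin-env.sh (sketch; paths are examples from this thread)
export JAVA_HOME=/usr/src/jdk1.7.0_79/
export HADOOP_CONF_DIR=/etc/hadoop/conf

# With SPARK_HOME defined, Zeppelin's master branch launches the
# interpreter through spark-submit instead of constructing a
# SparkContext in-process (which yarn-cluster mode rejects).
export SPARK_HOME=/usr/iop/current/spark-thriftserver
```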
