Hi Deepak/Moon,

After looking at the stack trace of the error and the code of org.apache.zeppelin.spark.SparkInterpreter.java, I think this is surely a bug in the Spark interpreter code.
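To make the contrast concrete (the spark-submit invocation is the one I ran successfully, quoted later in this thread; the comment about the interpreter is my reading of the stack trace, not the exact Zeppelin source):

```shell
# Works: spark-submit ships the driver to the YARN ApplicationMaster,
# so the SparkContext gets created on the cluster.
spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 1 --driver-memory 512m \
  --executor-memory 512m --executor-cores 1 \
  lib/spark-examples.jar 10

# Fails: constructing a SparkContext in the local JVM with
# master=yarn-cluster, which is what
# SparkInterpreter.createSparkContext() appears to do, is exactly what
# SparkContext.scala:378 rejects with "Please use spark-submit".
```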
The SparkInterpreter code always calls the constructor of org.apache.spark.SparkContext to create a new SparkContext whenever the SparkInterpreter class is loaded by org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer, hence this error. I'm not sure whether the check for yarn-cluster was newly added to SparkContext. Attaching the complete stack trace here for ease of reference.

Regards,
Sourav

org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
    at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
    at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
    at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

On Mon, Oct 5, 2015 at 12:57 PM, Sourav Mazumder
< [email protected]> wrote:

> I could execute the following without any issue:
>
> spark-submit --class org.apache.spark.examples.SparkPi --master
> yarn-cluster --num-executors 1 --driver-memory 512m --executor-memory 512m
> --executor-cores 1 lib/spark-examples.jar 10
>
> Regards,
> Sourav
>
> On Mon, Oct 5, 2015 at 12:04 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:
>
>> Did you try a test job with yarn-cluster (outside Zeppelin)?
>>
>> On Mon, Oct 5, 2015 at 11:48 AM, Sourav Mazumder <[email protected]> wrote:
>>
>>> Yes, I have them set up appropriately.
>>>
>>> Where I am lost is that I can see the interpreter is running spark-submit,
>>> but at some point it switches to creating a SparkContext.
>>>
>>> Maybe, as you rightly mentioned, it is not able to run the driver on the
>>> YARN cluster because of some permission issue. But I'm not able to figure
>>> out what that issue or required configuration is.
>>>
>>> Regards,
>>> Sourav
>>>
>>> On Mon, Oct 5, 2015 at 11:38 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:
>>>
>>>> Do you have these settings configured in zeppelin-env.sh?
>>>>
>>>> export JAVA_HOME=/usr/src/jdk1.7.0_79/
>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>
>>>> Most likely you do, as you're able to run with yarn-client.
>>>>
>>>> It looks like the issue is not being able to run the driver program on
>>>> the cluster.
>>>>
>>>> On Mon, Oct 5, 2015 at 11:13 AM, Sourav Mazumder <[email protected]> wrote:
>>>>
>>>>> Yes. Spark is installed on the machine where Zeppelin is running.
>>>>>
>>>>> The location of spark.yarn.jar is very similar to what you have. I'm
>>>>> using IOP as the distribution, and the directory naming convention
>>>>> specific to IOP is different from HDP.
>>>>>
>>>>> And yes, the setup works perfectly fine when I use master yarn-client
>>>>> with the same settings for SPARK_HOME, HADOOP_CONF_DIR and HADOOP_CLIENT.
>>>>>
>>>>> Regards,
>>>>> Sourav
>>>>>
>>>>> On Mon, Oct 5, 2015 at 10:25 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:
>>>>>
>>>>>> Is Spark installed on your Zeppelin machine?
>>>>>>
>>>>>> I would try these:
>>>>>>
>>>>>> master: yarn-client
>>>>>> spark.home: the Spark installation home directory on your Zeppelin
>>>>>> server.
>>>>>>
>>>>>> Looking at spark.yarn.jar, I see Spark is installed at
>>>>>> /usr/iop/current/spark-thriftserver/. But why is it thriftserver
>>>>>> (I do not know what that is)?
>>>>>>
>>>>>> I have Spark installed (unzipped) on the Zeppelin machine at
>>>>>> /usr/hdp/2.3.1.0-2574/spark/spark/ (it can be any location) and have
>>>>>> spark.yarn.jar set to
>>>>>> /usr/hdp/2.3.1.0-2574/spark/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar.
>>>>>>
>>>>>> On Mon, Oct 5, 2015 at 10:20 AM, Sourav Mazumder <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Deepu,
>>>>>>>
>>>>>>> Here you go.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Sourav
>>>>>>>
>>>>>>> *Properties*
>>>>>>>
>>>>>>> name                           value
>>>>>>> args
>>>>>>> master                         yarn-cluster
>>>>>>> spark.app.name                 Zeppelin
>>>>>>> spark.cores.max
>>>>>>> spark.executor.memory          512m
>>>>>>> spark.home
>>>>>>> spark.yarn.jar                 /usr/iop/current/spark-thriftserver/lib/spark-assembly.jar
>>>>>>> zeppelin.dep.localrepo         local-repo
>>>>>>> zeppelin.pyspark.python        python
>>>>>>> zeppelin.spark.concurrentSQL   false
>>>>>>> zeppelin.spark.maxResult       1000
>>>>>>> zeppelin.spark.useHiveContext  true
>>>>>>>
>>>>>>> On Mon, Oct 5, 2015 at 10:05 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <[email protected]> wrote:
>>>>>>>
>>>>>>>> Can you share a screenshot of your Spark interpreter on the Zeppelin
>>>>>>>> web interface?
>>>>>>>>
>>>>>>>> I have the exact same deployment structure and it runs fine with the
>>>>>>>> right set of configurations.
>>>>>>>>
>>>>>>>> On Mon, Oct 5, 2015 at 7:56 AM, Sourav Mazumder <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Moon,
>>>>>>>>>
>>>>>>>>> I'm using a 0.6 SNAPSHOT which I built from the latest GitHub source.
>>>>>>>>>
>>>>>>>>> I tried setting SPARK_HOME in zeppelin-env.sh. By putting in some
>>>>>>>>> debug statements I could also see that control goes to the
>>>>>>>>> appropriate if/else block in interpreter.sh.
>>>>>>>>>
>>>>>>>>> But I get the same error, as follows:
>>>>>>>>>
>>>>>>>>> org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
>>>>>>>>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>>>>>>>>>     at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
>>>>>>>>>     at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>>>>>>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>>>>>>>>>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
>>>>>>>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>>>>>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>>>>>>     at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>     at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>
>>>>>>>>> Let me know if you need any other details to figure out what is
>>>>>>>>> going on.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Sourav
>>>>>>>>>
>>>>>>>>> On Wed, Sep 30, 2015 at 1:53 AM, moon soo Lee <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Which version of Zeppelin are you using?
>>>>>>>>>>
>>>>>>>>>> The master branch uses the spark-submit command when SPARK_HOME is
>>>>>>>>>> defined in conf/zeppelin-env.sh.
>>>>>>>>>>
>>>>>>>>>> If you're not on the master branch, I recommend trying it with
>>>>>>>>>> SPARK_HOME defined.
>>>>>>>>>>
>>>>>>>>>> Hope this helps,
>>>>>>>>>> moon
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 23, 2015 at 10:21 PM Sourav Mazumder <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> When I try to run the Spark interpreter in yarn-cluster mode from
>>>>>>>>>>> a remote machine, I always get the error saying to use spark-submit
>>>>>>>>>>> rather than SparkContext.
>>>>>>>>>>>
>>>>>>>>>>> My Zeppelin process runs on a separate machine, remote to the
>>>>>>>>>>> YARN cluster.
>>>>>>>>>>>
>>>>>>>>>>> Any idea why this error occurs?
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Sourav
>>>>>>>>
>>>>>>>> --
>>>>>>>> Deepak
>>>>>>
>>>>>> --
>>>>>> Deepak
>>>>
>>>> --
>>>> Deepak
>>
>> --
>> Deepak
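For reference, the configuration discussed in this thread amounts to the following conf/zeppelin-env.sh sketch. The JAVA_HOME and HADOOP_CONF_DIR paths are the examples quoted above; the SPARK_HOME value is an assumption inferred from the spark.yarn.jar location in the interpreter settings, so adjust all three for your own distribution. Note that Sourav still hit the error on his 0.6 SNAPSHOT even with SPARK_HOME set, which is why this was reported as an interpreter bug.

```shell
# conf/zeppelin-env.sh (sketch; paths are examples from this thread)
export JAVA_HOME=/usr/src/jdk1.7.0_79/
export HADOOP_CONF_DIR=/etc/hadoop/conf

# With SPARK_HOME defined, Zeppelin's master branch launches the
# interpreter through spark-submit instead of constructing a
# SparkContext in-process (which yarn-cluster mode rejects).
export SPARK_HOME=/usr/iop/current/spark-thriftserver
```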
