Yesha Vora created ZEPPELIN-1083:
------------------------------------

             Summary: "%spark" interpreter won't work with new zeppelin notebook if livy server is installed
                 Key: ZEPPELIN-1083
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1083
             Project: Zeppelin
          Issue Type: Bug
            Reporter: Yesha Vora


If the Livy server is installed on the cluster, the Zeppelin server ends up with yarn-cluster as the Spark master.
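
For illustration only, assuming the master is picked up from conf/zeppelin-env.sh (the exact mechanism in this deployment may differ), the effective setting would look like:
{code}
# conf/zeppelin-env.sh -- illustrative sketch, not the exact file from this cluster
export MASTER=yarn-cluster   # inherited by the %spark interpreter as its master
{code}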

When a new Zeppelin notebook is created with the statements below, the paragraph fails to execute with the following error message.
 {code}
%spark
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
{code}

{code}
ERROR [2016-06-28 22:21:45,386] ({pool-2-thread-2} Logging.scala[logError]:95) - Error initializing SparkContext.
org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:411)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:334)
        at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:122)
        at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:509)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
        at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
 INFO [2016-06-28 22:21:45,398] ({pool-2-thread-2} Logging.scala[logInfo]:58) - Successfully stopped SparkContext
ERROR [2016-06-28 22:21:45,398] ({pool-2-thread-2} Job.java[run]:189) - Job failed
org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:411)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:334)
        at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:122)
        at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:509)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
        at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615){code}

I believe that if %spark is used, the paragraph should use the Spark interpreter and execute in yarn-client mode. In this case, because the Livy server is installed on the cluster, the paragraph tries to run in yarn-cluster mode and fails. The paragraph should use yarn-cluster mode only if %livy is used in the paragraph.
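
For reference, a paragraph routed through the Livy interpreter (assuming the Livy interpreter is bound to the notebook) would look like the following; it is this path, not %spark, that should be submitting in yarn-cluster mode:
{code}
%livy
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
{code}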

We will need to define the default order of these interpreters so that both %livy and %spark can run correctly.
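
A rough sketch of the intended split, using property names from the interpreter settings page (the values here are illustrative assumptions, not a verified fix):
{code}
# %spark interpreter: the SparkContext is created inside the interpreter
# process, so it can only run in client mode
master = yarn-client

# %livy interpreter: Livy submits the session through spark-submit, so it
# can use cluster mode
zeppelin.livy.url = http://<livy-host>:8998
livy.spark.master = yarn-cluster
{code}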



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
