Ruslan Fialkovsky created ZEPPELIN-5246:
-------------------------------------------
Summary: Zeppelin in cluster mode doesn't create spark submit
Key: ZEPPELIN-5246
URL: https://issues.apache.org/jira/browse/ZEPPELIN-5246
Project: Zeppelin
Issue Type: Bug
Components: interpreter-setting, Interpreters, spark
Affects Versions: 0.9.0
Reporter: Ruslan Fialkovsky
Attachments: Screenshot 2021-02-10 at 18.30.38.png
Hello. I'm trying to configure Zeppelin in cluster mode with Spark running on YARN.
My interpreter configuration is shown in the attached screenshot, and it works
when Zeppelin runs on a single node.
In cluster mode, however, it seems that Zeppelin starts a SparkContext directly
instead of going through spark-submit:
{code}
INFO [2021-02-10 18:34:16,838] ({Thread-1034} ClusterInterpreterCheckThread.java[run]:51) - ClusterInterpreterCheckThread run() >>>
INFO [2021-02-10 18:34:16,848] ({SchedulerFactory2} ProcessLauncher.java[transition]:109) - Process state is transitioned to LAUNCHED
INFO [2021-02-10 18:34:16,848] ({SchedulerFactory2} ProcessLauncher.java[launch]:96) - Process is launched: [/usr/lib/zeppelin/bin/interpreter.sh, -d, /usr/lib/zeppelin/interpreter/spark, -c, 10.15.145.26, -p, 17317, -r, :, -i, spark-fialkovskiy, -u, fialkovskiy, -l, /usr/lib/zeppelin/local-repo/spark, -g, spark]
INFO [2021-02-10 18:34:16,955] ({Exec Stream Pumper} ProcessLauncher.java[processLine]:188) - Interpreter launch command: /usr/lib/spark/3.0.1/bin/spark-submit --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer --driver-class-path ":/usr/lib/zeppelin/interpreter/spark/*::/usr/lib/zeppelin/interpreter/zeppelin-interpreter-shaded-0.9.0-preview2.jar:/usr/lib/zeppelin/interpreter/spark/spark-interpreter-0.9.0-preview2.jar:/etc/hadoop/" --driver-java-options " -Dfile.encoding=UTF-8 -Dlog4j.configuration='file:///etc/zeppelin/log4j.properties' -Dlog4j.configurationFile='file:///etc/zeppelin/log4j2.properties' -Dzeppelin.log.file='/usr/lib/zeppelin/logs/zeppelin-interpreter-spark-fialkovskiy-fialkovskiy--hadoop836713.log'" /usr/lib/zeppelin/interpreter/spark/spark-interpreter-0.9.0-preview2.jar 10.15.145.26 17317 "spark-fialkovskiy" :+ pid=8070
INFO [2021-02-10 18:34:16,955] ({Exec Stream Pumper} ProcessLauncher.java[processLine]:188) - Interpreter launch command: /usr/lib/spark/3.0.1/bin/spark-submit --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer --driver-class-path ":/usr/lib/zeppelin/interpreter/spark/*::/usr/lib/zeppelin/interpreter/zeppelin-interpreter-shaded-0.9.0-preview2.jar:/usr/lib/zeppelin/interpreter/spark/spark-interpreter-0.9.0-preview2.jar:/etc/hadoop/" --driver-java-options " -Dfile.encoding=UTF-8 -Dlog4j.configuration='file:///etc/zeppelin/log4j.properties' -Dlog4j.configurationFile='file:///etc/zeppelin/log4j2.properties' -Dzeppelin.log.file='/usr/lib/zeppelin/logs/zeppelin-interpreter-spark-fialkovskiy-fialkovskiy--hadoop836713.log'" /usr/lib/zeppelin/interpreter/spark/spark-interpreter-0.9.0-preview2.jar 10.15.145.26 17317 "spark-fialkovskiy" :+ pid=8070
INFO [2021-02-10 18:34:24,844] ({Thread-1034} ClusterManager.java[getIntpProcessStatus]:455) - interpreter thrift 10.15.145.26:17305 service is online!
INFO [2021-02-10 18:34:24,845] ({Thread-1034} ClusterManager.java[getIntpProcessStatus]:461) - interpreter thrift 10.15.145.26:17305 accessible!
INFO [2021-02-10 18:34:24,845] ({Thread-1034} ClusterInterpreterCheckThread.java[online]:62) - Found cluster interpreter 10.15.145.26:17305
INFO [2021-02-10 18:34:24,851] ({Thread-1034} ProcessLauncher.java[transition]:109) - Process state is transitioned to RUNNING
INFO [2021-02-10 18:34:24,852] ({Thread-1034} ClusterInterpreterCheckThread.java[run]:81) - ClusterInterpreterCheckThread run() <<<
INFO [2021-02-10 18:34:24,854] ({SchedulerFactory2} ClusterManager.java[getIntpProcessStatus]:455) - interpreter thrift 10.15.145.26:17305 service is online!
INFO [2021-02-10 18:34:24,854] ({SchedulerFactory2} ClusterManager.java[getIntpProcessStatus]:461) - interpreter thrift 10.15.145.26:17305 accessible!
INFO [2021-02-10 18:34:24,861] ({SchedulerFactory2} RemoteInterpreter.java[lambda$internal_create$1]:166) - Create RemoteInterpreter org.apache.zeppelin.spark.SparkInterpreter
INFO [2021-02-10 18:34:24,960] ({SchedulerFactory2} RemoteInterpreter.java[lambda$internal_create$1]:166) - Create RemoteInterpreter org.apache.zeppelin.spark.SparkSqlInterpreter
INFO [2021-02-10 18:34:24,962] ({SchedulerFactory2} RemoteInterpreter.java[lambda$internal_create$1]:166) - Create RemoteInterpreter org.apache.zeppelin.spark.PySparkInterpreter
INFO [2021-02-10 18:34:24,966] ({SchedulerFactory2} RemoteInterpreter.java[lambda$internal_create$1]:166) - Create RemoteInterpreter org.apache.zeppelin.spark.IPySparkInterpreter
INFO [2021-02-10 18:34:24,970] ({SchedulerFactory2} RemoteInterpreter.java[lambda$internal_create$1]:166) - Create RemoteInterpreter org.apache.zeppelin.spark.SparkRInterpreter
INFO [2021-02-10 18:34:24,972] ({SchedulerFactory2} RemoteInterpreter.java[lambda$internal_create$1]:166) - Create RemoteInterpreter org.apache.zeppelin.spark.SparkIRInterpreter
INFO [2021-02-10 18:34:24,974] ({SchedulerFactory2} RemoteInterpreter.java[lambda$internal_create$1]:166) - Create RemoteInterpreter org.apache.zeppelin.spark.SparkShinyInterpreter
INFO [2021-02-10 18:34:24,975] ({SchedulerFactory2} RemoteInterpreter.java[lambda$internal_create$1]:166) - Create RemoteInterpreter org.apache.zeppelin.spark.KotlinSparkInterpreter
INFO [2021-02-10 18:34:25,050] ({SchedulerFactory2} RemoteInterpreter.java[lambda$open$0]:139) - Open RemoteInterpreter org.apache.zeppelin.spark.PySparkInterpreter
INFO [2021-02-10 18:34:25,050] ({SchedulerFactory2} RemoteInterpreter.java[pushAngularObjectRegistryToRemote]:408) - Push local angular object registry from ZeppelinServer to remote interpreter group spark-fialkovskiy
INFO [2021-02-10 18:34:25,128] ({JobStatusPoller-paragraph_1612823207233_789036296} NotebookServer.java[onStatusChange]:1907) - Job paragraph_1612823207233_789036296 starts to RUNNING
WARN [2021-02-10 18:34:28,469] ({SchedulerFactory2} NotebookServer.java[onStatusChange]:1904) - Job paragraph_1612823207233_789036296 is finished, status: ERROR, exception: null, result: %text org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: Fail to open SparkInterpreter
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:760)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:668)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
	at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130)
	at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:39)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: Fail to open SparkInterpreter
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
	at org.apache.zeppelin.interpreter.Interpreter.getInterpreterInTheSameSessionByClassName(Interpreter.java:355)
	at org.apache.zeppelin.interpreter.Interpreter.getInterpreterInTheSameSessionByClassName(Interpreter.java:366)
	at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:89)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
	... 8 more
Caused by: org.apache.zeppelin.interpreter.InterpreterException: Fail to open SparkInterpreter
	at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:122)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
	... 12 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.spark2CreateContext(BaseSparkScalaInterpreter.scala:301)
	at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.createSparkContext(BaseSparkScalaInterpreter.scala:230)
	at org.apache.zeppelin.spark.SparkScala212Interpreter.open(SparkScala212Interpreter.scala:90)
	at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:106)
	... 13 more
Caused by: org.apache.spark.SparkException: Detected yarn cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:402)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2574)
	at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:934)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:928)
	... 21 more
{code}
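For context on the root cause at the bottom of the trace: Spark refuses to construct a SparkContext when the configuration says yarn-cluster mode but the JVM was not deployed to YARN via spark-submit (SparkContext.scala:402). Below is an illustrative sketch of that kind of guard; it is not Spark's actual source, and the class and method names are hypothetical:

{code:java}
// Illustrative sketch only: mirrors the validation that raises the
// SparkException seen in the log above. Not Spark's real implementation.
public class DeployModeGuard {

    // Rejects yarn-cluster settings when the process was not launched
    // through spark-submit (i.e., no YARN ApplicationMaster context).
    public static void validate(String master,
                                String deployMode,
                                boolean launchedViaSparkSubmit) {
        if ("yarn".equals(master)
                && "cluster".equals(deployMode)
                && !launchedViaSparkSubmit) {
            throw new IllegalStateException(
                "Detected yarn cluster mode, but isn't running on a cluster. "
                    + "Deployment to YARN is not supported directly by "
                    + "SparkContext. Please use spark-submit.");
        }
    }
}
{code}

In the failing scenario above, the interpreter JVM builds the SparkSession itself, so a check like this fires; in the working single-node case the equivalent of `launchedViaSparkSubmit` is effectively satisfied by the spark-submit wrapper that Zeppelin generates.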
--
This message was sent by Atlassian Jira
(v8.3.4#803005)