Hi Fabrizio,
At the moment I think Zeppelin does not support running Spark jobs in
cluster mode. In practice, though, K8s mode already simulates cluster mode,
because the Zeppelin interpreter is started as a pod in K8s, just as a
manual spark-submit would do in cluster mode.
spark-submit is called only once, when the Zeppelin interpreter starts.
You will find the call in these lines:
https://github.com/apache/zeppelin/blob/2f55fe8ed277b28d71f858633f9c9d76fd18f0c3/bin/interpreter.sh#L303-L305
Best Regards
Philipp
On 25.10.21 at 21:58, Fabrizio Fab wrote:
Dear All, I have been struggling for more than a week with the following problem.
My Zeppelin server is running outside the k8s cluster (there is a reason for
this), and I am able to run Spark Zeppelin notes in client mode but not in
cluster mode.
I see that, at first, a pod for the interpreter (RemoteInterpreterServer) is
created on the cluster by spark-submit from the Zeppelin host, with
deployMode=cluster (and this happens without errors), then the interpreter
itself runs another spark-submit (this time from the Pod) with
deployMode=client.
To be exact, the following is the command line submitted by the interpreter from
its pod:
/opt/spark/bin/spark-submit \
--conf spark.driver.bindAddress=<ip address of the interpreter pod> \
--deploy-mode client \
--properties-file /opt/spark/conf/spark.properties \
--class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
spark-internal \
<ZEPPELIN_HOST> \
<ZEPPELIN_SERVER_RPC_PORT> \
<interpreter_name>-<user name>
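For anyone who wants to compare this against their own setup, here is a minimal Python sketch that rebuilds the argument list above and reads back the deploy mode. The helper names `build_interpreter_submit` and `deploy_mode`, and the sample IP, host, port and interpreter id, are hypothetical and only for illustration; they are not part of Zeppelin or Spark.

```python
import shlex


def build_interpreter_submit(pod_ip, zeppelin_host, rpc_port, interpreter_id):
    """Assemble the spark-submit argument list quoted above (illustrative helper)."""
    return [
        "/opt/spark/bin/spark-submit",
        "--conf", f"spark.driver.bindAddress={pod_ip}",
        "--deploy-mode", "client",
        "--properties-file", "/opt/spark/conf/spark.properties",
        "--class", "org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer",
        "spark-internal",
        zeppelin_host,
        str(rpc_port),
        interpreter_id,
    ]


def deploy_mode(argv):
    """Return the value that follows --deploy-mode in an argument list."""
    return argv[argv.index("--deploy-mode") + 1]


# Hypothetical values standing in for the placeholders in the command above.
cmd = build_interpreter_submit("10.0.0.12", "zeppelin.example.com", 12320, "spark-alice")
print(shlex.join(cmd))
print("deploy mode:", deploy_mode(cmd))
```

The last three positional arguments are the Zeppelin host, the server RPC port and the interpreter id, in the same order as in the command above.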
At this point, the interpreter Pod remains in "Running" state, while the Zeppelin note
remains in "Pending" forever.
The log of the Interpreter (level = DEBUG) at the end only says:
INFO [2021-10-25 18:16:58,229] ({RemoteInterpreterServer-Thread}
RemoteInterpreterServer.java[run]:194) Launching ThriftServer at <ip address of the
interpreter pod>:<random port>
INFO [2021-10-25 18:16:58,229] ({RegisterThread}
RemoteInterpreterServer.java[run]:592) Start registration
INFO [2021-10-25 18:16:58,332] ({RegisterThread}
RemoteInterpreterServer.java[run]:606) Registering interpreter process
INFO [2021-10-25 18:16:58,356] ({RegisterThread}
RemoteInterpreterServer.java[run]:608) Registered interpreter process
INFO [2021-10-25 18:16:58,356] ({RegisterThread}
RemoteInterpreterServer.java[run]:629) Registration finished
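To check a longer log for the same sequence, a small Python sketch like the following could be used. It only looks for the five messages shown in the excerpt above, in order; the function name `registration_progress` and the shortened sample text are assumptions for illustration, and the message list is taken from this excerpt only, not from the Zeppelin source.

```python
import re

# Messages from the log excerpt above, in the order they appear there.
EXPECTED_STEPS = [
    "Launching ThriftServer",
    "Start registration",
    "Registering interpreter process",
    "Registered interpreter process",
    "Registration finished",
]


def registration_progress(log_text):
    """Return the expected steps found in order, stopping at the first missing one."""
    # Collapse line wraps so a message split across lines still matches.
    normalized = re.sub(r"\s+", " ", log_text)
    reached = []
    pos = 0
    for step in EXPECTED_STEPS:
        found = normalized.find(step, pos)
        if found == -1:
            break
        reached.append(step)
        pos = found + len(step)
    return reached


# Shortened sample with the same messages and line wrapping as the excerpt.
sample = """INFO [2021-10-25 18:16:58,229] ({RemoteInterpreterServer-Thread}
RemoteInterpreterServer.java[run]:194) Launching ThriftServer at <ip>:<port>
INFO [2021-10-25 18:16:58,229] ({RegisterThread}
RemoteInterpreterServer.java[run]:592) Start registration
INFO [2021-10-25 18:16:58,332] ({RegisterThread}
RemoteInterpreterServer.java[run]:606) Registering interpreter process
INFO [2021-10-25 18:16:58,356] ({RegisterThread}
RemoteInterpreterServer.java[run]:608) Registered interpreter process
INFO [2021-10-25 18:16:58,356] ({RegisterThread}
RemoteInterpreterServer.java[run]:629) Registration finished
"""

steps = registration_progress(sample)
print(f"{len(steps)} of {len(EXPECTED_STEPS)} steps reached")
```

For the excerpt above, all five steps are present, so the interpreter pod reports that its registration completed.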
(I replaced the real IP and port with placeholders to make the log clearer
for you)
I am stuck at this point...
Can anyone help me? Thank you in advance. Fabrizio