Dear All, I have been struggling for more than a week with the following
problem. My Zeppelin server is running outside the k8s cluster (there is a
reason for this), and I am able to run Spark Zeppelin notes in client mode but
not in cluster mode.
I see that, at first, a pod for the interpreter (RemoteInterpreterServer) is
created on the cluster by spark-submit from the Zeppelin host, with
deployMode=cluster (and this happens without errors), then the interpreter
itself runs another spark-submit (this time from the Pod) with
deployMode=client.
Specifically, the following is the command line submitted by the interpreter
from its pod:
/opt/spark/bin/spark-submit \
--conf spark.driver.bindAddress=<ip address of the interpreter pod> \
--deploy-mode client \
--properties-file /opt/spark/conf/spark.properties \
--class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
spark-internal \
<ZEPPELIN_HOST> \
<ZEPPELIN_SERVER_RPC_PORT> \
<interpreter_name>-<user name>
At this point, the interpreter Pod remains in "Running" state, while the
Zeppelin note remains in "Pending" forever.
The log of the Interpreter (level = DEBUG) at the end only says:
INFO [2021-10-25 18:16:58,229] ({RemoteInterpreterServer-Thread} RemoteInterpreterServer.java[run]:194) Launching ThriftServer at <ip address of the interpreter pod>:<random port>
INFO [2021-10-25 18:16:58,229] ({RegisterThread} RemoteInterpreterServer.java[run]:592) Start registration
INFO [2021-10-25 18:16:58,332] ({RegisterThread} RemoteInterpreterServer.java[run]:606) Registering interpreter process
INFO [2021-10-25 18:16:58,356] ({RegisterThread} RemoteInterpreterServer.java[run]:608) Registered interpreter process
INFO [2021-10-25 18:16:58,356] ({RegisterThread} RemoteInterpreterServer.java[run]:629) Registration finished
(I replaced the real IP address and port with placeholders to make the log
clearer.)
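One check that might help narrow this down: the log shows the interpreter's
ThriftServer listening on the pod IP, and I assume (this is not confirmed) that
the Zeppelin server has to connect back to that address after registration.
Since the Zeppelin host is outside the cluster, it may be worth testing whether
that address is reachable from there at all. The following is only a minimal
sketch in Python; the function name can_connect is hypothetical, and the host
and port must be taken from the "Launching ThriftServer" log line. It tests a
plain TCP connection only, not the Thrift protocol itself.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds.

    Hypothetical helper for illustration only: it does not speak Thrift,
    it just checks that the address is reachable from the machine it runs on.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect,
        # raising OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage, to be run on the Zeppelin host (placeholders, not real values):
#   can_connect("<ip address of the interpreter pod>", <random port>)
```

If this returns False from the Zeppelin host but True from inside the cluster,
that would point to a routing problem between the host and the pod network
rather than to the interpreter itself.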
I am stuck at this point.
Can anyone help me? Thank you in advance. Fabrizio