Hi Fabrizio,

As far as I know, Zeppelin does not currently support running Spark jobs in cluster mode. In practice, however, K8s mode simulates cluster mode, because the Zeppelin interpreter is already started as a pod in K8s, just as a manual spark-submit in cluster mode would do.

spark-submit is called only once, when the Zeppelin interpreter starts. You will find the call in these lines: https://github.com/apache/zeppelin/blob/2f55fe8ed277b28d71f858633f9c9d76fd18f0c3/bin/interpreter.sh#L303-L305
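
To make that concrete, here is a rough dry-run sketch in shell of how such a single invocation is put together. It is not copied from interpreter.sh: the argument order simply follows the command quoted in your message, the function name build_submit_cmd is hypothetical, and every default value (pod IP, Zeppelin host, RPC port, interpreter id) is a made-up placeholder. It only prints the command; it does not run spark-submit.

```shell
#!/bin/sh
# Dry-run sketch: prints a spark-submit command line instead of executing it.
# All defaults below are placeholders (assumptions), not real settings.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"
POD_IP="${POD_IP:-10.0.0.1}"
ZEPPELIN_HOST="${ZEPPELIN_HOST:-zeppelin.example.com}"
ZEPPELIN_SERVER_RPC_PORT="${ZEPPELIN_SERVER_RPC_PORT:-12320}"
INTERPRETER_ID="${INTERPRETER_ID:-spark-someuser}"

# Hypothetical helper: assembles the arguments in the same order as the
# command quoted in the original message and prints them on one line.
build_submit_cmd() {
  printf '%s ' \
    "${SPARK_HOME}/bin/spark-submit" \
    --conf "spark.driver.bindAddress=${POD_IP}" \
    --deploy-mode client \
    --properties-file "${SPARK_HOME}/conf/spark.properties" \
    --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
    spark-internal \
    "${ZEPPELIN_HOST}" \
    "${ZEPPELIN_SERVER_RPC_PORT}" \
    "${INTERPRETER_ID}"
  printf '\n'
}

build_submit_cmd
```

With --deploy-mode client the Spark driver runs inside the process that calls spark-submit, which here is the interpreter pod itself, so only the executors are created as additional pods.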

Best Regards
Philipp


On 25.10.21 at 21:58, Fabrizio Fab wrote:
Dear All, I have been struggling for more than a week with the following problem.
My Zeppelin Server is running outside the k8s cluster (there is a reason for 
this) and I am able to run Spark zeppelin notes in Client mode but not in 
Cluster mode.

I see that, at first, a pod for the interpreter (RemoteInterpreterServer) is 
created on the cluster by spark-submit from the Zeppelin host, with 
deployMode=cluster (and this happens without errors), then the interpreter 
itself runs another spark-submit (this time from the Pod) with 
deployMode=client.

Specifically, this is the command line submitted by the interpreter from 
its pod:

/opt/spark/bin/spark-submit \
--conf spark.driver.bindAddress=<ip address of the interpreter pod> \
--deploy-mode client \
--properties-file /opt/spark/conf/spark.properties \
--class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
spark-internal \
<ZEPPELIN_HOST> \
<ZEPPELIN_SERVER_RPC_PORT> \
<interpreter_name>-<user name>

At this point, the interpreter Pod remains in "Running" state, while the Zeppelin note 
remains in "Pending" forever.

The log of the Interpreter (level = DEBUG) at the end only says:
  INFO [2021-10-25 18:16:58,229] ({RemoteInterpreterServer-Thread} RemoteInterpreterServer.java[run]:194) Launching ThriftServer at <ip address of the interpreter pod>:<random port>
  INFO [2021-10-25 18:16:58,229] ({RegisterThread} RemoteInterpreterServer.java[run]:592) Start registration
  INFO [2021-10-25 18:16:58,332] ({RegisterThread} RemoteInterpreterServer.java[run]:606) Registering interpreter process
  INFO [2021-10-25 18:16:58,356] ({RegisterThread} RemoteInterpreterServer.java[run]:608) Registered interpreter process
  INFO [2021-10-25 18:16:58,356] ({RegisterThread} RemoteInterpreterServer.java[run]:629) Registration finished
(I replaced the real IP and port with placeholders to make the log clearer 
for you)

I am stuck at this point.
Can anyone help me? Thank you in advance. Fabrizio
