Hi Prabodh,

Thank you for your response.
As you can see from the following JIRA issue, it is possible to run the
Spark Connect Driver on Kubernetes:

https://issues.apache.org/jira/browse/SPARK-45769

However, that issue describes a problem that occurs when the Driver and
Executors run on different nodes. This may be the reason why only
Standalone mode is currently supported, but I am not certain.

(For reference, I have appended two minimal sketches of the setups Prabodh
describes below the quoted thread.)

Thank you for your attention.

On Mon, Sep 9, 2024 at 12:40 PM Prabodh Agarwal <prabodh1...@gmail.com> wrote:

> My two cents regarding my experience with using Spark Connect in cluster
> mode.
>
> 1. Create a Spark cluster of two or more nodes. Make one node the master
> and the other nodes workers. Deploy Spark Connect pointing to the master
> node. This works well. The approach is not well documented, but I could
> figure it out by trial and error.
> 2. In k8s, by default, we can actually get the executors to run on
> Kubernetes itself. That is pretty straightforward, but the driver
> continues to run on a local machine. But yeah, I agree as well, making
> the driver run on k8s itself would be slick.
>
> Thank you.
>
>
> On Mon, Sep 9, 2024 at 6:17 AM Nagatomi Yasukazu <yassan0...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> Why is it not possible to specify cluster as the deploy mode for Spark
>> Connect?
>>
>> As discussed in the following thread, it appears that there is an
>> "arbitrary decision" within spark-submit that "Cluster mode is not
>> applicable" to Spark Connect.
>>
>> GitHub Issue Comment:
>> https://github.com/kubeflow/spark-operator/issues/1801#issuecomment-2000494607
>>
>> > This will circumvent the submission error you may have gotten if you
>> tried to just run the SparkConnectServer directly. From my investigation,
>> that looks to be an arbitrary decision within spark-submit that Cluster
>> mode is "not applicable" to SparkConnect. Which is sort of true except
>> when using this operator :)
>>
>> I have reviewed the following commit and pull request, but I could not
>> find any discussion or reason explaining why cluster mode is not
>> available:
>>
>> Related Commit:
>> https://github.com/apache/spark/commit/11260310f65e1a30f6b00b380350e414609c5fd4
>>
>> Related Pull Request:
>> https://github.com/apache/spark/pull/39928
>>
>> This restriction poses a significant obstacle when trying to use Spark
>> Connect with the Spark Operator. If there is a technical reason for it,
>> I would like to know more about it. Additionally, if this issue is being
>> tracked on JIRA or elsewhere, I would appreciate a link.
>>
>> Thank you in advance.
>>
>> Best regards,
>> Yasukazu Nagatomi
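P.S. Here is a minimal sketch of the client side of Prabodh's point 1
(Spark Connect against a standalone cluster). It assumes Spark 3.5.x,
PySpark installed with the Connect extras (pip install "pyspark[connect]"),
a standalone master at spark://master-host:7077, and the Connect server
started on the master node with
$SPARK_HOME/sbin/start-connect-server.sh --master spark://master-host:7077.
The host names and ports are placeholders, not values from this thread.

    # Thin PySpark client talking to the Spark Connect server's gRPC
    # endpoint (15002 is the default port; "master-host" is a placeholder).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.remote("sc://master-host:15002").getOrCreate()

    # A trivial query to confirm the session works end to end; the work
    # runs on the standalone cluster, not in this client process.
    spark.range(10).selectExpr("sum(id) AS total").show()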
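And a sketch of the client-mode behavior in point 2, where the driver stays
on the local machine while the executors run as Kubernetes pods. This uses a
plain SparkSession rather than the Connect server, but start-connect-server.sh
accepts the same master URL and configuration options as spark-submit, so the
same settings should apply there. The API server URL, namespace, image, and
driver host below are placeholders I chose for illustration.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        # Placeholder Kubernetes API server address.
        .master("k8s://https://api-server-host:6443")
        # Client mode: the driver stays in this local Python process.
        .config("spark.submit.deployMode", "client")
        .config("spark.kubernetes.namespace", "spark")
        .config("spark.kubernetes.container.image", "apache/spark:3.5.1")
        # Executor pods must be able to reach the driver on this address
        # (placeholder host name).
        .config("spark.driver.host", "driver-host.example.com")
        .getOrCreate()
    )

    spark.range(100).count()  # runs on the executor pods in the cluster
    spark.stop()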