This is good news. :-) Thanks for the support.
Excuse the thumb typos
On Fri, 07 Feb 2025 at 6:25 PM, Wenchen Fan wrote:
> Hi all,
>
> The vote for "Publish additional Spark distribution with Spark Connect
> enabled" passes with 22 +1s (13 binding +1s)
>
> (* = binding)
> +1:
> - Mridul Muralidharan *
Hi all,
The vote for "Publish additional Spark distribution with Spark Connect
enabled" passes with 22 +1s (13 binding +1s)
(* = binding)
+1:
- Mridul Muralidharan *
- Hyukjin Kwon *
- Jungtaek Lim
- Xiao Li *
- DB Tsai *
- Sakthi
- Gengliang Wang *
- L. C. Hsieh *
- Yang Jie *
- Max Gekk *
- Yum
Yes, if this becomes a need that surfaces time and again, then it's worthwhile
to start a broader discussion in the form of a high-level proposal, which could
spark a favorable discussion and lead to next steps.
Cheers,
Jules
On Feb 7, 2025, at 8:00 AM,
Well, everything is possible. Please initiate a discussion on a proposal to
"Create a pluggable cluster manager" and put it to the community.
See some examples here
https://lists.apache.org/list.html?dev@spark.apache.org
HTH
Dr Mich Talebzadeh,
Architect | Data Science | Financial
Agreed. If the goal is to make Spark truly pluggable, the spark-submit tool
itself should be more flexible in handling different cluster managers and
their specific requirements.
1. Back in the day, Spark's initial development focused on a limited
set of cluster managers (Standalone, YARN).
This External Cluster Manager is an amazing concept and I really like the
separation.
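For anyone who hasn't looked at it, here is a rough sketch of what an
implementation looks like. The ExternalClusterManager trait and its
ServiceLoader-based discovery are Spark's actual mechanism; the Armada-flavored
class name and the "k8s-armada://" scheme below are purely hypothetical
placeholders:

package org.apache.spark.scheduler.armada

import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{ExternalClusterManager, SchedulerBackend, TaskScheduler, TaskSchedulerImpl}

// Hypothetical manager claiming a made-up "k8s-armada://" scheme.
// ExternalClusterManager is private[spark], so the implementation has to
// live under the org.apache.spark package to compile against spark-core.
private[spark] class ArmadaClusterManager extends ExternalClusterManager {

  // SparkContext probes every manager discovered via ServiceLoader;
  // exactly one should return true for a given master URL.
  override def canCreate(masterURL: String): Boolean =
    masterURL.startsWith("k8s-armada://")

  override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler =
    new TaskSchedulerImpl(sc)

  override def createSchedulerBackend(
      sc: SparkContext,
      masterURL: String,
      scheduler: TaskScheduler): SchedulerBackend =
    ??? // a SchedulerBackend that acquires executors from Armada would go here

  override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit =
    scheduler.asInstanceOf[TaskSchedulerImpl].initialize(backend)
}

Discovery happens through Java's ServiceLoader: the jar also needs a
META-INF/services/org.apache.spark.scheduler.ExternalClusterManager file whose
single line is the implementation's fully qualified class name.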
Would it be possible to include a broader group and discuss an approach for
making Spark more pluggable? It is a bit far-fetched, but we would be very
much interested in working on this if it resonates well.
To me, this seems like a gap in the "pluggable cluster manager"
implementation.
What is the value of making cluster managers pluggable, if spark-submit
doesn't accept jobs on those cluster managers?
It seems to me that, for pluggable cluster managers to work, you would want some
parts of spark-submit
Well, you can try using an environment variable and a custom script that
modifies the --master URL before invoking spark-submit. The script could
replace "k8s://" with another identifier of your choice (e.g.
"k8s-armada://"), and you could then modify the SparkSubmit code to handle
this custom URL scheme.
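If you would rather stay on the JVM than shell-script it, the same rewrite can
be sketched with Spark's SparkLauncher API. The "k8s-armada://" scheme, the
environment variable name, the jar path, and the main class below are all
made-up placeholders:

import org.apache.spark.launcher.SparkLauncher

object RewriteAndSubmit {
  def main(args: Array[String]): Unit = {
    // Hypothetical custom scheme, rewritten to the scheme spark-submit accepts.
    val rawMaster = sys.env.getOrElse("SPARK_MASTER_URL", "k8s-armada://https://1.2.3.4:443")
    val master = rawMaster.replaceFirst("^k8s-armada://", "k8s://")

    val app = new SparkLauncher()
      .setMaster(master)
      .setDeployMode("client")
      .setAppResource("local:///opt/spark/app.jar") // placeholder path
      .setMainClass("com.example.Main")             // placeholder class
      .launch()
    app.waitFor()
  }
}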
Thanks for the reply, Mich!
Good point. The issue is that cluster deploy mode is not possible
when the master is local (
https://github.com/apache/spark/blob/9cf98ed41b2de1b44c44f0b4d1273d46761459fe/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L308
).
The only way to work around this scenar
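For context, the linked check is part of SparkSubmit's validation of
(clusterManager, deployMode) combinations. Paraphrased from memory rather than
quoted, the rejected case looks roughly like this (see the link for the exact
code):

// Paraphrase of the validation around the linked line, not verbatim:
(clusterManager, deployMode) match {
  case (LOCAL, CLUSTER) =>
    error("Cluster deploy mode is not compatible with master \"local\"")
  // ... other (manager, mode) pairs are validated similarly ...
  case _ => // valid combination, continue
}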
Well, that should work, but there are some considerations.
When you use
spark-submit --verbose \
--properties-file ${property_file} \
--master k8s://https://$KUBERNETES_MASTER_IP:443 \
--deploy-mode client \
--name sparkBQ \
with --deploy-mode client, that im
I got it to work by running it in client mode and using the `local://`
prefix. My external cluster manager gets injected just fine.
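That matches how the discovery works: in client mode the driver, and with it
the ServiceLoader lookup of ExternalClusterManager implementations, runs in
your own JVM, so something like the following picks up a custom manager from
the classpath. The "k8s-armada://" scheme and the app name are placeholders:

import org.apache.spark.sql.SparkSession

object ClientModeSmokeTest {
  def main(args: Array[String]): Unit = {
    // The custom scheme is claimed by the manager's canCreate();
    // "k8s-armada://" is hypothetical.
    val spark = SparkSession.builder()
      .master("k8s-armada://https://1.2.3.4:443")
      .appName("external-cluster-manager-smoke-test")
      .getOrCreate()
    println(spark.sparkContext.master)
    spark.stop()
  }
}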
On Fri, Feb 7, 2025 at 12:38 AM Dejan Pejchev wrote:
> Hello Spark community!
>
> My name is Dejan Pejchev, and I am a Software Engineer working at
> G-Research, a
Hi Mich,
Yes, the project is fully open-source and adopted by enterprises who do
very large scale batch scheduling and data processing.
The GitHub repository is https://github.com/armadaproject/armada and the
Armada Operator is the simplest way to install it
https://github.com/armadaproject/armad