Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Mich Talebzadeh
Splendid. Please invite me to the next meeting mich.talebza...@gmail.com Timezone London, UK *GMT+1* Thanks, *Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or des

Re: Spark on Kubernetes scheduler variety

2021-07-08 Thread Holden Karau
Hi Y'all, We had an initial meeting which went well and got some more context around Volcano and its near-term roadmap. We talked about the impact of scheduler deadlocking and some ways that we could potentially improve integration from the Spark and Volcano sides respectively. I'm going to sta

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Mich Talebzadeh
Thanks. I also have a three node cluster in my lab running Red Hat 7.6 with 64GB of RAM etc. However, I doubt whether minikube will be useful. If we can get a Google Kubernetes Engine (GKE) cluster (which is a fully managed service) from Google on a loan

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Holden Karau
I do my own dev work on a personal cluster I have down in Fremont which I’ve got set up using k3sup. I know some devs use minikube (and our integration tests can). But yeah, if there was a vendor willing to hand out Kube resources, that could simplify our dev cycles. On Thu, Jul 1, 2021 at 12:52 PM M

Re: Spark on Kubernetes scheduler variety

2021-07-01 Thread Mich Talebzadeh
Hi, A rather simple question. As Kubernetes is specialized work that takes some effort to set up properly, do we have a dev/test bed to conduct development work? What I am trying to get at is if there is official support for Volcano stuff that a vendor can provide free cluster usage in excha

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Mich Talebzadeh
Hi Klaus, Thanks https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1289

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Klaus Ma
Hi Mich, Would you help open an issue in the spark-on-k8s-operator repo? We're going to submit a PR to update the install steps :) -- Klaus On Wed, Jun 30, 2021 at 12:24 AM Mich Talebzadeh wrote: > Hi Yikun > > In reference > > > https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob

Re: Spark on Kubernetes scheduler variety

2021-06-30 Thread Mich Talebzadeh
Hi Michel, Thanks for the link. I am familiar with G-Research as I met them at my presentation in London back in October 2019. The Armada project seems to create super-scheduling on top of Kubernetes clusters and I quote: "Armada is an application to achieve high throughput of run-to-completion

Re: Spark on Kubernetes scheduler variety

2021-06-29 Thread Mich Talebzadeh
Hi Yikun, In reference to https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/volcano-integration.md Trying to install Volcano, I am getting this error: helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator Error: looks like "http://storage.google

Re: Spark on Kubernetes scheduler variety

2021-06-29 Thread Mich Talebzadeh
Cool, thanks!

Re: Spark on Kubernetes scheduler variety

2021-06-28 Thread Yikun Jiang
> Is this the correct link for integrating Volcano with Spark? Yes, it is the Kubernetes operator style of integrating Volcano. If you just want to use the spark-submit style to submit a native support job, see [2] for reference. [1] https://github.com/huawei-cloudnative/spark/commit/6c1f37525f02635

Re: Spark on Kubernetes scheduler variety

2021-06-28 Thread Mich Talebzadeh
Hi Yikun, Is this the correct link for integrating Volcano with Spark? spark-on-k8s-operator/volcano-integration.md at master · GoogleCloudPlatform/spark-on-k8s-operator · GitHub Thanks Mich

Re: Spark on Kubernetes scheduler variety

2021-06-25 Thread Yikun Jiang
Oops, sorry for the wrong link; it should be: We will also prepare to propose an initial design and POC[3] on a shared branch (based on the Spark master branch) where we can collaborate on it, so I created the spark-volcano[1] org on GitHub to make it happen. [3] https://github.com/huawei-cloudnative

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
Thanks Yikun! On Thu, Jun 24, 2021 at 8:54 PM Yikun Jiang wrote: > Hi, folks. > > As @Klaus mentioned, We have some work on Spark on k8s with volcano native > support. Also, there were also some production deployment validation from > our partners in China, like JingDong, XiaoHongShu, VIPshop. >

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Yikun Jiang
Hi, folks. As @Klaus mentioned, we have done some work on Spark on k8s with native Volcano support. There has also been some production deployment validation by our partners in China, such as JingDong, XiaoHongShu, and VIPshop. We will also prepare to propose an initial design and POC[3] on a shared bran

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Hi Holden, Thank you for your points. I guess, coming from the corporate world, I had overlooked how an open source project like Spark leverages resources and interest :). As @KlausMa kindly volunteered, it would be good to hear scheduling ideas for Spark on Kubernetes, and of course as I am su

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
Hi Mich, I certainly think making Spark on Kubernetes run well is going to be a challenge. However, I think, and I could be wrong about this as well, that in terms of cluster managers Kubernetes is likely to be our future. Talking with people, I don't hear about new standalone, YARN or Mesos deploym

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Holden Karau
That's awesome, I'm just starting to get context around Volcano but maybe we can schedule an initial meeting for all of us interested in pursuing this to get on the same page. On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote: > Hi team, > > I'm kube-batch/Volcano founder, and I'm excited to hear t

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread John Zhuge
Thanks Klaus! I am interested in more details. On Wed, Jun 23, 2021 at 6:54 PM Klaus Ma wrote: > Hi team, > > I'm kube-batch/Volcano founder, and I'm excited to hear that the spark > community also has such requirements :) > > Volcano provides several features for batch workload, e.g. fair-share

Re: Spark on Kubernetes scheduler variety

2021-06-24 Thread Mich Talebzadeh
Thanks Klaus. That will be great. It would also be helpful if you could elaborate on the need for this feature in light of the limitations of the current batch workload. Regards, Mich

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Klaus Ma
Hi team, I'm the kube-batch/Volcano founder, and I'm excited to hear that the Spark community also has such requirements :) Volcano provides several features for batch workloads, e.g. fair-share, queue, reservation, preemption/reclaim and so on. It has been used in several production environments with Sp

Re: Spark on Kubernetes scheduler variety

2021-06-23 Thread Mich Talebzadeh
Please allow me to express a different point of view on this roadmap. I believe that, from a technical point of view, spending time, effort and talent on batch scheduling on Kubernetes could be rewarding. However, if I may say so, I doubt whether such an approach and the so-called democra

Re: Spark on Kubernetes scheduler variety

2021-06-18 Thread Holden Karau
I think these approaches are good, but there are limitations (e.g. dynamic scaling) unless we make changes inside the Spark Kube scheduler. Certainly, whichever scheduler extensions we add support for, we should collaborate with the people developing those extensions insofar as they are interest

Re: Spark on Kubernetes scheduler variety

2021-06-18 Thread Mich Talebzadeh
Hi, Regarding your point, and I quote: ".. I know that one of the Spark on Kube operators supports volcano/kube-batch so I was thinking that might be a place I would start exploring..." There seems to be ongoing work on, say, Volcano as part of the Cloud Native Computing Foundation

Re: Spark on Kubernetes Builder Pattern Design Document

2018-02-05 Thread Mark Hamstra
ork and migrate all the work done on the fork into the main > line. > > > > -Matt Cheah > > > > *From: *Mark Hamstra > *Date: *Monday, February 5, 2018 at 1:44 PM > *To: *Matt Cheah > *Cc: *"dev@spark.apache.org" , " > ramanath...@googl

Re: Spark on Kubernetes Builder Pattern Design Document

2018-02-05 Thread Matt Cheah
e.org" , "ramanath...@google.com" , Ilan Filonenko , Erik , Marcelo Vanzin Subject: Re: Spark on Kubernetes Builder Pattern Design Document That's good, but you should probably stop and consider whether the discussions that led up to this document's creation could hav

Re: Spark on Kubernetes Builder Pattern Design Document

2018-02-05 Thread Mark Hamstra
That's good, but you should probably stop and consider whether the discussions that led up to this document's creation could have taken place on this dev list -- because if they could have, then they probably should have as part of the whole spark-on-k8s project becoming part of mainline spark deve

Re: Spark on Kubernetes: Birds-of-a-Feather Session 12:50pm 6/6

2017-06-05 Thread lucas.g...@gmail.com
Very much looking forward to this session! Raj, it's a Spark Summit session: https://spark-summit.org/2017/schedule/ 12:50 PM Lunch BoF Discussion-Deep Learning on Apache Spark - Jason Dai (Intel) There is increasing interest and applicatio

RE: Spark on Kubernetes: Birds-of-a-Feather Session 12:50pm 6/6

2017-06-05 Thread Raj, Deepu
Hi Erik, Can you please share the details (Timezone, Webex Details)? Thanks, Deepu Raj From: Erik Erlandson [mailto:eerla...@redhat.com] Sent: Tuesday, 6 June 2017 10:28 AM To: dev@spark.apache.org Subject: Spark on Kubernetes: Birds-of-a-Feather Session 12:50pm 6/6 Come learn about the communi

Re: spark on kubernetes

2016-05-23 Thread Gurvinder Singh
OK, I created this issue: https://issues.apache.org/jira/browse/SPARK-15487 Please comment on it, and let me know if anyone wants to collaborate on implementing it. It's my first contribution to Spark, so it will be exciting. - Gurvinder On 05/23/2016 07:55 PM, Gurvinder Singh wrote: > On 05/23/2016 0

Re: spark on kubernetes

2016-05-23 Thread Gurvinder Singh
On 05/23/2016 07:18 PM, Radoslaw Gruchalski wrote: > Sounds surprisingly close to this: > https://github.com/apache/spark/pull/9608 > I might have overlooked it, but the bridge mode work appears to make Spark work with Docker containers and be able to communicate with them when running on more than one ma

Re: spark on kubernetes

2016-05-23 Thread Radoslaw Gruchalski
Sounds surprisingly close to this: https://github.com/apache/spark/pull/9608 I can resurrect the work on the bridge mode for Spark 2. The work on the old one was suspended because Spark was going through so many changes at that time that a lot of the work done was wiped out by th

Re: spark on kubernetes

2016-05-23 Thread Timothy Chen
This will also simplify things for Mesos users; DCOS has to work around this with our own proxying. Tim On Sun, May 22, 2016 at 11:53 PM, Gurvinder Singh wrote: > Hi Reynold, > > So if that's OK with you, can I go ahead and create JIRA for this. As it > seems this feature is missing currently and c

Re: spark on kubernetes

2016-05-22 Thread Gurvinder Singh
Hi Reynold, So if that's OK with you, can I go ahead and create a JIRA for this? It seems this feature is currently missing and could benefit not just Kubernetes users but Spark standalone mode users in general. - Gurvinder On 05/22/2016 12:49 PM, Gurvinder Singh wrote: > On 05/22/2016 10:

Re: spark on kubernetes

2016-05-22 Thread Gurvinder Singh
On 05/22/2016 10:23 AM, Sun Rui wrote: > If it is possible to rewrite URL in outbound responses in Knox or other > reverse proxy, would that solve your issue? Any process which can keep track of worker and application driver IP addresses and route traffic to them will work. Considering Spark Ma

Re: spark on kubernetes

2016-05-22 Thread Sun Rui
If it is possible to rewrite URL in outbound responses in Knox or other reverse proxy, would that solve your issue? > On May 22, 2016, at 14:55, Gurvinder Singh wrote: > > On 05/22/2016 08:32 AM, Reynold Xin wrote: >> Kubernetes itself already has facilities for http proxy, doesn't it? >> > Yea

Re: spark on kubernetes

2016-05-21 Thread Gurvinder Singh
On 05/22/2016 08:32 AM, Reynold Xin wrote: > Kubernetes itself already has facilities for http proxy, doesn't it? > Yeah, Kubernetes has an ingress controller which can act as the L7 load balancer and route traffic to the Spark UI in this case. But I am referring to the link present in the UI to worker and applicat

Re: spark on kubernetes

2016-05-21 Thread Gurvinder Singh
On 05/22/2016 08:30 AM, Sun Rui wrote: > I think “reverse proxy” is beneficial to monitoring a cluster in a > secure way. This feature is not only desired for Spark on standalone, > but also Spark on YARN, and also projects other than spark. I think to secure Spark you can use any reverse prox

Re: spark on kubernetes

2016-05-21 Thread Reynold Xin
Kubernetes itself already has facilities for http proxy, doesn't it? On Sat, May 21, 2016 at 9:30 AM, Gurvinder Singh wrote: > Hi, > > I am currently working on deploying Spark on Kubernetes (K8s) and it is > working fine. I am running Spark with standalone mode and checkpointing > the state to

Re: spark on kubernetes

2016-05-21 Thread Sun Rui
I think “reverse proxy” is beneficial to monitoring a cluster in a secure way. This feature is desired not only for Spark on standalone, but also for Spark on YARN and for projects other than Spark. Maybe Apache Knox can help you. Not sure how Knox can integrate with Spark. > On May 22, 2016, at