I do my own dev work on a personal cluster I have down in Fremont, which I’ve got set up using k3sup. I know some devs use minikube (and our integration tests can too). But yeah, if there were a vendor willing to hand out Kube resources, that could simplify our dev cycles.

On Thu, Jul 1, 2021 at 12:52 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi,
>
> A rather simple question.
>
> As Kubernetes is a special work requiring some effort in setting it up
> properly, do we have a dev/test bed to conduct development work?
>
> What I am trying to get at is if there is official support for Volcano
> stuff that a vendor can provide free cluster usage in exchange for R & D.
> For example Google themselves?
>
> Thanks,
>
> Mich
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Thu, 1 Jul 2021 at 05:00, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Hi Klaus,
>>
>> Thanks
>>
>> https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/1289
>>
>> On Thu, 1 Jul 2021 at 03:16, Klaus Ma <klaus1982...@gmail.com> wrote:
>>
>>> Hi Mich,
>>>
>>> Would you help to open an issue at spark-on-k8s-operator repo? We're
>>> going to submit a PR to update the install steps :)
>>>
>>> -- Klaus
>>>
>>> On Wed, Jun 30, 2021 at 12:24 AM Mich Talebzadeh
>>> <mich.talebza...@gmail.com> wrote:
>>>
>>>> Hi Yikun
>>>>
>>>> In reference
>>>>
>>>> https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/volcano-integration.md
>>>>
>>>> Trying to install Volcano I am getting this error
>>>>
>>>> helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
>>>> Error: looks like "http://storage.googleapis.com/kubernetes-charts-incubator"
>>>> is not a valid chart repository or cannot be reached: failed to fetch
>>>> http://storage.googleapis.com/kubernetes-charts-incubator/index.yaml :
>>>> 404 Not Found
>>>>
>>>> Any ideas will be appreciated.
>>>>
>>>> Thanks,
>>>>
>>>> Mich
>>>>
>>>> On Tue, 29 Jun 2021 at 09:14, Mich Talebzadeh
>>>> <mich.talebza...@gmail.com> wrote:
>>>>
>>>>> Cool, thanks!
>>>>>
>>>>> On Tue, 29 Jun 2021 at 07:33, Yikun Jiang <yikunk...@gmail.com> wrote:
>>>>>
>>>>>> > Is this the correct link for integrating Volcano with Spark?
>>>>>>
>>>>>> Yes, it is the Kubernetes operator style of integrating Volcano. And
>>>>>> if you want to just use spark submit style to submit a native support
>>>>>> job, you can see [2] as ref.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/huawei-cloudnative/spark/commit/6c1f37525f026353eaead34216d47dad653f13a4
>>>>>>
>>>>>> Regards,
>>>>>> Yikun
>>>>>>
>>>>>> Mich Talebzadeh <mich.talebza...@gmail.com> wrote on Mon, Jun 28,
>>>>>> 2021 at 6:03 PM:
>>>>>>
>>>>>>> Hi Yikun,
>>>>>>>
>>>>>>> Is this the correct link for integrating Volcano with Spark?
>>>>>>>
>>>>>>> spark-on-k8s-operator/volcano-integration.md at master ·
>>>>>>> GoogleCloudPlatform/spark-on-k8s-operator · GitHub
>>>>>>> <https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/volcano-integration.md>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Mich
>>>>>>>
>>>>>>> On Fri, 25 Jun 2021 at 09:45, Yikun Jiang <yikunk...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Oops, sorry for the error link, it should be:
>>>>>>>>
>>>>>>>> We will also prepare to propose an initial design and POC[3] on a
>>>>>>>> shared branch (based on spark master branch) where we can
>>>>>>>> collaborate on it, so I created the spark-volcano[1] org in github
>>>>>>>> to make it happen.
>>>>>>>>
>>>>>>>> [3]
>>>>>>>> https://github.com/huawei-cloudnative/spark/commit/6c1f37525f026353eaead34216d47dad653f13a4
>>>>>>>>
>>>>>>>> And
>>>>>>>> Regards,
>>>>>>>> Yikun
>>>>>>>>
>>>>>>>> Yikun Jiang <yikunk...@gmail.com> wrote on Fri, Jun 25, 2021 at
>>>>>>>> 11:53 AM:
>>>>>>>>
>>>>>>>>> Hi, folks.
>>>>>>>>>
>>>>>>>>> As @Klaus mentioned, we have some work on Spark on k8s with
>>>>>>>>> Volcano native support. There was also some production deployment
>>>>>>>>> validation from our partners in China, like JingDong, XiaoHongShu,
>>>>>>>>> VIPshop.
>>>>>>>>>
>>>>>>>>> We will also prepare to propose an initial design and POC[3] on a
>>>>>>>>> shared branch (based on spark master branch) where we can
>>>>>>>>> collaborate on it, so I created the spark-volcano[1] org in github
>>>>>>>>> to make it happen.
>>>>>>>>>
>>>>>>>>> Pls feel free to comment on it [2] if you guys have any questions
>>>>>>>>> or concerns.
>>>>>>>>>
>>>>>>>>> [1] https://github.com/spark-volcano
>>>>>>>>> [2] https://github.com/spark-volcano/spark/issues/1
>>>>>>>>> [3]
>>>>>>>>> https://github.com/huawei-cloudnative/spark/commit/6c1f37525f026353eaead34216d47dad653f13a4
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Yikun
>>>>>>>>>
>>>>>>>>> Holden Karau <hol...@pigscanfly.ca> wrote on Fri, Jun 25, 2021 at
>>>>>>>>> 12:00 AM:
>>>>>>>>>
>>>>>>>>>> Hi Mich,
>>>>>>>>>>
>>>>>>>>>> I certainly think making Spark on Kubernetes run well is going to
>>>>>>>>>> be a challenge. However I think, and I could be wrong about this
>>>>>>>>>> as well, that in terms of cluster managers Kubernetes is likely to
>>>>>>>>>> be our future. Talking with people I don't hear about new
>>>>>>>>>> standalone, YARN or mesos deployments of Spark, but I do hear
>>>>>>>>>> about people trying to migrate to Kubernetes.
>>>>>>>>>>
>>>>>>>>>> To be clear I certainly agree that we need more work on
>>>>>>>>>> structured streaming, but it's important to remember that the
>>>>>>>>>> Spark developers are not all fully interchangeable. We work on the
>>>>>>>>>> things that we're interested in pursuing, so even if structured
>>>>>>>>>> streaming needs more love, if I'm not super interested in
>>>>>>>>>> structured streaming I'm less likely to work on it. That being
>>>>>>>>>> said I am certainly spinning up a bit more in the Spark SQL area,
>>>>>>>>>> especially around our data source/connectors, because I can see
>>>>>>>>>> the need there too.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 23, 2021 at 8:26 AM Mich Talebzadeh
>>>>>>>>>> <mich.talebza...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Please allow me to be diverse and express a different point of
>>>>>>>>>>> view on this roadmap.
>>>>>>>>>>>
>>>>>>>>>>> I believe from a technical point of view spending time and
>>>>>>>>>>> effort plus talent on batch scheduling on Kubernetes could be
>>>>>>>>>>> rewarding. However, if I may say, I doubt whether such an
>>>>>>>>>>> approach and the so-called democratization of Spark on whatever
>>>>>>>>>>> platform really should be of great focus.
>>>>>>>>>>>
>>>>>>>>>>> Having worked on Google Dataproc
>>>>>>>>>>> <https://cloud.google.com/dataproc> (a fully managed and highly
>>>>>>>>>>> scalable service for running Apache Spark, Hadoop and more
>>>>>>>>>>> recently other artefacts) for the past two years, and Spark on
>>>>>>>>>>> Kubernetes on-premise, I have come to the conclusion that Spark
>>>>>>>>>>> is not a beast that one can fully commoditize, much like one can
>>>>>>>>>>> do with Zookeeper, Kafka etc. There is always a struggle to make
>>>>>>>>>>> some niche areas of Spark like Spark Structured Streaming (SSS)
>>>>>>>>>>> work seamlessly and effortlessly on these commercial platforms
>>>>>>>>>>> with whatever as a Service.
>>>>>>>>>>>
>>>>>>>>>>> Moreover, Spark (and I stand corrected) from the ground up
>>>>>>>>>>> already has a lot of resiliency and redundancy built in. It is
>>>>>>>>>>> truly an enterprise class product (requires enterprise class
>>>>>>>>>>> support) that will be difficult to commoditize with Kubernetes
>>>>>>>>>>> and expect the same performance. After all, Kubernetes is aimed
>>>>>>>>>>> at efficient resource sharing and potential cost saving for the
>>>>>>>>>>> mass market. In short I can see commercial enterprises will work
>>>>>>>>>>> on these platforms, but maybe the great talents on the dev team
>>>>>>>>>>> should focus on stuff like the perceived limitation of SSS in
>>>>>>>>>>> dealing with chains of aggregation (if I am correct it is not yet
>>>>>>>>>>> supported on streaming datasets).
>>>>>>>>>>>
>>>>>>>>>>> These are my opinions and they are not facts, just opinions so
>>>>>>>>>>> to speak :)
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 18 Jun 2021 at 23:18, Holden Karau <hol...@pigscanfly.ca>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I think these approaches are good, but there are limitations
>>>>>>>>>>>> (e.g. dynamic scaling) without us making changes inside of the
>>>>>>>>>>>> Spark Kube scheduler.
>>>>>>>>>>>>
>>>>>>>>>>>> Certainly whichever scheduler extensions we add support for we
>>>>>>>>>>>> should collaborate with the people developing those extensions
>>>>>>>>>>>> insofar as they are interested. The first place that I checked
>>>>>>>>>>>> was #sig-scheduling, which is fairly quiet on the Kubernetes
>>>>>>>>>>>> slack, but if there are more places to look for folks interested
>>>>>>>>>>>> in batch scheduling on Kubernetes we should definitely give it a
>>>>>>>>>>>> shot :)
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jun 18, 2021 at 1:41 AM Mich Talebzadeh
>>>>>>>>>>>> <mich.talebza...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regarding your point and I quote
>>>>>>>>>>>>>
>>>>>>>>>>>>> ".. I know that one of the Spark on Kube operators
>>>>>>>>>>>>> supports volcano/kube-batch so I was thinking that might be a
>>>>>>>>>>>>> place I would start exploring..."
>>>>>>>>>>>>>
>>>>>>>>>>>>> There seems to be ongoing work on say Volcano as part of Cloud
>>>>>>>>>>>>> Native Computing Foundation <https://cncf.io/> (CNCF). For
>>>>>>>>>>>>> example through https://github.com/volcano-sh/volcano
>>>>>>>>>>>>>
>>>>>>>>>>>>> There may be value-add in collaborating with such groups
>>>>>>>>>>>>> through CNCF in order to have a collective approach to such
>>>>>>>>>>>>> work. There also seems to be some work on Integration of Spark
>>>>>>>>>>>>> with Volcano for Batch Scheduling.
>>>>>>>>>>>>> <https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/volcano-integration.md>
>>>>>>>>>>>>>
>>>>>>>>>>>>> What is not very clear is the degree of progress of these
>>>>>>>>>>>>> projects. You may be kind enough to elaborate on KPIs for each
>>>>>>>>>>>>> of these projects and where you think your contributions are
>>>>>>>>>>>>> going to be.
>>>>>>>>>>>>>
>>>>>>>>>>>>> HTH,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Mich
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, 18 Jun 2021 at 00:44, Holden Karau
>>>>>>>>>>>>> <hol...@pigscanfly.ca> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Folks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm continuing my adventures to make Spark on containers
>>>>>>>>>>>>>> party and I was wondering if folks have experience with the
>>>>>>>>>>>>>> different batch scheduler options that they prefer? I was
>>>>>>>>>>>>>> thinking that, so that we can better support dynamic
>>>>>>>>>>>>>> allocation, it might make sense for us to support using
>>>>>>>>>>>>>> different schedulers, and I wanted to see if there are any
>>>>>>>>>>>>>> that the community is more interested in?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I know that one of the Spark on Kube operators supports
>>>>>>>>>>>>>> volcano/kube-batch so I was thinking that might be a place I
>>>>>>>>>>>>>> start exploring but also want to be open to other schedulers
>>>>>>>>>>>>>> that folks might be interested in.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Holden :)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>>>>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>>>>>>>>>>>> https://amzn.to/2MaRAG9
>>>>>>>>>>>>>> YouTube Live Streams:
>>>>>>>>>>>>>> https://www.youtube.com/user/holdenkarau
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau