Re: Apache Spark 3.4.3 (?)

2024-04-07 Thread Jungtaek Lim
Sounds like a plan. +1 (non-binding) Thanks for volunteering! On Sun, Apr 7, 2024 at 5:45 AM Dongjoon Hyun wrote: > Hi, All. > > Apache Spark 3.4.2 tag was created on Nov 24th and `branch-3.4` has 85 > commits including important security and correctness patches like > SPARK-45580, SPARK-46092,

Fwd: Apache Spark 3.4.3 (?)

2024-04-07 Thread Mich Talebzadeh
Mich Talebzadeh, Technologist | Solutions Architect | Data Engineer | Generative AI London United Kingdom view my Linkedin profile https://en.everybodywiki.com/Mich_Talebzadeh *Disclaimer:* The information provided is correct to

Re: Apache Spark 3.4.3 (?)

2024-04-07 Thread L. C. Hsieh
+1 Thanks Dongjoon! On Sun, Apr 7, 2024 at 1:56 AM Kent Yao wrote: > > +1, thank you, Dongjoon > > > Kent > > On Sun, Apr 7, 2024 at 14:54, Holden Karau wrote: > > > > Sounds good to me :) > > > > Twitter: https://twitter.com/holdenkarau > > Books (Learning Spark, High Performance Spark, etc.): > > https://a

Re: External Spark shuffle service for k8s

2024-04-07 Thread Mich Talebzadeh
Thanks Cheng for the heads up. I will have a look. Cheers Mich Talebzadeh

Re: External Spark shuffle service for k8s

2024-04-07 Thread Cheng Pan
Instead of External Shuffle Service, Apache Celeborn might be a good option as a Remote Shuffle Service for Spark on K8s. There are some useful resources you might be interested in. [1] https://celeborn.apache.org/ [2] https://www.youtube.com/watch?v=s5xOtG6Venw [3] https://github.com/aws-samples
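
As a rough illustration of the suggestion above, the sketch below collects the Spark settings usually associated with pointing an application at a Celeborn cluster. It is not taken from the message: the helper name `celeborn_conf` and the host names are hypothetical, port 9097 is only a placeholder, and the property names reflect my reading of the Celeborn documentation, so check them against the Celeborn release you deploy.

```python
from typing import Dict, List


def celeborn_conf(master_endpoints: List[str]) -> Dict[str, str]:
    """Build a Spark conf dict for a Celeborn remote shuffle service.

    Hypothetical helper for illustration only. The property names are
    assumptions based on the Celeborn docs, not on the original message.
    """
    if not master_endpoints:
        raise ValueError("at least one Celeborn master endpoint is required")
    return {
        # Replace Spark's built-in shuffle manager with the Celeborn client.
        "spark.shuffle.manager": "org.apache.spark.shuffle.celeborn.SparkShuffleManager",
        # Comma-separated host:port list of Celeborn masters.
        "spark.celeborn.master.endpoints": ",".join(master_endpoints),
        # The node-local external shuffle service is not used with a remote one.
        "spark.shuffle.service.enabled": "false",
    }


if __name__ == "__main__":
    # Placeholder endpoints; substitute the addresses of your own cluster.
    for key, value in celeborn_conf(["celeborn-master-0:9097"]).items():
        print(f"--conf {key}={value}")
```

The printed `--conf` pairs can be appended to a `spark-submit` command; the Celeborn client jar also has to be on the application classpath, which this sketch does not handle.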

Re: External Spark shuffle service for k8s

2024-04-07 Thread Mich Talebzadeh
Splendid. The configurations below can be used with k8s deployments of Spark. Spark applications running on k8s can use these configurations to access data stored in Google Cloud Storage (GCS) and Amazon S3. For GCS we may have spark_config_gcs = { "spark.kubernetes.auth
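
The snippet above is cut off before the configuration is shown, so the following is not a reconstruction of it. It is only a minimal sketch of what GCS and S3 connector settings for Spark commonly look like, assuming the GCS Hadoop connector and `hadoop-aws` (S3A) jars are available to the driver and executors. The dictionary names, the keyfile path, the endpoint value, and the helper `to_submit_args` are all hypothetical.

```python
from typing import Dict, List

# Placeholder values; replace with settings for your own environment.
GCS_KEYFILE = "/path/to/service-account.json"
S3_ENDPOINT = "s3.amazonaws.com"

# Assumed GCS connector settings (not the truncated dict from the message).
spark_config_gcs_sketch: Dict[str, str] = {
    "spark.hadoop.fs.gs.impl": "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
    "spark.hadoop.google.cloud.auth.service.account.enable": "true",
    "spark.hadoop.google.cloud.auth.service.account.json.keyfile": GCS_KEYFILE,
}

# Assumed S3A settings; credentials are left to the environment.
spark_config_s3_sketch: Dict[str, str] = {
    "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
    "spark.hadoop.fs.s3a.endpoint": S3_ENDPOINT,
}


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Turn a conf dict into spark-submit arguments, sorted by key."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    merged = {**spark_config_gcs_sketch, **spark_config_s3_sketch}
    print(" ".join(to_submit_args(merged)))
```

The same dictionaries could instead be applied through `SparkSession.builder.config(key, value)` in a PySpark job; generating `--conf` pairs just keeps the sketch free of a Spark dependency.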

Re: External Spark shuffle service for k8s

2024-04-07 Thread Vakaris Baškirov
There is an IBM shuffle service plugin that supports S3 https://github.com/IBM/spark-s3-shuffle Though I would think a feature like this could be a part of the main Spark repo. Trino already has out-of-box support for s3 exchange (shuffle) and it's very useful. Vakaris On Sun, Apr 7, 2024 at 12:

Re: Apache Spark 3.4.3 (?)

2024-04-07 Thread Kent Yao
+1, thank you, Dongjoon Kent On Sun, Apr 7, 2024 at 14:54, Holden Karau wrote: > > Sounds good to me :) > > Twitter: https://twitter.com/holdenkarau > Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 > YouTube Live Streams: https://www.youtube.com/user/holdenkarau > > > On Sat, Ap