I concur this is a good idea and certainly worth exploring. In practice, preparing deployable Docker images will throw up some challenges, because a Docker image for Spark is not really a single modular unit in the way that, say, a Docker image for Jenkins is. It involves different versions and different images for Spark and PySpark, and will most likely end up as part of a Kubernetes deployment.

Individuals and organisations will deploy it as a first cut. Great, but I equally feel that good documentation on how to build a consumable, deployable image will be more valuable. From my own experience, the current documentation should be enhanced, for example on how to deploy working directories, add additional Python packages, build with different Java versions (version 8 or version 11), etc.

HTH

On Fri, 13 Aug 2021 at 01:54, Holden Karau <hol...@pigscanfly.ca> wrote:

> Awesome, I've filed an INFRA ticket to get the ball rolling.
>
> On Thu, Aug 12, 2021 at 5:48 PM John Zhuge <jzh...@apache.org> wrote:
>
>> +1
>>
>> On Thu, Aug 12, 2021 at 5:44 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>
>>> +1, I think we generally agreed upon having it. Thanks Holden for
>>> headsup and driving this.
>>>
>>> +@Dongjoon Hyun <dongj...@apache.org> FYI
>>>
>>> On Thu, 22 Jul 2021 at 12:22 PM, Kent Yao <yaooq...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> Bests,
>>>>
>>>> Kent Yao
>>>> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
>>>>
>>>> On 07/22/2021 11:13, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>
>>>> Hi Folks,
>>>>
>>>> Many other distributed computing (
>>>> https://hub.docker.com/r/rayproject/ray
>>>> https://hub.docker.com/u/daskdev) and ASF projects (
>>>> https://hub.docker.com/u/apache) now publish their images to dockerhub.
>>>>
>>>> We've already got the docker image tooling in place, I think we'd need
>>>> to ask the ASF to grant permissions to the PMC to publish containers and
>>>> update the release steps but I think this could be useful for folks.
>>>>
>>>> Cheers,
>>>>
>>>> Holden
>>>>
>>>> --
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>
>> --
>> John Zhuge