I concur this is a good idea and certainly worth exploring.

In practice, preparing Docker images as deployables will throw up some
challenges, because a Docker image for Spark is not really a single
modular unit in the way that, say, a Docker image for Jenkins is. It
involves different versions and different images for Spark and PySpark, and
will most likely end up as part of a Kubernetes deployment.


Individuals and organisations will deploy it as a first cut, which is great,
but I equally feel that good documentation on how to build a consumable,
deployable image will be more valuable. From my own experience, the current
documentation should be enhanced, for example on how to deploy working
directories, add additional Python packages, and build with different Java
versions (version 8 or version 11).
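
As a rough sketch of the kind of thing such documentation could spell out,
the snippet below only assembles the command lines for a small matrix of
images (Java 8 or 11, with or without PySpark); it does not run anything.
It assumes the docker-image-tool.sh script shipped in the Spark
distribution, with its -r, -t, -p and -b options and the java_image_tag
build argument as I understand them, so please check the script's usage
text for your Spark version. The repository name, the Spark version and
the tag naming scheme are made up for illustration.

```python
from itertools import product

# Assumption: commands are meant to be run from the root of an unpacked
# Spark distribution, where this script lives.
TOOL = "./bin/docker-image-tool.sh"
# Dockerfile path for the PySpark bindings, as given in the
# Spark-on-Kubernetes documentation.
PYSPARK_DOCKERFILE = "./kubernetes/dockerfiles/spark/bindings/python/Dockerfile"


def build_command(repo, spark_version, java_tag, with_pyspark):
    """Return the argv list for one image build (nothing is executed)."""
    kind = "pyspark" if with_pyspark else "spark"
    # Hypothetical tag scheme, e.g. "3.1.1-java11-pyspark".
    java_major = java_tag.split("-")[0]
    tag = f"{spark_version}-java{java_major}-{kind}"
    cmd = [TOOL, "-r", repo, "-t", tag, "-b", f"java_image_tag={java_tag}"]
    if with_pyspark:
        cmd += ["-p", PYSPARK_DOCKERFILE]
    return cmd + ["build"]


def build_matrix(repo, spark_version, java_tags=("8-jre-slim", "11-jre-slim")):
    """One command per (Java base image, with/without PySpark) combination."""
    return [
        build_command(repo, spark_version, java_tag, with_pyspark)
        for java_tag, with_pyspark in product(java_tags, (False, True))
    ]


if __name__ == "__main__":
    # "example.org/myrepo" and "3.1.1" are placeholders, not real targets.
    for cmd in build_matrix("example.org/myrepo", "3.1.1"):
        print(" ".join(cmd))
```

Additional Python packages and working directories are not covered by this
sketch; as far as I can tell they need a custom Dockerfile layered on top of
the PySpark image, which is exactly the sort of step that deserves its own
worked example in the docs.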


HTH


View my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>




On Fri, 13 Aug 2021 at 01:54, Holden Karau <hol...@pigscanfly.ca> wrote:

> Awesome, I've filed an INFRA ticket to get the ball rolling.
>
> On Thu, Aug 12, 2021 at 5:48 PM John Zhuge <jzh...@apache.org> wrote:
>
>> +1
>>
>> On Thu, Aug 12, 2021 at 5:44 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>
>>> +1, I think we generally agreed upon having it. Thanks, Holden, for
>>> the heads-up and for driving this.
>>>
>>> +@Dongjoon Hyun <dongj...@apache.org> FYI
>>>
>>> On Thu, 22 Jul 2021 at 12:22 PM, Kent Yao <yaooq...@gmail.com> wrote:
>>>
>>>> +1
>>>>
>>>> Bests,
>>>>
>>>> Kent Yao
>>>> @ Data Science Center, Hangzhou Research Institute, NetEase Corp.
>>>>
>>>> On 07/22/2021 11:13, Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>
>>>> Hi Folks,
>>>>
>>>> Many other distributed computing projects (
>>>> https://hub.docker.com/r/rayproject/ray
>>>> https://hub.docker.com/u/daskdev) and ASF projects (
>>>> https://hub.docker.com/u/apache) now publish their images to Docker Hub.
>>>>
>>>> We've already got the Docker image tooling in place. I think we'd need
>>>> to ask the ASF to grant permissions to the PMC to publish containers, and
>>>> to update the release steps, but I think this could be useful for folks.
>>>>
>>>> Cheers,
>>>>
>>>> Holden
>>>>
>>>> --
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>
>>>
>>> --
>> John Zhuge
>>
>
>
