I would be in favour of option 1. We could also think about making the flink-playgrounds and Flink Docker image releases part of the Flink release process [1] if we don't want to have independent release cycles. I think that at the moment the official Flink Docker image is too often forgotten.
[1] https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release

Cheers,
Till

On Thu, Aug 8, 2019 at 6:25 PM Seth Wiesman <sjwies...@gmail.com> wrote:

> Hey Fabian,
>
> I support option 1.
>
> As per FLIP-42, playgrounds are going to become core to Flink's getting
> started experience, and I believe it is worth the effort to get this right.
>
> - As you mentioned, we may (and in my opinion definitely will) add more
> images in the future. Setting up an integration now will set the stage for
> those future additions.
>
> - These images will be many users' first exposure to Flink, and having a
> proper release cycle to ensure they work properly may be worth the effort
> in and of itself. We already found during the first PR to that repo that we
> needed to find users with different OSs to test.
>
> - Similarly to the above point, having the images hosted under an official
> Apache account adds a certain amount of credibility and shows the community
> that we take on-boarding new users seriously.
>
> - I am generally opposed to having the official Flink docs rely on something
> that is hosted under someone's personal account. I don't want bug fixes or
> updates to be blocked by your (or someone else's) availability.
>
> Seth
>
> > On Aug 8, 2019, at 10:36 AM, Fabian Hueske <fhue...@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > As you might know, some of us are currently working on Docker-based
> > playgrounds that make it very easy for first-time Flink users to try out
> > and play with Flink [0].
> >
> > Our current setup (still work in progress, with some parts merged to the
> > master branch) looks as follows:
> > * The playground is a Docker Compose environment [1] consisting of Flink,
> > Kafka, and ZooKeeper images (ZK for Kafka). The playground is based on a
> > specific Flink job.
> > * We had planned to add the playground's example job as an example to
> > the flink main repository to bundle it with the Flink distribution. Hence,
> > it would have been included in the Docker-Hub-official (soon to be
> > published) Flink 1.9 Docker image [2].
> > * The main motivation for adding the job to the examples module in the
> > flink main repo was to avoid the maintenance overhead of a customized
> > Docker image.
> >
> > When discussing backporting the playground job (and its data generator) to
> > include it in the Flink 1.9 examples, concerns were raised about its
> > Kafka dependency, which will become a problem if the community agrees on
> > the recently proposed repository split, which would remove flink-kafka from
> > the main repository [3]. I think this is a fair concern that we did not
> > consider when designing the playground (the repo split had not yet been
> > proposed at that point either).
> >
> > If we don't add the playground job to the examples, we need to put it
> > somewhere else. The obvious choice would be the flink-playgrounds [4]
> > repository, which was intended for the docker-compose configuration files.
> > However, we would then no longer be able to include it in the
> > Docker-Hub-official Flink image and would need to maintain a custom Docker
> > image, which is what we tried to avoid. The custom image would of course be
> > based on the Docker-Hub-official Flink image.
> >
> > There are different approaches for this:
> >
> > 1) Building one (or more) official ASF images
> > There is an official Apache Docker Hub user [5], and a number of projects
> > publish Docker images via this user. Apache Infra seems to support a
> > process that automatically builds and publishes Docker images when a
> > release tag is added to a repository. This feature needs to be enabled. I
> > haven't found detailed documentation on this, but there are a number of
> > INFRA Jira tickets that discuss this mechanism.
> > This approach would mean that we need a formal Apache release for
> > flink-playgrounds (similar to flink-shaded). The obvious benefit is that
> > these images would be ASF-official Docker images. In case we can publish
> > more than one image per repo, we could also publish images for other
> > playgrounds (like the SQL playground, which could be based on the SQL
> > training that I built [6], which uses an image that is published under my
> > user [7]).
> >
> > 2) Relying on an external image
> > This image could be built by somebody in the community (like me). The
> > problem is, of course, that the image would not be an official image and we
> > would rely on a volunteer to build the images.
> > OTOH, the overhead would be pretty small: no need to run full
> > releases, no integration with Infra's build process, etc.
> >
> > IMO, the first approach is clearly the better choice, but it also needs a
> > bunch of things to be put into place.
> >
> > What do others think?
> > Does somebody have another idea?
> >
> > Cheers,
> > Fabian
> >
> > [0] https://ci.apache.org/projects/flink/flink-docs-master/getting-started/docker-playgrounds/flink_cluster_playground.html
> > [1] https://ci.apache.org/projects/flink/flink-docs-master/getting-started/docker-playgrounds/flink_cluster_playground.html#anatomy-of-this-playground
> > [2] https://hub.docker.com/_/flink
> > [3] https://lists.apache.org/thread.html/eb841f610ef2c191b8d00b6c07b2eab513da2e4eb2d7da5c5e6846f4@%3Cdev.flink.apache.org%3E
> > [4] https://github.com/apache/flink-playgrounds
> > [5] https://hub.docker.com/u/apache
> > [6] https://github.com/ververica/sql-training/
> > [7] https://hub.docker.com/r/fhueske/flink-sql-client-training-1.7.2