I would be in favour of option 1. We could also think about making the flink-playgrounds and Flink Docker image releases part of the Flink release process [1] if we don't want to have independent release cycles. I think that at the moment the official Flink Docker image is too often forgotten.
[1] https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release

Cheers,
Till

On Thu, Aug 8, 2019 at 6:25 PM Seth Wiesman <sjwies...@gmail.com> wrote:

> Hey Fabian,
>
> I support option 1.
>
> As per FLIP-42, playgrounds are going to become core to Flink's getting
> started experience, and I believe it is worth the effort to get this right.
>
> - As you mentioned, we may (and in my opinion definitely will) add more
> images in the future. Setting up an integration now will set the stage for
> those future additions.
>
> - These images will be many users' first exposure to Flink, and having a
> proper release cycle to ensure they work properly may be worth the effort
> in and of itself. We already found during the first PR to that repo that we
> needed to find users with different OSs to test.
>
> - Similarly to the above point, having the images hosted under an official
> Apache account adds a certain amount of credibility and shows the community
> that we take on-boarding new users seriously.
>
> - I am generally opposed to having the official Flink docs rely on something
> that is hosted under someone's personal account. I don't want bug fixes or
> updates to be blocked by your (or someone else's) availability.
>
> Seth
>
> > On Aug 8, 2019, at 10:36 AM, Fabian Hueske <fhue...@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > As you might know, some of us are currently working on Docker-based
> > playgrounds that make it very easy for first-time Flink users to try out
> > and play with Flink [0].
> >
> > Our current setup (still work in progress, with some parts merged to the
> > master branch) looks as follows:
> > * The playground is a Docker Compose environment [1] consisting of Flink,
> > Kafka, and ZooKeeper images (ZK for Kafka). The playground is based on a
> > specific Flink job.
> > * We had planned to add the playground's example job as an example to
> > the flink main repository to bundle it with the Flink distribution. Hence,
> > it would have been included in the Docker-Hub-official (soon to be
> > published) Flink 1.9 Docker image [2].
> > * The main motivation for adding the job to the examples module in the
> > flink main repo was to avoid the maintenance overhead of a customized
> > Docker image.
> >
> > When discussing backporting the playground job (and its data generator) to
> > include it in the Flink 1.9 examples, concerns were raised about its
> > Kafka dependency, which will become a problem if the community agrees on
> > the recently proposed repository split, which would remove flink-kafka from
> > the main repository [3]. I think this is a fair concern that we did not
> > consider when designing the playground (the repo split had not yet been
> > proposed at that point either).
> >
> > If we don't add the playground job to the examples, we need to put it
> > somewhere else. The obvious choice would be the flink-playgrounds [4]
> > repository, which was intended for the docker-compose configuration files.
> > However, we would then no longer be able to include it in the
> > Docker-Hub-official Flink image and would need to maintain a custom Docker
> > image, which is what we tried to avoid. The custom image would of course be
> > based on the Docker-Hub-official Flink image.
> >
> > There are different approaches for this:
> >
> > 1) Building one (or more) official ASF images
> > There is an official Apache Docker Hub user [5], and a number of projects
> > publish Docker images via this user. Apache Infra seems to support a
> > process that automatically builds and publishes Docker images when a
> > release tag is added to a repository. This feature needs to be enabled. I
> > haven't found detailed documentation on this, but there are a number of
> > INFRA Jira tickets that discuss this mechanism.
> > This approach would mean that we need a formal Apache release for
> > flink-playgrounds (similar to flink-shaded). The obvious benefit is that
> > these images would be ASF-official Docker images. In case we can publish
> > more than one image per repo, we could also publish images for other
> > playgrounds (like the SQL playground, which could be based on the SQL
> > training that I built [6], which uses an image that is published under my
> > user [7]).
> >
> > 2) Relying on an external image
> > This image could be built by somebody in the community (like me). The
> > problem is, of course, that the image would not be an official image and we
> > would rely on a volunteer to build the images.
> > OTOH, the overhead would be pretty small: no need to run full
> > releases, no integration with Infra's build process, etc.
> >
> > IMO, the first approach is clearly the better choice, but it also needs a
> > bunch of things to be put into place.
> >
> > What do others think?
> > Does somebody have another idea?
> >
> > Cheers,
> > Fabian
> >
> > [0] https://ci.apache.org/projects/flink/flink-docs-master/getting-started/docker-playgrounds/flink_cluster_playground.html
> > [1] https://ci.apache.org/projects/flink/flink-docs-master/getting-started/docker-playgrounds/flink_cluster_playground.html#anatomy-of-this-playground
> > [2] https://hub.docker.com/_/flink
> > [3] https://lists.apache.org/thread.html/eb841f610ef2c191b8d00b6c07b2eab513da2e4eb2d7da5c5e6846f4@%3Cdev.flink.apache.org%3E
> > [4] https://github.com/apache/flink-playgrounds
> > [5] https://hub.docker.com/u/apache
> > [6] https://github.com/ververica/sql-training/
> > [7] https://hub.docker.com/r/fhueske/flink-sql-client-training-1.7.2