Hey all, thanks for the proposal and the detailed discussion. In particular, thanks to Andrey for starting this thread and to Patrick for the additional ideas in the linked Google doc.

I find many of the improvements proposed during the discussion (such as the unified entrypoint in Flink, proper configuration via environment variables, Dockerfiles for development, etc.) really important. At the same time, I believe that these improvements have quite a large scope and could be tackled independently as Till already suggested. I think we should ideally split the discussions for those improvements out of this thread and focus on the main target of FLIP-111.

To me the major point of this FLIP is to consolidate existing Dockerfiles into apache/flink-docker and document typical usage scenarios (e.g. linking plugins, installing shaded Hadoop, running a job cluster, etc.).

In order to achieve this, I think we could move forward as follows:

(1) Extend the entrypoint script in apache/flink-docker to start the job cluster entry point
=> this is currently missing and would block removal of the Dockerfile in flink-container

(2) Extend the example log4j-console configuration
=> support log retrieval from the Flink UI out of the box

(3) Document typical usage scenarios in apache/flink-docker
=> this should replace the proposed flink_docker_utils helper

(4) Remove the existing Dockerfiles from apache/flink

I really like the convenience of a script such as flink_docker_utils, but I think we should avoid it for now, because most of the desired usage scenarios can be covered by documentation. After we have concluded (1)-(4) we can take a holistic look and identify what would benefit the most from such a script and how it would interact with the other planned improvements.

I think this will give us a good basis to tackle the other major improvements that were proposed.

– Ufuk

On Thu, Apr 2, 2020 at 4:34 PM Patrick Lucas <patr...@ververica.com> wrote:

> Thanks Andrey for working on this, and everyone else for your feedback.
>
> This FLIP inspired me to discuss and write down some ideas I've had for a
> while about configuring and running Flink (especially in Docker) that go
> beyond the scope of this FLIP, but don't contradict what it sets out to do.
>
> The crux of it is that Flink should be maximally configurable using
> environment variables, and not require manipulation of the filesystem (i.e.
> moving/linking JARs or editing config files) in order to run in a large
> majority of cases. And beyond that, particularly for running Flink in Docker,
> is that as much logic as possible should be a part of Flink itself and not,
> for instance, in the docker-entrypoint.sh script. I've resisted adding
> additional logic to the Flink Docker images except where necessary since
> the beginning, and I believe we can get to the point where the only thing
> the entrypoint script does is drop privileges before invoking a script
> included in Flink.
>
> Ultimately, my ideal end-goal for running Flink in containers would fulfill
> the following points:
>
> - A user can configure all “start-time” aspects of Flink with
>   environment variables, including additions to the classpath
> - Flink automatically adapts to the resources available to the
>   container (such as what BashJavaUtils helps with today)
> - A user can include additional JARs using a mounted volume, or at
>   image build time with convenient tooling
> - The role/mode (jobmanager, session) is specified as a command line
>   argument, with a single entrypoint program sufficing for all uses of the
>   image
>
> As a bonus, if we could eliminate some or most of the layers of shell
> scripts that are involved in starting a Flink server, perhaps by
> re-implementing this part of the stack in Java, and exec-ing to actually
> run Flink with the proper java CLI arguments, I think it would be a big win
> for the project.
>
> You can read the rest of my notes here:
> https://docs.google.com/document/d/1JCACSeDaqeZiXD9G1XxQBunwi-chwrdnFm38U1JxTDQ/edit
>
> On Wed, Mar 4, 2020 at 10:34 AM Andrey Zagrebin <azagre...@apache.org>
> wrote:
>
> > Hi All,
> >
> > If you have ever touched the docker topic in Flink, you
> > probably noticed that we have multiple places in docs and repos which
> > address its various concerns.
> >
> > We have prepared a FLIP [1] to simplify the perception of the docker topic
> > in Flink by users. It mostly advocates for an approach of extending the
> > official Flink image from the docker hub. For convenience, it can come with
> > a set of bash utilities and documented examples of their usage. The
> > utilities allow users to:
> >
> > - run the docker image in various modes (single job, session master,
> >   task manager etc)
> > - customise the extending Dockerfile
> > - and its entry point
> >
> > Eventually, the FLIP suggests removing all other user-facing Dockerfiles
> > and building scripts from the Flink repo, moving all docker docs to
> > apache/flink-docker and adjusting existing docker use cases to refer to
> > this new approach (mostly Kubernetes now).
> >
> > The first contributed version of Flink docker integration also contained
> > an example and docs for the integration with Bluemix in IBM cloud. We also
> > suggest maintaining it outside of the Flink repository (cc Markus Müller).
> >
> > Thanks,
> > Andrey
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image+unification