Hi everyone, Patrick and Ufuk, thanks a lot for the additional ideas and suggestions!
I have updated the FLIP according to the current state of the discussion. It now also contains the implementation steps and future follow-ups. Please review it and let me know if there are any concerns. The order of the steps aims at keeping Flink releasable at any point, in case something does not have enough time to get in. It looks like we are mostly reaching a consensus on the open questions. Below is a list of the items that have been discussed in this thread, with a short summary for each. As soon as there are no more concerns, I will create a voting thread.

I also added some thoughts on further customising the logging setup. This may be an optional follow-up, in addition to the default logging into files for the Web UI.

# FLIP scope

The focus is on users of the official releases.
Create docs for how to use the official docker image.
Remove the other Dockerfiles in the Flink repo.
Rely on running the official docker image in different modes (JM/TM).
Customise running the official image with env vars (this should minimise manual manipulation of local files and creation of a custom image).

# Base official image

## Java versions

There is a separate effort for this: https://github.com/apache/flink-docker/pull/9

# Run image

## Entry point modes

JM session, JM job, TM

## Entry point config

We use env vars for this, e.g. FLINK_PROPERTIES and ENABLE_BUILT_IN_PLUGINS.

## Flink config options

We document the existing FLINK_PROPERTIES env var to override config options in flink-conf.yaml. Then we do not need to expose and handle any other special env vars for config options (address, port etc.) later on. The future plan is to make the Flink process configurable by env vars, e.g. 'some.yaml.option: val' -> FLINK_SOME_YAML_OPTION=val. A short usage sketch follows at the end of this message.

## Extra files: jars, custom logging properties

We can provide env vars to point to custom locations, e.g. in mounted volumes.

# Extend image

## Python/Hadoop versions, activating certain libs/plugins

Users can install extra dependencies and change configs in their custom image which extends our base image.

# Logging

## Web UI

Modify *log4j-console.properties* to also output logs into files for the Web UI. Limit the log file size.

## Container output

Separate effort for a proper split of the Flink process stdout and stderr into files and the container output (idea with the tee command: `program start-foreground 2>&1 | tee flink-user-taskexecutor.out`).

# Docker bash utils

We are not going to expose them to users as an API. Users should either be able to configure and run the standard entry point, or the documentation should give short examples of how to extend and customise the base image. During the implementation, we will see if it makes sense to factor out certain bash procedures to reuse them, e.g. in custom dev versions of the docker image.

# Dockerfile / image for developers

We keep it on our future roadmap. This effort should help us understand what we can reuse there.
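To illustrate the intended user experience, here is a rough sketch of running the official image as a session JobManager and a TaskManager, overriding config options via env vars. It assumes the FLINK_PROPERTIES handling described above and the proposed ENABLE_BUILT_IN_PLUGINS variable; the image tag, network name, port mapping and plugin jar name are only placeholders, not a final interface:

```
# Sketch only: config options that the entry point appends to flink-conf.yaml.
FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager
taskmanager.numberOfTaskSlots: 2"

docker network create flink-network

# JM session mode
docker run -d --name jobmanager --network flink-network -p 8081:8081 \
  --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
  flink:1.10.0 jobmanager

# TM mode, additionally activating a built-in plugin
# (ENABLE_BUILT_IN_PLUGINS is the proposed env var, the jar name is just an example)
docker run -d --name taskmanager --network flink-network \
  --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
  --env ENABLE_BUILT_IN_PLUGINS="flink-s3-fs-presto-1.10.0.jar" \
  flink:1.10.0 taskmanager
```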
Best,
Andrey

On Fri, Apr 3, 2020 at 12:57 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi everyone,
>
> just a small inline comment.
>
> On Fri, Apr 3, 2020 at 11:42 AM Ufuk Celebi <u...@apache.org> wrote:
>
> > Hey Yang,
> >
> > thanks! See inline answers.
> >
> > On Fri, Apr 3, 2020 at 5:11 AM Yang Wang <danrtsey...@gmail.com> wrote:
> >
> > > Hi Ufuk,
> > >
> > > Thanks for make the conclusion and directly point out what need to be
> > > done in FLIP-111. I agree with you that we should narrow down the scope
> > > and focus the most important and basic part about docker image
> > > unification.
> > >
> > > > (1) Extend the entrypoint script in apache/flink-docker to start the
> > > > job cluster entry point
> > >
> > > I want to add a small requirement for the entry point script.
> > > Currently, for the native K8s integration, we are using the
> > > apache/flink-docker image, but with different entry
> > > point ("kubernetes-entry.sh"). Generate the java cmd in KubernetesUtils
> > > and run it in the entry point. I really hope it could merge to
> > > apache/flink-docker "docker-entrypoint.sh".
> >
> > The script [1] only adds the FLINK_CLASSPATH env var which seems
> > generally reasonable to me. But since principled classpath and entrypoint
> > configuration is somewhat related to the follow-up improvement proposals,
> > I could also see this being done after FLIP-111.
> >
> > > > (2) Extend the example log4j-console configuration
> > > > => support log retrieval from the Flink UI out of the box
> > >
> > > If you mean to update the "flink-dist/conf/log4j-console.properties" to
> > > support console and local log files. I will say "+1". But we need to
> > > find a proper way to make stdout/stderr output both available for
> > > console and log files. Maybe till's proposal could help to solve this.
> > > "`program &2>1 | tee flink-user-taskexecutor.out`"
> >
> > I think we can simply add a rolling file appender with a limit on the log
> > size.
>
> I think this won't solve Yang's concern. What he wants to achieve is that
> STDOUT and STDERR go to STDOUT and STDERR as well as into some *.out and
> *.err file which are accessible from the web ui. I don't think that log
> appender will help with this problem.
>
> Cheers,
> Till
>
> > – Ufuk
> >
> > [1]
> > https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/kubernetes-bin/kubernetes-entry.sh
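For reference, a minimal sketch of the tee idea discussed above, with the redirection written as 2>&1. The script path, the "$@" pass-through and the .out file name are only placeholders for whatever the entry point ends up doing:

```
#!/usr/bin/env bash
# Sketch only: run the TaskManager in the foreground, merge stderr into stdout
# and duplicate the stream into an .out file the Web UI could serve, while it
# still reaches the container output.
"${FLINK_HOME}/bin/taskmanager.sh" start-foreground "$@" 2>&1 \
  | tee "${FLINK_HOME}/log/flink-user-taskexecutor.out"
```

Whether something like this belongs in docker-entrypoint.sh or stays a separate effort is exactly the open point under "Container output" above.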