tianon commented on pull request #14630: URL: https://github.com/apache/flink/pull/14630#issuecomment-762526116
Ok, let me try to phrase this in a simpler way. https://github.com/apache/flink-docker/blob/d2ff00e0176e09cdbbde76412026054016a8551c/1.12/scala_2.12-java11-debian/docker-entrypoint.sh is almost 200 lines long with a lot of little bits of duplicated code. How much of that is actually specific to using Flink in Docker, and how much of that is something that users *outside* Docker might be interested in / use? It seems like a lot of it is just (explicit, separate) wrappers around other existing Flink scripts, and even many of those are sharing common behavior (`copy_plugins_if_required`, for example). Could they be combined in some way such that the "Docker specific" behavior happens once and the upstream scripts are then just standard fallthrough behavior? For example, right now users do commands like `docker run ... flink history-server ...` in order to run `historyserver.sh`. What I would expect instead is that `$FLINK_HOME/bin` is in `PATH`, that `docker-entrypoint.sh` looks at `$1` to do any extra Docker-specific initialization for `historyserver.sh`, and users then do `docker run ... flink historyserver.sh ...` (where now, the "command" users are running in the container matches the exact name of the same command they would run for a standard install of Flink). Rough mockup: ```bash copy_plugins_if_required case "$1" in standalone-job.sh) ... ;; historyserver.sh) ... ;; ... esac exec "$@" ``` If having something that allows users to run something "prettier" like `flink history-server ...` is a desired goal, shouldn't that be a standard wrapper script in the upstream distribution, rather than something created specifically just for the Docker image? That could even be implemented as a set of symlinks to the official scripts with their "prettier" preferred names, if you want to keep the maintenance burden low. --- As another explicit example of complexity, instead of disabling jemalloc via an explicit "command" which then has to be stripped and is order-dependent, if that were set via environment variable the logic would be much simpler: ```bash if [ -z "${FLINK_DISABLE_JEMALLOC:-}" ]; then export LD_PRELOAD=$LD_PRELOAD:/usr/lib/x86_64-linux-gnu/libjemalloc.so fi ``` (To make this part of an "official" script in the actual distribution you'll likely want to make it more forgiving / detecting of the path to `libjemalloc.so`, but you get the idea.) --- This isn't as much a matter of "acceptable" vs "unacceptable" but rather that the script is consistently growing in complexity and I'm trying to understand why it is not simpler and/or which parts of it are actually Docker-specific and necessary, because it currently reads as a whole new launcher script for Flink and that seems like something that ought to be part of the Flink distribution instead of something Docker-specific. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org