tianon commented on pull request #14630:
URL: https://github.com/apache/flink/pull/14630#issuecomment-762526116


   Ok, let me try to phrase this in a simpler way.
   
   
https://github.com/apache/flink-docker/blob/d2ff00e0176e09cdbbde76412026054016a8551c/1.12/scala_2.12-java11-debian/docker-entrypoint.sh
 is almost 200 lines long with a lot of little bits of duplicated code.
   
   How much of that is actually specific to using Flink in Docker, and how much 
of that is something that users *outside* Docker might be interested in / use?
   
   It seems like a lot of it is just (explicit, separate) wrappers around other 
existing Flink scripts, and even many of those are sharing common behavior 
(`copy_plugins_if_required`, for example).
   
   Could they be combined in some way such that the "Docker specific" behavior 
happens once and the upstream scripts are then just standard fallthrough 
behavior?
   
   For example, right now users do commands like `docker run ... flink 
history-server ...` in order to run `historyserver.sh`.  What I would expect 
instead is that `$FLINK_HOME/bin` is in `PATH`, that `docker-entrypoint.sh` 
looks at `$1` to do any extra Docker-specific initialization for 
`historyserver.sh`, and users then do `docker run ... flink historyserver.sh 
...` (where now, the "command" users are running in the container matches the 
exact name of the same command they would run for a standard install of Flink).
   
   Rough mockup:
   
   ```bash
   copy_plugins_if_required
   
   case "$1" in
        standalone-job.sh) ... ;;
        historyserver.sh) ... ;;
        ...
   esac
   
   exec "$@"
   ```
   
   If having something that allows users to run something "prettier" like 
`flink history-server ...` is a desired goal, shouldn't that be a standard 
wrapper script in the upstream distribution, rather than something created 
specifically just for the Docker image?  That could even be implemented as a 
set of symlinks to the official scripts with their "prettier" preferred names, 
if you want to keep the maintenance burden low.
   
   ---
   
   As another explicit example of complexity, instead of disabling jemalloc via 
an explicit "command" which then has to be stripped and is order-dependent, if 
that were set via environment variable the logic would be much simpler:
   
   ```bash
   if [ -z "${FLINK_DISABLE_JEMALLOC:-}" ]; then
        export LD_PRELOAD=$LD_PRELOAD:/usr/lib/x86_64-linux-gnu/libjemalloc.so
   fi
   ```
   
   (To make this part of an "official" script in the actual distribution you'll 
likely want to make it more forgiving / detecting of the path to 
`libjemalloc.so`, but you get the idea.)
   
   ---
   
   This isn't as much a matter of "acceptable" vs "unacceptable" but rather 
that the script is consistently growing in complexity and I'm trying to 
understand why it is not simpler and/or which parts of it are actually 
Docker-specific and necessary, because it currently reads as a whole new 
launcher script for Flink and that seems like something that ought to be part 
of the Flink distribution instead of something Docker-specific.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to