Hi all,

Thanks for the reply, Andrey!
I have filed two new tickets tracking the problems:

1. FLINK-17033 <https://issues.apache.org/jira/browse/FLINK-17033> for upgrading the base Java Docker image; I pointed out some other problems the openjdk:8-jre-alpine image could have in the ticket's description.
2. FLINK-17034 <https://issues.apache.org/jira/browse/FLINK-17034> for suggesting executing the container CMD under TINI.

Regards,
Canbin Zheng

On Tue, Apr 7, 2020 at 4:58 PM Andrey Zagrebin <azagre...@apache.org> wrote:

> Hi all,
>
> Thanks for the further feedback Niels and Canbin.
>
> @Niels
>
> I agree with Till, the comments about docker tags are valid concerns and
> we can discuss them in dedicated ML threads in parallel or after the
> general unification of Dockerfiles suggested by this FLIP.
>
> One thing to add about point 4: the native Kubernetes integration does
> not support a job mode at the moment. This is not only about the image.
> As I understand, even if you pack the job artefacts into the image, the
> native Kubernetes integration will start a session cluster. This will be
> a follow-up for the native Kubernetes integration.
> cc @Yang Wang
>
> @Canbin
>
> I think you raise valid concerns. It makes sense to create JIRA issues
> for them: one for the alpine image problem and one to suggest TINI, as a
> blocker for FLINK-15843 <https://issues.apache.org/jira/browse/FLINK-15843>
> and the slow pod shutdown. We can discuss and address them in parallel or
> after the general unification of Dockerfiles suggested by this FLIP.
>
> I will start a separate voting thread for this FLIP.
>
> Cheers,
> Andrey
>
> On Mon, Apr 6, 2020 at 5:49 PM Canbin Zheng <felixzhen...@gmail.com> wrote:
>
> > Hi all,
> >
> > Thanks a lot for this FLIP and all the fruitful discussion. I am not
> > sure whether the following questions are in the scope of this FLIP, but
> > I still expect your reply:
> >
> >    1. Which docker base image do we plan to use for Java? As far as I
> >    see, openjdk:8-jre-alpine[1] is not officially supported by the
> >    OpenJDK project anymore; openjdk:8-jre is larger than
> >    openjdk:8-jre-slim, so we use the latter in our internal branch and
> >    it works fine so far.
> >    2. Is it possible to execute the container CMD under *TINI*[2]
> >    instead of the shell, for better hygiene? As far as I see, the
> >    container of the JM or TMs runs its command in shell form, so it
> >    cannot receive the *TERM* signal when the pod is deleted[3]. Some of
> >    the problems are as follows:
> >       - The JM and the TMs get no chance to clean up; I created
> >       FLINK-15843[4] to track this problem.
> >       - The pod can take a long time (up to 40 seconds) to be deleted
> >       after the K8s API server receives the deletion request.
> >
> >    At the moment, we use *TINI* in our internal branch for the native
> >    K8s setup and it solves the problems mentioned above (a sketch of
> >    such a setup follows after this email).
> >
> > [1]
> > https://github.com/docker-library/docs/blob/master/openjdk/README.md#supported-tags-and-respective-dockerfile-links
> > https://github.com/docker-library/openjdk/commit/3eb0351b208d739fac35345c85e3c6237c2114ec#diff-f95ffa3d1377774732c33f7b8368e099
> > [2] https://github.com/krallin/tini
> > [3] https://docs.docker.com/engine/reference/commandline/kill/
> > [4] https://issues.apache.org/jira/browse/FLINK-15843
> >
> > Regards,
> > Canbin Zheng
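For illustration, a minimal sketch of what running the entry point under TINI could look like in a derived image. The base tag, the package install step, and the entry point path are assumptions for the sketch, not the official setup:

    # Assumes a Debian-based base image where the tini package is available.
    FROM flink:1.10.0-scala_2.12
    RUN apt-get update && apt-get install -y tini && rm -rf /var/lib/apt/lists/*
    # Run the standard entry point under tini as PID 1 so that the TERM signal
    # sent on pod deletion reaches the Flink JVM and allows a clean shutdown.
    ENTRYPOINT ["/usr/bin/tini", "--", "/docker-entrypoint.sh"]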
> > On Mon, Apr 6, 2020 at 5:34 PM Till Rohrmann <trohrm...@apache.org> wrote:
> >
> >> Thanks for the feedback Niels. This is very helpful.
> >>
> >> 1. I agree `flink:latest` is nice to get started, but in the long run
> >> people will want to pin their dependencies to a specific Flink version.
> >> I think the fix will happen as part of FLINK-15794.
> >>
> >> 2. SNAPSHOT docker images will be really helpful for developers as well
> >> as users who want to use the latest features. I believe that this will
> >> be a follow-up of this FLIP.
> >>
> >> 3. The goal of FLIP-111 is to create an image which allows starting a
> >> session as well as a job cluster (see the sketch after this email).
> >> Hence, I believe that we will solve this problem soon.
> >>
> >> 4. Same as 3. The new image will also contain the native K8s
> >> integration, so there is no need to create a special image modulo the
> >> artifacts you want to add.
> >>
> >> Additional notes:
> >>
> >> 1. I agree that one log makes it harder to separate different execution
> >> attempts or different tasks. However, on the other hand, it gives you an
> >> overall picture of what's happening in a Flink process. If things were
> >> split apart, then it might become super hard to detect problems in the
> >> runtime which cause the user code to fail or vice versa, for example. In
> >> general, cross-correlation will be harder. I guess a solution could be
> >> to make this configurable. In any case, we should move the discussion
> >> about this topic into a separate thread.
> >>
> >> Cheers,
> >> Till
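As a rough illustration of point 3: one unified image could start either role. The session-mode commands below follow the shape of the existing apache/flink-docker entry point, while the job-mode invocation is only a hypothetical sketch of what FLIP-111 proposes (the tag, argument name, and class are assumptions):

    # Session cluster: JobManager and TaskManagers started from the same image.
    docker run --name jm flink:1.10.0-scala_2.12 jobmanager
    docker run --name tm flink:1.10.0-scala_2.12 taskmanager

    # Hypothetical job mode as discussed in FLIP-111; the argument is an
    # assumption, not a released interface.
    docker run flink:1.10.0-scala_2.12 job-cluster --job-classname com.example.MyJob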
> >> On Mon, Apr 6, 2020 at 10:40 AM Niels Basjes <ni...@basjes.nl> wrote:
> >>
> >> > Hi all,
> >> >
> >> > Sorry for jumping in at this late point of the discussion. I see a
> >> > lot of things I really like and I would like to put my "needs" and
> >> > observations here too so you can take them into account (where
> >> > possible). I suspect that there will be overlap with things you have
> >> > already taken into account.
> >> >
> >> > 1. No more 'flink:latest' docker image tag.
> >> > Related to https://issues.apache.org/jira/browse/FLINK-15794
> >> > What I have learned is that the 'latest' version of a docker image
> >> > only makes sense IFF it is an almost standalone thing. So if I have a
> >> > servlet that does something in isolation (like my hobby project
> >> > https://hub.docker.com/r/nielsbasjes/yauaa ) then 'latest' makes
> >> > sense. With Flink you have the application code and all nodes in the
> >> > cluster depending on each other, and as such they must run the exact
> >> > same versions of the base software. So if you run Flink in a cluster
> >> > (local/yarn/k8s/mesos/swarm/...) where the application and the nodes
> >> > intercommunicate and closely depend on each other, then 'latest' is a
> >> > bad idea.
> >> >    1. Assume I have an application built against the Flink N API and
> >> >    the cluster downloads the latest, which is also Flink N. A week
> >> >    later Flink N+1 is released and the API I use is deprecated, and a
> >> >    while later Flink N+2 is released and the deprecated API is
> >> >    removed: my application no longer works even though I have not
> >> >    changed anything. So I want my application to be 'pinned' to the
> >> >    exact version I built it with.
> >> >    2. I have a running cluster with my application and cluster
> >> >    running Flink N. I add some additional nodes and the new nodes
> >> >    pick up the Flink N+1 image ... now I have a cluster with mixed
> >> >    versions.
> >> >    3. The version of Flink is really the "Flink+Scala" version pair.
> >> >    If you have the right Flink but the wrong Scala you get really
> >> >    nasty errors: https://issues.apache.org/jira/browse/FLINK-16289
> >> >
> >> > 2. Deploy SNAPSHOT docker images (i.e. something like
> >> > *flink:1.11-SNAPSHOT_2.12*).
> >> > More and more use cases will be running on code delivered via Docker
> >> > images instead of bare jar files. So if a "SNAPSHOT" is released and
> >> > deployed into a 'staging' maven repo (which may be locally on the
> >> > developer's workstation), then in my opinion a "SNAPSHOT" docker
> >> > image should be created/deployed at the same moment. Each time a
> >> > "SNAPSHOT" docker image is released, it overwrites the previous
> >> > "SNAPSHOT". Once the final version is released, the SNAPSHOTs of that
> >> > version can/should be removed. This will make testing in clusters a
> >> > lot easier. Also, building a local fix and then running it locally
> >> > will work without additional modifications to the code.
> >> >
> >> > 3. Support for a 'single application cluster'.
> >> > I've been playing around with the S3 plugin, and what I have found is
> >> > that it essentially requires all nodes to have full access to the
> >> > credentials needed to connect to S3. This essentially means that a
> >> > multi-tenant setup is not possible in these cases. So I think the
> >> > single application cluster should be a feature available in all
> >> > cases.
> >> >
> >> > 4. I would like a native-kubernetes-single-application base image.
> >> > I can then create a derived image where I only add the jar of my
> >> > application (see the sketch below). My desire is that I can then
> >> > create a k8s yaml file for kubectl that adds the needed
> >> > configs/secrets/arguments/environment variables and starts the
> >> > cluster and application. Because the native kubernetes support makes
> >> > it automatically scale based on the application, this should 'just
> >> > work'.
> >> >
> >> > Additional note:
> >> >
> >> > 1. Job/task attempt logging instead of task manager logging.
> >> > *I realize this has nothing to do with the docker images.*
> >> > I found something "hard to work with" while running some tests last
> >> > week. The logging is done to a single log for the task manager. So if
> >> > I have multiple things running in the single task manager, the logs
> >> > are mixed together. Also, several attempts of the same task are
> >> > mixed, which makes it very hard to find out 'what went wrong'.
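A minimal sketch of such a derived application image, with the Flink and Scala versions pinned as argued in point 1. The tag, the jar name, and the /opt/flink/usrlib target path are illustrative assumptions:

    # Pin the exact Flink+Scala version pair the application was built against.
    FROM flink:1.10.0-scala_2.12
    # Add only the application artifact; everything else comes from the base image.
    COPY target/my-flink-job.jar /opt/flink/usrlib/my-flink-job.jar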
> >> > On Fri, Apr 3, 2020 at 4:27 PM Ufuk Celebi <u...@apache.org> wrote:
> >> >
> >> > > Thanks for the summary, Andrey. Good idea to link Patrick's
> >> > > document from the FLIP as a future direction so it doesn't get
> >> > > lost. Could you make sure to revive that discussion when FLIP-111
> >> > > nears its end?
> >> > >
> >> > > This is good to go on my part. +1 to start the VOTE.
> >> > >
> >> > > @Till, @Yang: Thanks for the clarification with the output
> >> > > redirection. I didn't see that. The concern with the `tee` approach
> >> > > is that the file would grow indefinitely. I think we can solve this
> >> > > with regular logging by redirecting stderr to the ERROR log level,
> >> > > but I'm not sure. We can look at a potential solution when we get
> >> > > to that point. :-)
> >> > >
> >> > > On Fri, Apr 3, 2020 at 3:36 PM Andrey Zagrebin
> >> > > <azagre...@apache.org> wrote:
> >> > >
> >> > > > Hi everyone,
> >> > > >
> >> > > > Patrick and Ufuk, thanks a lot for more ideas and suggestions!
> >> > > >
> >> > > > I have updated the FLIP according to the current state of the
> >> > > > discussion. Now it also contains the implementation steps and
> >> > > > future follow-ups. Please review and flag any concerns. The order
> >> > > > of the steps aims to keep Flink releasable at any point, in case
> >> > > > something does not make it in on time.
> >> > > >
> >> > > > It looks like we are mostly reaching a consensus on the open
> >> > > > questions. There is also a list of items which have been
> >> > > > discussed in this thread, with a short summary below. As soon as
> >> > > > there are no concerns, I will create a voting thread.
> >> > > >
> >> > > > I also added some thoughts on further customising the logging
> >> > > > setup. This may be an optional follow-up, additional to the
> >> > > > default logging into files for the Web UI.
> >> > > >
> >> > > > # FLIP scope
> >> > > > The focus is users of the official releases.
> >> > > > Create docs for how to use the official docker image.
> >> > > > Remove other Dockerfiles in the Flink repo.
> >> > > > Rely on running the official docker image in different modes
> >> > > > (JM/TM).
> >> > > > Customise running the official image with env vars (this should
> >> > > > minimise manual manipulation of local files and the need to
> >> > > > create a custom image).
> >> > > >
> >> > > > # Base official image
> >> > > >
> >> > > > ## Java versions
> >> > > > There is a separate effort for this:
> >> > > > https://github.com/apache/flink-docker/pull/9
> >> > > >
> >> > > > # Run image
> >> > > >
> >> > > > ## Entry point modes
> >> > > > JM session, JM job, TM
> >> > > >
> >> > > > ## Entry point config
> >> > > > We use env vars for this, e.g. FLINK_PROPERTIES and
> >> > > > ENABLE_BUILT_IN_PLUGINS.
> >> > > >
> >> > > > ## Flink config options
> >> > > > We document the existing FLINK_PROPERTIES env var to override
> >> > > > config options in flink-conf.yaml. Then later, we do not need to
> >> > > > expose and handle any other special env vars for config options
> >> > > > (address, port etc). The future plan is to make the Flink process
> >> > > > configurable by env vars, e.g.
> >> > > > 'some.yaml.option: val' -> FLINK_SOME_YAML_OPTION=val
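As an illustration, overriding config options via FLINK_PROPERTIES could look like the following sketch; the entry point appends the newline-separated flink-conf.yaml entries to the configuration before starting the process (the tag and option values are illustrative):

    # Bash ANSI-C quoting ($'...') embeds the newline between the two options.
    docker run \
      -e FLINK_PROPERTIES=$'jobmanager.rpc.address: jobmanager\ntaskmanager.numberOfTaskSlots: 4' \
      flink:1.10.0-scala_2.12 taskmanager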
> >> > > > ## Extra files: jars, custom logging properties
> >> > > > We can provide env vars to point to custom locations, e.g. in
> >> > > > mounted volumes.
> >> > > >
> >> > > > # Extend image
> >> > > >
> >> > > > ## Python/hadoop versions, activating certain libs/plugins
> >> > > > Users can install extra dependencies and change configs in their
> >> > > > custom image which extends our base image.
> >> > > >
> >> > > > # Logging
> >> > > >
> >> > > > ## Web UI
> >> > > > Modify the *log4j-console.properties* to also output logs into
> >> > > > the files for the Web UI. Limit the log file size.
> >> > > >
> >> > > > ## Container output
> >> > > > Separate effort for a proper split of the Flink process stdout
> >> > > > and stderr into files and the container output (idea with the tee
> >> > > > command: `program start-foreground 2>&1 | tee
> >> > > > flink-user-taskexecutor.out`).
> >> > > >
> >> > > > # Docker bash utils
> >> > > > We are not going to expose it to users as an API. They should
> >> > > > either be able to configure and run the standard entry point, or
> >> > > > the documentation should give short examples of how to extend and
> >> > > > customise the base image. During the implementation, we will see
> >> > > > if it makes sense to factor out certain bash procedures to reuse
> >> > > > them, e.g. in custom dev versions of the docker image.
> >> > > >
> >> > > > # Dockerfile / image for developers
> >> > > > We keep it on our future roadmap. This effort should help us
> >> > > > understand what we can reuse there.
> >> > > >
> >> > > > Best,
> >> > > > Andrey
> >> > > >
> >> > > > On Fri, Apr 3, 2020 at 12:57 PM Till Rohrmann
> >> > > > <trohrm...@apache.org> wrote:
> >> > > >
> >> > > >> Hi everyone,
> >> > > >>
> >> > > >> just a small inline comment.
> >> > > >>
> >> > > >> On Fri, Apr 3, 2020 at 11:42 AM Ufuk Celebi <u...@apache.org>
> >> > > >> wrote:
> >> > > >>
> >> > > >> > Hey Yang,
> >> > > >> >
> >> > > >> > thanks! See inline answers.
> >> > > >> >
> >> > > >> > On Fri, Apr 3, 2020 at 5:11 AM Yang Wang
> >> > > >> > <danrtsey...@gmail.com> wrote:
> >> > > >> >
> >> > > >> > > Hi Ufuk,
> >> > > >> > >
> >> > > >> > > Thanks for drawing the conclusion and directly pointing out
> >> > > >> > > what needs to be done in FLIP-111. I agree with you that we
> >> > > >> > > should narrow down the scope and focus on the most important
> >> > > >> > > and basic part of the docker image unification.
> >> > > >> > >
> >> > > >> > > > (1) Extend the entrypoint script in apache/flink-docker to
> >> > > >> > > > start the job cluster entry point
> >> > > >> > >
> >> > > >> > > I want to add a small requirement for the entry point
> >> > > >> > > script. Currently, for the native K8s integration, we are
> >> > > >> > > using the apache/flink-docker image, but with a different
> >> > > >> > > entry point ("kubernetes-entry.sh"): we generate the java
> >> > > >> > > cmd in KubernetesUtils and run it in the entry point. I
> >> > > >> > > really hope it could be merged into the apache/flink-docker
> >> > > >> > > "docker-entrypoint.sh".
> >> > > >> >
> >> > > >> > The script [1] only adds the FLINK_CLASSPATH env var, which
> >> > > >> > seems generally reasonable to me. But since principled
> >> > > >> > classpath and entrypoint configuration is somewhat related to
> >> > > >> > the follow-up improvement proposals, I could also see this
> >> > > >> > being done after FLIP-111.
> >> > > >> >
> >> > > >> > > > (2) Extend the example log4j-console configuration
> >> > > >> > > > => support log retrieval from the Flink UI out of the box
> >> > > >> > >
> >> > > >> > > If you mean updating
> >> > > >> > > "flink-dist/conf/log4j-console.properties" to support
> >> > > >> > > console and local log files, I will say "+1". But we need to
> >> > > >> > > find a proper way to make the stdout/stderr output available
> >> > > >> > > both on the console and in log files. Maybe Till's proposal
> >> > > >> > > could help to solve this:
> >> > > >> > > "`program 2>&1 | tee flink-user-taskexecutor.out`"
> >> > > >> >
> >> > > >> > I think we can simply add a rolling file appender with a limit
> >> > > >> > on the log size.
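A sketch of what such an appender could look like in log4j-console.properties, assuming the log4j 1.x syntax Flink used at the time; the appender name, size limit, and pattern are illustrative:

    # Roll the file at a fixed size and keep a bounded number of backups, so
    # the log served by the web UI cannot grow indefinitely.
    log4j.appender.rolling=org.apache.log4j.RollingFileAppender
    log4j.appender.rolling.File=${log.file}
    log4j.appender.rolling.MaxFileSize=10MB
    log4j.appender.rolling.MaxBackupIndex=2
    log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
    log4j.appender.rolling.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n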
> >> > > >> I think this won't solve Yang's concern. What he wants to
> >> > > >> achieve is that STDOUT and STDERR go to STDOUT and STDERR as
> >> > > >> well as into some *.out and *.err files which are accessible
> >> > > >> from the web UI. I don't think that a log appender will help
> >> > > >> with this problem.
> >> > > >>
> >> > > >> Cheers,
> >> > > >> Till
> >> > > >>
> >> > > >> > – Ufuk
> >> > > >> >
> >> > > >> > [1]
> >> > > >> > https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/kubernetes-bin/kubernetes-entry.sh
> >>
> >> > --
> >> > Best regards / Met vriendelijke groeten,
> >> >
> >> > Niels Basjes
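To make the splitting Till describes concrete, a bash sketch of an entry-point wrapper; everything here (wrapper shape, file paths) is an illustrative assumption, not the actual flink-docker script:

    #!/usr/bin/env bash
    # Mirror the wrapped process's stdout/stderr both to the container output
    # and into .out/.err files that the web UI could serve. Uses bash process
    # substitution, so plain /bin/sh would not work.
    exec "$@" \
      > >(tee "${FLINK_HOME:-/opt/flink}/log/flink-user.out") \
      2> >(tee "${FLINK_HOME:-/opt/flink}/log/flink-user.err" >&2)

As Ufuk notes above, the files written this way still grow without bound, so some form of rotation would be needed on top.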