Thanks Eric for the answers.
If I understood well, these are two proposals (use the same repository, use an inline build). I created separate jiras for both of them where we can discuss the technical details:

https://issues.apache.org/jira/browse/HADOOP-16092
https://issues.apache.org/jira/browse/HADOOP-16091

Until the jiras are implemented we can use the existing approach, but (again) I am fine with switching to any newer approach at any time. The only thing we need is the availability of the images during any transition.

I started to document the current state in the wiki to make the discussion easier:

https://cwiki.apache.org/confluence/display/HADOOP/Container+support
https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Container+support

Marton

On 1/31/19 8:00 PM, Eric Yang wrote:
> 1, 3. There are 38 Apache projects hosting docker images on Docker Hub under the Apache organization. Browsing the Apache github mirror, I found only 7 projects using a separate repository for the docker image build. Popular projects' official images are not from the Apache organization, such as zookeeper, tomcat, httpd. We may not disrupt what other Apache projects are doing, but it looks like the inline build process is widely employed by the majority of projects, such as Nifi, Brooklyn, thrift, karaf, syncope and others. The situation seems a bit chaotic for Apache as a whole; however, the Hadoop community can decide what is best for Hadoop. My preference is to remove ozone from the source tree naming, if Ozone is intended to be a subproject of Hadoop for a long period of time. This enables the Hadoop community to host docker images for various subprojects without having to check out several source trees to trigger a grand build. However, the inline build process seems more popular than the separated process, hence I highly recommend making the docker build inline if possible.
>
> 2. I think we can open an INFRA ticket; there are Jenkins users who can configure the job to run on nodes that have the Apache repo credential.
>
> 4. The docker image name maps to the maven project name. Hence, if the project name is Hadoop-ozone, the convention automatically follows the maven artifact name, with an option to customize. I think it is reasonable, and the image is automatically tagged with the same maven project version, which minimizes version number management between maven and docker.
>
> Regards,
> Eric
>
> On 1/31/19, 8:59 AM, "Elek, Marton" <e...@apache.org> wrote:
>
> Hi Eric,
>
> Thanks for the answers.
>
> 1.
>
> > Hadoop-docker-ozone.git source tree naming seems to create a unique process for Ozone.
>
> Not at all. We would like to follow the existing practice which was established in HADOOP-14898. In HDDS-851 we discussed why we need two separate repositories for hadoop/ozone: because of the limitation of the dockerhub branch/tag mapping.
>
> I am 100% open to switching to another approach. I would suggest creating a JIRA for that, as it requires code modification in the docker-hadoop-* branches.
>
> 2.
>
> > Flagging automated build on dockerhub seems to conflict with Apache release policy.
>
> Honestly I don't know. It was discussed in HADOOP-14898 and the connected INFRA ticket and there were no arguments against it, especially as we just followed the practice started by other projects.
>
> Now that I have checked the docker related INFRA tickets again, it seems that we have had two other practices since then:
>
> 1) build the docker image on jenkins (is it compliant?)
> 2) get permission to push to apache/... from a local machine.
>
> You suggested the second one. Do you have more information on how it is possible? How, and who, can request permission to push to apache/hadoop, for example?
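Just to make the question concrete: assuming INFRA grants push rights on the apache/ Docker Hub organization to a named personal account, I imagine the push from a local machine would look roughly like this (the account, local image and tag names below are placeholders only, not something we have today):

    docker login -u some-dockerhub-user          # credentials/permission granted via an INFRA request
    docker tag hadoop:3.2.0 apache/hadoop:3.2.0  # retag the locally built image into the apache namespace
    docker push apache/hadoop:3.2.0

The open part for me is step zero: who approves such a request and which accounts can be added.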
> 3.
>
> From one point of view, publishing existing, voted releases as docker images is something like repackaging them. But you may be right that this is wrong, because they should be handled as separate releases.
>
> Do you know any official ASF wiki/doc/mail discussion about managing docker images? If not, I would suggest creating a new wiki/doc, as it seems that we have no clear answer on which is the most compliant way to do it.
>
> 4.
>
> Thank you for the suggestion to use dockerhub/own namespace to stage docker images during the build. Sounds good to me. But I also wrote about some other problems in my previous mail (3 b, c, d); this is just one of them (3/a). Do you have any suggestion to solve the other problems?
>
> * Updating existing images (for example in case of an ssl bug, rebuild all the existing images with exactly the same payload but an updated base image/os environment)
>
> * Creating images for older releases (we would like to provide images for hadoop 2.6/2.7/2.8/2.9, especially for doing automatic testing with different versions).
>
> Thanks a lot,
> Marton
>
> On 1/30/19 6:50 PM, Eric Yang wrote:
> > Hi Marton,
> >
> > Flagging an automated build on dockerhub seems to conflict with the Apache release policy. The vote and release process are manual processes of the Apache Way. Therefore, the 3 b)-3 d) improvements will be out of reach unless the policy changes.
> >
> > YARN-7129 is straightforward: it uses dockerfile-maven-plugin to build the docker image locally. It also checks for the existence of /var/run/docker.sock to ensure docker is running. This allows the docker image to be built in a developer sandbox, if the sandbox mounts the host /var/run/docker.sock. Maven deploy can configure the repository location and authentication credentials using ~/.docker/config.json and maven settings.xml. This can upload a release candidate image to the release manager's dockerhub account for the release vote. Once the vote passes, the image can be pushed to the official Apache dockerhub repository by the release manager or by an Apache Jenkins job that tags the image and pushes it to the Apache account.
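If I read this correctly, the flow from the release manager's side would be something like the following (the plugin goal name, the RM's account and the RC tag are assumptions for illustration, not settled conventions):

    mvn package dockerfile:build                                        # build the image locally via dockerfile-maven-plugin
    docker tag apache/hadoop:3.2.0 some-rm-account/hadoop:3.2.0-RC0     # stage the candidate under the RM's own namespace
    docker push some-rm-account/hadoop:3.2.0-RC0                        # the vote happens on this staged image
    docker tag some-rm-account/hadoop:3.2.0-RC0 apache/hadoop:3.2.0     # after a successful vote, promote the same image
    docker push apache/hadoop:3.2.0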
> > The Ozone image and the application catalog image are in a similar situation: the test image can be built and tested locally, and the official voted artifacts can be uploaded to the Apache dockerhub account. Hence, fewer variants of the same procedure would be great. Hadoop-docker-ozone.git source tree naming seems to create a unique process for Ozone. I think it would be preferable to have a Hadoop-docker.git that comprises all docker image builds, or the dockerfile-maven-plugin approach.
> >
> > Regards,
> > Eric
> >
> > On 1/30/19, 12:56 AM, "Elek, Marton" <e...@apache.org> wrote:
> >
> > Thanks Eric for the suggestions.
> >
> > Unfortunately (as Anu wrote) our use-case is slightly different.
> >
> > It was discussed in HADOOP-14898 and HDDS-851, but let me summarize the motivation:
> >
> > We would like to upload containers to dockerhub for each release (e.g. apache/hadoop:3.2.0).
> >
> > According to the Apache release policy, it's not allowed to publish snapshot builds (= not voted on by the PMC) outside of the developer community.
> >
> > 1. We started to follow the pattern which is used by other Apache projects: docker containers are just a different packaging of the already voted binary releases. Therefore we create the containers from the voted releases. (See [1] as an example.)
> >
> > 2. By separating the build of the source code and the docker image we get additional benefits: for example we can rebuild the images in case of a security problem in the underlying container OS. This is just a new empty commit on the branch, and the original release will be repackaged.
> >
> > 3. Technically it would be possible to add the Dockerfile to the source tree and publish the docker image together with the release by the release manager, but this is also problematic:
> >
> > a) there is no easy way to stage the images for the vote
> > b) we have no access to the apache dockerhub credentials
> > c) it couldn't be flagged as automated on dockerhub
> > d) it couldn't support the critical updates I wrote about in (2.)
> >
> > So the easy way we found is to ask INFRA to register a branch with dockerhub to use for the image creation. The build/packaging will be done by dockerhub, but only released artifacts will be included. Because of the limitation of dockerhub in mapping branch names to tags, we need a new repository instead of a branch (see the comments in HDDS-851 for more details).
> >
> > We also have a different use case: building developer images to create a test cluster. These images will never be uploaded to the hub. We have a Dockerfile in the source tree for this use case (see HDDS-872). And thank you very much for the hint, I will definitely check how YARN-7129 does it and will try to learn from it.
> >
> > Thanks,
> > Marton
> >
> > [1]: https://github.com/apache/hadoop/tree/docker-hadoop-3
> >
> > On 1/30/19 2:50 AM, Anu Engineer wrote:
> > > Marton, please correct me if I am wrong, but I believe that without this branch it is hard for us to push to the Apache DockerHub. This allows for Apache account integration with dockerHub.
> > > Does YARN publish to the Docker Hub via the Apache account?
> > >
> > > Thanks
> > > Anu
> > >
> > > On 1/29/19, 4:54 PM, "Eric Yang" <ey...@hortonworks.com> wrote:
> > >
> > > Separating the Hadoop docker related build into a separate git repository is a bit of a slippery slope. It is harder to synchronize changes between two separate source trees, and there is a multi-step process to build the jar, tarball, and docker images. This might be problematic to reproduce.
> > >
> > > It would be best to arrange the code such that the docker image build process can be invoked as part of the maven build process. The profile is activated only if docker is installed and running in the environment. This allows producing the jar, tarball, and docker images all at once without hindering the existing build procedure.
> > >
> > > YARN-7129 is one example of making a subproject in YARN that builds a docker image which can run in YARN. It automatically detects the presence of docker and builds the docker image when docker is available. If docker is not running, the subproject is skipped and the build proceeds to the next sub-project. Please try out the YARN-7129 style of build process and see whether it is a possible solution to the docker image generation issue.
> > >
> > > Thanks.
> > >
> > > Regards,
> > > Eric
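If I understand the idea, the "build the image only when docker is available" check could be sketched in shell like this (only an illustration of the check; YARN-7129 itself wires this into the maven profile, and the image name and version variable here are placeholders):

    if [ -S /var/run/docker.sock ] && docker info >/dev/null 2>&1; then
      docker build -t apache/hadoop:"${HADOOP_VERSION}" .
    else
      echo "docker is not running; skipping the image build" >&2
    fi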
> > > On 1/29/19, 3:44 PM, "Arpit Agarwal" <aagar...@cloudera.com.INVALID> wrote:
> > >
> > > I've requested a new repo hadoop-docker-ozone.git in gitbox.
> > >
> > > > On Jan 22, 2019, at 4:59 AM, Elek, Marton <e...@apache.org> wrote:
> > > >
> > > > TLDR;
> > > >
> > > > I proposed to create a separate git repository for ozone docker images in HDDS-851 (hadoop-docker-ozone.git).
> > > >
> > > > If there are no objections in the next 3 days I will ask an Apache Member to create the repository.
> > > >
> > > > LONG VERSION:
> > > >
> > > > In HADOOP-14898 multiple docker containers and helper scripts were created for Hadoop.
> > > >
> > > > The main goals were to:
> > > >
> > > > 1.) help development with easy-to-use docker images
> > > > 2.) provide official hadoop images to make it easy to test new features
> > > >
> > > > As of now we have:
> > > >
> > > > - the apache/hadoop-runner image (which contains the required dependencies but no hadoop)
> > > > - apache/hadoop:2 and apache/hadoop:3 images (to try out the latest hadoop from the 2/3 lines)
> > > >
> > > > The base image to run hadoop (apache/hadoop-runner) is also heavily used for Ozone distribution/development.
> > > >
> > > > The Ozone distribution contains docker-compose based cluster definitions to start various types of clusters and scripts to do smoketesting. (See HADOOP-16063 for more details.)
> > > >
> > > > Note: I personally believe that these definitions help a lot in starting different types of clusters. For example, it could be tricky to try out router based federation as it requires multiple HA clusters, but with a simple docker-compose definition [1] it can be started in under 3 minutes. (HADOOP-16063 is about creating these definitions for various hdfs/yarn use cases.)
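To make the docker-compose part concrete: starting a small test cluster from the definitions shipped with the distribution looks roughly like the following (the compose directory name is taken from the Ozone distribution layout and may differ between versions):

    cd compose/ozone
    docker-compose up -d --scale datanode=3   # bring up a cluster with three datanodes
    # ... run the smoketests or experiment with the cluster ...
    docker-compose down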
> > > > As of now we have dedicated branches in the hadoop git repository for the docker images (docker-hadoop-runner, docker-hadoop-2, docker-hadoop-3). It turns out that a separate repository would be more effective, as dockerhub can use only full branch names as tags.
> > > >
> > > > We would like to provide ozone docker images to make the evaluation as easy as 'docker run -d apache/hadoop-ozone:0.3.0', therefore in HDDS-851 we agreed to create a separate repository for the hadoop-ozone docker images.
> > > >
> > > > If this approach works well we can also move the existing docker-hadoop-2/docker-hadoop-3/docker-hadoop-runner branches out of hadoop.git into another separate hadoop-docker.git repository.
> > > >
> > > > Please let me know if you have any comments.
> > > >
> > > > Thanks,
> > > > Marton
> > > >
> > > > [1]: see https://github.com/flokkr/runtime-compose/tree/master/hdfs/routerfeder as an example