Elek, Marton created HDDS-1516:
----------------------------------

             Summary: Move ozone-build container definition from dev-support 
and publish dedicated image
                 Key: HDDS-1516
                 URL: https://issues.apache.org/jira/browse/HDDS-1516
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
          Components: build
            Reporter: Elek, Marton


hadoop-ozone/dev-support/docker/Dockerfile contains a docker container 
definition that provides a generic build environment for ozone builds.

This container (or more precisely, an improved version of this container) is 
used to run all the build commands inside the container on Jenkins.

As of now it's uploaded as elek/ozone-build and works well (all github PR check 
builds are executed in this container).

I propose to move it to the hadoop-docker-ozone repo and publish an 
apache/ozone-buildenv docker image from it.

Note: there are two interesting tricks in the Dockerfile:

1.) a lot of users are created (from id=1 to id=4000)

Reason: the kerberized unit tests require a real user. Jenkins runs the 
container with the same numeric uid as outside (e.g. with the -u 1000 flag) 
even if no such user exists inside the container. And we can't predict the uid 
of the build process (on my jenkins it's 1000 (elek); on builds.apache.org it's 
something between 400 and 500, as I remember).

./start-build-env.sh takes a different approach: it creates a docker image 
on-demand (with only the required user). While it works well, I realized that 
the image creation is not cached very well on jenkins and it may take >10 
minutes for each build.
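
As an illustration of the first trick, here is a minimal sketch of how a range 
of users could be generated at image build time. It is not the content of the 
actual Dockerfile: the helper name gen_passwd_entries, the "user<uid>" naming 
pattern, the home directory path and the login shell are all assumptions made 
for the example.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the real Dockerfile): print
# /etc/passwd-style entries for every uid in the range [first, last].
# The "user<uid>" name pattern and the /home path are assumptions.
gen_passwd_entries() {
    first=$1
    last=$2
    uid=$first
    while [ "$uid" -le "$last" ]; do
        # Fields: name:password:uid:gid:comment:home:shell
        printf 'user%s:x:%s:%s::/home/user%s:/bin/sh\n' \
            "$uid" "$uid" "$uid" "$uid"
        uid=$((uid + 1))
    done
}

# During an image build the output could be appended to /etc/passwd, e.g.:
#   gen_passwd_entries 1 4000 >> /etc/passwd
# A real image would also have to skip uids that already exist in the base
# image; that check is left out of this sketch.
gen_passwd_entries 1000 1002
```

With entries like these in place, whatever numeric uid Jenkins passes with -u 
should resolve to a user name inside the container, without building a 
per-user image for each build.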

2.) The other question is which maven repository to use. We prefer to separate 
the local maven repositories of parallel builds to avoid any conflict (if one 
build runs mvn install, the other build may use that jar from the local maven 
repository). Docker can guarantee a strong separation, but it also means that 
we need to download about 1GB of files for each build (which is also very time 
consuming).

Earlier we started to use an approach to cache all the 3rd party jar files in 
the docker image itself.

As a result we will have a huge buildenv image (1-2 GB), but the image download 
is faster: a docker image can be downloaded as a few huge files, so we don't 
need to download thousands of jar files one-by-one. The huge docker image can 
also be cached on the build machine without any risk.

With this approach we cut the 10-20 minutes of build time to 2-3 minutes.
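
To make the second trick more concrete, the sketch below prints (but does not 
run) the kind of command a CI job might use with such an image. The image name 
apache/ozone-buildenv is the one proposed above; the helper name 
print_build_cmd, the /workdir mount point and the /opt/m2-cache repository 
location are assumptions made for the example. -Dmaven.repo.local is the 
standard maven property for choosing the local repository.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the docker command instead of running it.
# /opt/m2-cache is an assumed location for the jars baked into the image.
print_build_cmd() {
    uid=$1        # numeric uid of the CI user, e.g. from `id -u`
    workspace=$2  # checkout directory on the host (assumed to have no spaces)
    printf 'docker run --rm -u %s -v %s:/workdir -w /workdir apache/ozone-buildenv mvn -Dmaven.repo.local=%s clean install\n' \
        "$uid" "$workspace" "/opt/m2-cache"
}

# Show the command for the current user and directory.
print_build_cmd "$(id -u)" "$PWD"
```

Because every container gets its own writable layer on top of the image, two 
parallel builds started this way would not see each other's installed jars. 
The repository directory would also have to be writable by the uid passed with 
-u, which this sketch does not address.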


