Thank you, Eric, for describing the problem.
I have several small comments, which I'll try to keep separate.
I. Separate vs. in-build container image creation
> The disadvantages are:
>
> 1. Require developer to have access to docker.
> 2. Default build takes longer.
These are not the only disadvantages (IMHO), as I wrote in the
previous thread and in the issue [1].
In-build container image creation doesn't make it possible to:
1. modify the image later (eg. apply security fixes to the container
itself or improve the startup scripts)
2. create images for older releases (eg. hadoop 2.7.1)
I think there are two kinds of images:
a) images for released artifacts
b) developer images
I would prefer to manage a) with separate branches/repositories, but b)
with an (optional!) in-build process.
II. I agree with Steve. I think it's better to make it optional, as most
of the time it's not required. The default dev build should work with
the default settings (= just enough to start).
III. Maven best practices
(https://dzone.com/articles/maven-profile-best-practices)
I think this is a good article. But it does not argue against profiles;
it argues against creating multiple versions of the same artifact with
the same name (eg. jdk8/jdk11). In Hadoop, profiles are used to introduce
optional steps. I think that's fine, as the maven lifecycle/phase model
is very static (compare it with the tree-based approach in Gradle).
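To illustrate the difference, here is a minimal sketch of a profile that
only adds an optional step and does not change the artifact itself. The
profile id (docker-build), the execution id and the image name are made
up for the example, this is not existing Hadoop configuration, and it
assumes a Dockerfile in the module directory and a version of
exec-maven-plugin managed elsewhere (eg. in pluginManagement):

```xml
<!-- Hypothetical profile: names are illustrative, not taken from the Hadoop poms. -->
<profile>
  <id>docker-build</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <!-- plugin version assumed to be managed in pluginManagement -->
        <executions>
          <execution>
            <id>docker-image</id>
            <!-- runs after the normal artifacts have been packaged -->
            <phase>package</phase>
            <goals>
              <goal>exec</goal>
            </goals>
            <configuration>
              <executable>docker</executable>
              <arguments>
                <argument>build</argument>
                <argument>-t</argument>
                <!-- image name is an assumption for the example -->
                <argument>example/hadoop-dev:${project.version}</argument>
                <!-- build context: the module directory (assumes a Dockerfile there) -->
                <argument>.</argument>
              </arguments>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```

With something like this, "mvn package" behaves exactly as before and
needs no docker, while "mvn package -Pdocker-build" additionally builds
the image. The jar artifacts are the same in both cases.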
Marton
[1]: https://issues.apache.org/jira/browse/HADOOP-16091
On 3/13/19 11:24 PM, Eric Yang wrote:
> Hi Hadoop developers,
>
> In the recent months, there were various discussions on creating docker build
> process for Hadoop. There was convergence to make docker build process
> inline in the mailing list last month when Ozone team is planning new
> repository for Hadoop/ozone docker images. New feature has started to add
> docker image build process inline in Hadoop build.
> A few lessons learnt from making docker build inline in YARN-7129. The build
> environment must have docker to have a successful docker build. BUILD.txt
> stated for easy build environment use Docker. There is logic in place to
> ensure that absence of docker does not trigger docker build. The inline
> process tries to be as non-disruptive as possible to existing development
> environment with one exception. If docker’s presence is detected, but user
> does not have rights to run docker. This will cause the build to fail.
>
> Now, some developers are pushing back on inline docker build process because
> existing environment did not make docker build process mandatory. However,
> there are benefits to use inline docker build process. The listed benefits
> are:
>
> 1. Source code tag, maven repository artifacts and docker hub artifacts can
> all be produced in one build.
> 2. Less manual labor to tag different source branches.
> 3. Reduce intermediate build caches that may exist in multi-stage builds.
> 4. Release engineers and developers do not need to search a maze of build
> flags to acquire artifacts.
>
> The disadvantages are:
>
> 1. Require developer to have access to docker.
> 2. Default build takes longer.
>
> There is workaround for above disadvantages by using -DskipDocker flag to
> avoid docker build completely or -pl !modulename to bypass subprojects.
> Hadoop development did not follow Maven best practice because a full Hadoop
> build requires a number of profile and configuration parameters. Some
> evolutions are working against Maven design and require fork of separate
> source trees for different subprojects and pom files. Maven best practice
> (https://dzone.com/articles/maven-profile-best-practices) has explained that
> do not use profile to trigger different artifact builds because it will
> introduce maven artifact naming conflicts on maven repository using this
> pattern. Maven offers flags to skip certain operations, such as -DskipTests
> -Dmaven.javadoc.skip=true -pl or -DskipDocker. It seems worthwhile to make
> some corrections to follow best practice for Hadoop build.
>
> Some developers have advocated for separate build process for docker images.
> We need consensus on the direction that will work best for Hadoop development
> community. Hence, my questions are:
>
> Do we want to have inline docker build process in maven?
> If yes, it would be developer’s responsibility to pass -DskipDocker flag to
> skip docker. Docker is mandatory for default build.
> If no, what is the release flow for docker images going to look like?
>
> Thank you for your feedback.
>
> Regards,
> Eric
>