Thank you, Eric, for describing the problem.

I have several small comments; I'll try to keep them separate.

I. Separate vs. in-build container image creation

> The disadvantages are:
>
> 1.  Require developer to have access to docker.
> 2.  Default build takes longer.


These are not the only disadvantages (IMHO), as I wrote in the
previous thread and in the issue [1].

In-build container image creation doesn't make it possible to:

1. modify the image later (e.g. apply security fixes to the container
itself or improve the startup scripts)
2. create images for older releases (e.g. hadoop 2.7.1)

I think there are two kinds of images:

a) images for released artifacts
b) developer images

I would prefer to manage a) in separate branches or repositories, but b)
with an (optional!) in-build process.

II. I agree with Steve. I think it's better to make it optional, as most
of the time it's not required. The default dev build should keep working
with the default settings (= just enough to start).

III. Maven best practices

(https://dzone.com/articles/maven-profile-best-practices)

I think this is a good article. But it doesn't argue against profiles
as such; it argues against creating multiple versions of the same
artifact under the same name (e.g. jdk8/jdk11). In Hadoop, profiles are
used to introduce optional steps. I think that's fine, as the Maven
lifecycle/phase model is very static (compare it with the tree-based
approach in Gradle).
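
As a rough sketch of what an optional step behind a profile could look
like: the profile id "docker" and the image name below are made up for
illustration and are not taken from any Hadoop pom. The plugin is the
generic exec-maven-plugin, used here only to show the idea, not as a
proposal for the actual implementation.

```xml
<profiles>
  <profile>
    <!-- Hypothetical profile id; the step runs only with "mvn package -Pdocker" -->
    <id>docker</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.codehaus.mojo</groupId>
          <artifactId>exec-maven-plugin</artifactId>
          <!-- Example version only; pin whatever the project already uses -->
          <version>3.1.0</version>
          <executions>
            <execution>
              <id>docker-build</id>
              <!-- Bound to the package phase, after the artifacts exist -->
              <phase>package</phase>
              <goals>
                <goal>exec</goal>
              </goals>
              <configuration>
                <executable>docker</executable>
                <arguments>
                  <argument>build</argument>
                  <argument>-t</argument>
                  <!-- Hypothetical image name -->
                  <argument>example/hadoop-dev:${project.version}</argument>
                  <argument>.</argument>
                </arguments>
              </configuration>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
```

With something like this, a plain "mvn package" never touches docker,
and the artifact names stay the same whether or not the profile is
active, so it doesn't run into the naming problem the article describes.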

Marton

[1]: https://issues.apache.org/jira/browse/HADOOP-16091

On 3/13/19 11:24 PM, Eric Yang wrote:
> Hi Hadoop developers,
> 
> In the recent months, there were various discussions on creating docker build 
> process for Hadoop.  There was convergence to make docker build process 
> inline in the mailing list last month when Ozone team is planning new 
> repository for Hadoop/ozone docker images.  New feature has started to add 
> docker image build process inline in Hadoop build.
> A few lessons learnt from making docker build inline in YARN-7129.  The build 
> environment must have docker to have a successful docker build.  BUILD.txt 
> stated for easy build environment use Docker.  There is logic in place to 
> ensure that absence of docker does not trigger docker build.  The inline 
> process tries to be as non-disruptive as possible to existing development 
> environment with one exception.  If docker’s presence is detected, but user 
> does not have rights to run docker.  This will cause the build to fail.
> 
> Now, some developers are pushing back on inline docker build process because 
> existing environment did not make docker build process mandatory.  However, 
> there are benefits to use inline docker build process.  The listed benefits 
> are:
> 
> 1.  Source code tag, maven repository artifacts and docker hub artifacts can 
> all be produced in one build.
> 2.  Less manual labor to tag different source branches.
> 3.  Reduce intermediate build caches that may exist in multi-stage builds.
> 4.  Release engineers and developers do not need to search a maze of build 
> flags to acquire artifacts.
> 
> The disadvantages are:
> 
> 1.  Require developer to have access to docker.
> 2.  Default build takes longer.
> 
> There is workaround for above disadvantages by using -DskipDocker flag to 
> avoid docker build completely or -pl !modulename to bypass subprojects.
> Hadoop development did not follow Maven best practice because a full Hadoop 
> build requires a number of profile and configuration parameters.  Some 
> evolutions are working against Maven design and require fork of separate 
> source trees for different subprojects and pom files.  Maven best practice 
> (https://dzone.com/articles/maven-profile-best-practices) has explained that 
> do not use profile to trigger different artifact builds because it will 
> introduce maven artifact naming conflicts on maven repository using this 
> pattern.  Maven offers flags to skip certain operations, such as -DskipTests 
> -Dmaven.javadoc.skip=true -pl or -DskipDocker.  It seems worthwhile to make 
> some corrections to follow best practice for Hadoop build.
> 
> Some developers have advocated for separate build process for docker images.  
> We need consensus on the direction that will work best for Hadoop development 
> community.  Hence, my questions are:
> 
> Do we want to have inline docker build process in maven?
> If yes, it would be developer’s responsibility to pass -DskipDocker flag to 
> skip docker.  Docker is mandatory for default build.
> If no, what is the release flow for docker images going to look like?
> 
> Thank you for your feedback.
> 
> Regards,
> Eric
> 
