Hello,

I'd like to set up chained builds. I understand chained builds as "multiple 
projects which depend on each other and where changes have been pushed to 
the same branch in each."

The typical case here is the master branch. People supply features. 
Eventually, they are merged to the master branch. Now, all downstream 
builds should run if they

1. have the same branch
2. depend on the same Maven version as the project just built

If I have a logical chain of projects A -> B -> C ("->" == "depends on") 
and "C" is modified, I want to build B and then A.

This includes:

- A should wait for the build of B since there is a chance that it might 
fail otherwise
- A should only build when it has a SNAPSHOT dependency on C. If I have a 
release 1.0 of C and 2.0-SNAPSHOT in the master branch but A depends on C 
1.0, no build is necessary (but wouldn't hurt)
- Builds should not rely on some global Maven repo.
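
To make the ordering and SNAPSHOT rules concrete, here is a rough Python sketch. The `depends_on` mapping and the function name are made up for illustration; in practice the data would have to come from the POMs. I'm also assuming that a project counts as affected when it reaches the changed project through SNAPSHOT dependencies only, directly or transitively.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def is_snapshot(version):
    return version.endswith("-SNAPSHOT")


def downstream_build_order(changed, depends_on):
    """Return the projects to rebuild after `changed`, dependencies first.

    `depends_on` maps project -> {dependency: declared version}
    (hypothetical structure, not something Jenkins or Maven provides).
    """
    # Step 1: collect everything that reaches `changed` via SNAPSHOT edges.
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for project, deps in depends_on.items():
            if project in affected:
                continue
            if any(d in affected and is_snapshot(v) for d, v in deps.items()):
                affected.add(project)
                grew = True

    # Step 2: order the affected projects so that each one is built only
    # after the affected projects it depends on.
    graph = {
        project: {
            d for d, v in depends_on.get(project, {}).items()
            if d in affected and is_snapshot(v)
        }
        for project in affected
    }
    order = TopologicalSorter(graph).static_order()
    return [p for p in order if p != changed]


chain = {
    "A": {"B": "2.0-SNAPSHOT"},
    "B": {"C": "2.0-SNAPSHOT"},
    "C": {},
}
print(downstream_build_order("C", chain))
```

With the chain above, a change in C yields B first and then A. If A depended only on the release C 1.0, it would drop out of the list.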

Reasons for avoiding a global Maven repo:

A global Maven repo is very much like a global variable. Changes there 
always have side effects.

If a SNAPSHOT is pushed to a global Maven repo, everyone in the company 
will get this new version first thing in the morning (first Maven build of 
the day, when it updates SNAPSHOT dependencies). That can cause all kinds 
of weird problems. So I'm very reluctant to publish SNAPSHOT dependencies 
globally.

This becomes worse when feature branches are introduced. When several 
feature branches are built concurrently, no one can tell which version ends 
up in the global repo. When downstream builds start, they will randomly 
fail.

I haven't found a good solution for this last point.

I could create a new Maven repo when the build starts for C and then pass 
the path to B. If B gets a path to a Maven repo, it uses it; otherwise, it 
creates a new Maven repo, and so on down the chain.
Problem: When the projects build on different nodes, this fails unless I 
use a network filesystem. That adds brittleness, and I'm not 100% sure how 
Maven handles concurrent access to a local repo.
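
A minimal sketch of the "use the repo I was given, otherwise create one" idea, in Python. The environment variable `MAVEN_CHAIN_REPO` is a name I made up; the upstream job would have to pass it along somehow. The only Maven feature used is the standard `maven.repo.local` property.

```python
import os
import tempfile


def maven_command(goals, env=None):
    """Return (command, repo) for a build that uses a per-chain local repo."""
    env = os.environ if env is None else env
    # MAVEN_CHAIN_REPO is a hypothetical variable set by the upstream build.
    repo = env.get("MAVEN_CHAIN_REPO")
    if not repo:
        # Nothing was passed in, so this build is the start of a new chain.
        repo = tempfile.mkdtemp(prefix="chain-repo-")
    # maven.repo.local tells Maven which directory to use as its local repo.
    command = ["mvn", f"-Dmaven.repo.local={repo}", *goals]
    return command, repo
```

The returned `repo` is what the build would hand to its downstream jobs. This does nothing about the multi-node problem; it only works as-is when all builds see the same filesystem.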

I could use Jenkins to pass the artifacts around but that means archiving 
them on the master and then downloading them on the build node. Archiving is 
somewhat slow and a burden on the master node (especially when hundreds of 
projects build). But the main problem is to know what to download and how 
to get it into the local Maven repo. I guess I could look at the current 
job and find the upstream job and then just download all archived artifacts 
and try to install them. Not sure whether that would work.
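
For the "try to install them" step, something like the following might work once the archived files are in a directory on the node. This is an untested sketch: it assumes each archived jar has a POM next to it with the same base name (e.g. `c-2.0-SNAPSHOT.jar` and `c-2.0-SNAPSHOT.pom`), which is a convention I'd have to set up myself. It only builds the `install:install-file` command lines and does not run them.

```python
from pathlib import Path


def install_commands(artifact_dir, local_repo):
    """Build one mvn install:install-file command per jar/pom pair."""
    commands = []
    for jar in sorted(Path(artifact_dir).glob("*.jar")):
        pom = jar.with_suffix(".pom")
        if not pom.exists():
            # Without a POM the coordinates are unknown, so skip the jar.
            continue
        commands.append([
            "mvn", "install:install-file",
            f"-Dfile={jar}",
            f"-DpomFile={pom}",
            f"-Dmaven.repo.local={local_repo}",
        ])
    return commands
```

Each command could then be run with `subprocess.run(cmd, check=True)`. Finding the right upstream build to download from is still the open question.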

A more serious problem is when I find a problem in B and push a new commit 
to the feature branch. Now the build starts with B. If I rely on C creating 
the repo for me, the build will fail because the new code from C is not 
visible anymore. B will download the last master branch from the global 
Maven repo and fail. If I use the "copy archived artifacts" approach, I 
have the same problem because there is no upstream "C" job anymore.

So I could create a local Maven repo using the branch name. That would help 
with feature branches but raise new issues: When can I safely delete those? 
If I delete too early (say, every night), starting a build with B will 
randomly fail again.
It would also mean that a lot of projects would eventually build into the 
repo with the name "master". There would be almost no way to clean that up. 
Maybe I can use the global Maven repo for "master" + "release" builds and 
local repos for everything else. Then people would have to remember that 
when debugging build problems.
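
The "global repo for master and release, local repo for everything else" rule is small enough to write down. In this sketch the base directory and the `release` prefix are assumptions about naming, not something we have decided.

```python
from pathlib import PurePosixPath


def local_repo_for(branch, base="/var/ci/maven-repos"):
    """Return the per-branch repo path, or None to use the global repo."""
    # Assumed naming: "master" and anything starting with "release".
    if branch == "master" or branch.startswith("release"):
        return None
    # Branch names may contain slashes; flatten them into one directory name.
    safe_name = branch.replace("/", "_")
    return str(PurePosixPath(base) / safe_name)
```

This does not answer when a branch repo can be deleted; it only makes the split explicit so that it is at least documented in one place.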

One option would be to deny commits to master which contain SNAPSHOT 
dependencies (so the project itself could use a SNAPSHOT version but all 
the dependencies would have to be releases). That sounds like a good 
solution at first but in our case, "A" is a client specific project. For 
some products ("B"), we have 20+ clients. Most of them stay at the latest 
release build but a few are part of the next release. It would be a big 
overhead to force a release of the product every time a new feature is 
integrated into a client. Imagine having to do release builds 2-3 times a 
day. We would prefer to have one release build per release cycle of a 
product and keep the product and all involved clients at SNAPSHOT for the 
whole cycle because that would allow us to notice early when some feature 
for client X breaks client Y.
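
For reference, the check that such a commit gate would need is roughly the following. It is a simplified sketch using only the Python standard library: it looks at direct `<dependencies>` entries with a literal version and ignores the parent, `dependencyManagement`, and versions set through properties.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven 4.0.0 POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def snapshot_dependencies(pom_xml):
    """Return 'groupId:artifactId:version' for each SNAPSHOT dependency."""
    root = ET.fromstring(pom_xml)
    found = []
    # The project's own <version> is deliberately not checked, since the
    # project itself may stay at a SNAPSHOT version.
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        version = dep.findtext("m:version", default="", namespaces=POM_NS)
        if version.endswith("-SNAPSHOT"):
            group = dep.findtext("m:groupId", default="", namespaces=POM_NS)
            artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS)
            found.append(f"{group}:{artifact}:{version}")
    return found
```

A hook would reject the commit when the returned list is non-empty.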

So my final design looks like this:

- Every project has one local Maven repo per branch (somehow shared between 
nodes)
- When a build of C succeeds, it triggers B. A waits. B copies the whole 
repo of C into its own.
- When B has been built, all the A's copy B's repo into their own and build.

That would allow us to start a chained build at any point.
If the Maven repos get corrupted, we can delete them all and trigger a 
build of C to recreate them.
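
The copy step in this design could look like this in Python. The function name and the paths are placeholders; how the repos are shared between nodes is still open.

```python
import shutil
from pathlib import Path


def seed_repo_from_upstream(upstream_repo, own_repo):
    """Merge the upstream project's repo into this project's repo."""
    upstream = Path(upstream_repo)
    own = Path(own_repo)
    own.mkdir(parents=True, exist_ok=True)
    if upstream.is_dir():
        # dirs_exist_ok (Python 3.8+) merges into the existing tree and
        # overwrites files that have the same relative path.
        shutil.copytree(upstream, own, dirs_exist_ok=True)
    return own
```

If the upstream repo does not exist yet, the project simply starts with an empty repo of its own, which matches the "start a chained build at any point" idea.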

What I don't like here is the massive disk usage. Even for simple projects, 
Maven downloads 100-200 MB of code for its plugins. So 100 repos would need 
10-20 GB of disk space (10 projects with 10 feature branches). It's also 
somewhat slow but in my tests, copying 200 MB of Maven repo took < 10s, so 
it's bearable.

Also, I'm not sure how to solve "fragmented" chains when there is a feature 
branch for A and C but not for B. In this case, the build of C needs to 
trigger just A, skipping B. A may depend on the SNAPSHOT version of B from 
the master branch.
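
One way to think about the fragmented case, as a sketch: treat the chain as a simple list (most upstream first) and skip projects that don't have the branch. Both helpers and the idea of a known `projects_with_branch` set are assumptions; something would have to ask the SCM which projects have the branch.

```python
def next_project_with_branch(changed, chain, projects_with_branch):
    """Return the nearest downstream project that has the branch, or None.

    `chain` lists projects from most upstream to most downstream,
    e.g. ["C", "B", "A"] (assumes a single linear chain).
    """
    position = chain.index(changed)
    for project in chain[position + 1:]:
        if project in projects_with_branch:
            return project
    return None


def branch_to_resolve(project, branch, projects_with_branch):
    """Pick which branch's artifacts to use for a dependency."""
    # Fall back to master when the project has no such feature branch.
    return branch if project in projects_with_branch else "master"
```

For a feature branch that exists in A and C only, `next_project_with_branch("C", ["C", "B", "A"], {"A", "C"})` gives A, and A would take B from master.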

Any comments? Has someone already set up something like this? Does this 
work reliably? How can I solve the disk space issue?

Regards,

-- 
Aaron Digulla
