Good evening everyone,

I've heard great things about Jenkins over the years, and finally decided 
to give it a serious test drive at work recently. My experiences have been 
massively underwhelming thus far, and I'm hoping this is because of a 
critical knowledge gap on my part regarding Jenkins' capabilities. I 
welcome any pointers, insight, and recommendations.

Our use case:

   - We need to build *22 different computational tools*.
   - Our binaries are collectively built from *281 individual libraries*.
   - We use *60 to 70 build machines* with hardware ranging from 2 cores and 
   3 GB of RAM to 64 cores and 1 TB of RAM.

We've logically divided up our build process into several stages and tasks:

   - *Stage 1: Prepare a node for building.* 1 task (let's call it *prepare*) 
   to sync code, set up a temporary directory to work in, and run 
   parsers/generators (lex, yacc, bison, etc.).
   - *Stage 2: Build a library.* 281 different tasks (let's call them 
   *build_lib1*, *build_lib2*, *build_lib3*, etc.) for 281 different 
   libraries.
   - *Stage 3: Build a binary.* 22 different tasks (let's call them 
   *build_binA*, *build_binB*, *build_binC*, etc.) for 22 different 
   binaries.

We've been able to hook Jenkins up to our SCM system without any issue, and 
Jenkins begins building within seconds after an engineer makes a commit. 
That part works like a charm. But as soon as we try to use Jenkins to 
support any kind of complex build workflow, we run into massive 
inefficiencies. Here's what we want to achieve, but I'm not certain if 
Jenkins even supports this kind of behavior:

   1. As soon as a commit is made, all available nodes enter into *Stage 1* by 
   running the *prepare* task. They sync, create directories, and run 
   generators. Our fastest nodes can complete this stage in about 50 
   seconds, while others take upwards of 10 minutes.
   2. As soon as a particular node finishes Stage 1, it should immediately 
   enter *Stage 2* and begin running library build tasks (*build_lib1*, 
   *build_lib2*, etc.). This stage can take several hours to complete. Each 
   node should continue performing library build tasks until all 281 have been 
   completed and there are no more tasks in the Stage 2 queue. When a lib is 
   finished building, it's copied to a network path visible to all other nodes.
   3. The instant Stage 2 completes, all nodes should enter *Stage 3* and 
   begin running binary build tasks (*build_binA*, *build_binB*, etc.). When 
   a binary is finished building, it is uploaded to a network path shared by 
   all binaries from this run.
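
To make the intended scheduling concrete, here is a rough sketch in plain Python. It is not Jenkins configuration, and it is only a model of the behavior we're after: each node prepares itself, pulls library tasks from a shared queue as soon as it is ready, and waits at a gate until every node has run out of Stage 2 work before starting on binaries. The names `run_farm`, `drain`, and `node_main` are made up for illustration, and the real builds are replaced by a placeholder that just records an event.

```python
import queue
import threading


def run_farm(num_nodes, lib_tasks, bin_tasks):
    """Simulate the desired flow; returns a list of (stage, node, task) events."""
    lib_queue = queue.Queue()
    for task in lib_tasks:
        lib_queue.put(task)
    bin_queue = queue.Queue()
    for task in bin_tasks:
        bin_queue.put(task)

    events = []
    events_lock = threading.Lock()
    # Stage 3 gate: opens only once every node has found the Stage 2 queue
    # empty and has finished the library it was working on.
    stage2_done = threading.Barrier(num_nodes)

    def record(stage, node, task):
        with events_lock:
            events.append((stage, node, task))

    def drain(stage, node, task_queue):
        # Keep pulling tasks until the shared queue for this stage is empty.
        while True:
            try:
                task = task_queue.get_nowait()
            except queue.Empty:
                return
            record(stage, node, task)  # placeholder for the real build step

    def node_main(node):
        record(1, node, "prepare")   # Stage 1: runs once on every node
        drain(2, node, lib_queue)    # Stage 2: starts as soon as this node is ready
        stage2_done.wait()           # gate: all libraries are built past this point
        drain(3, node, bin_queue)    # Stage 3: binaries

    threads = [threading.Thread(target=node_main, args=(n,)) for n in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

Calling `run_farm(4, libs, bins)` with 281 library names and 22 binary names yields one *prepare* event per node, every library built exactly once, and no Stage 3 event before the last Stage 2 event. A fast node never waits on a slow node's *prepare*; the only synchronization point is the gate between Stage 2 and Stage 3.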

As best as I can tell, Jenkins either does not support the above workflow, 
or requires such a complex combination of plugins that none of us at work 
have been able to get anything fully working. Here are the issues we're 
facing:

   1. *How do we tell Jenkins to execute the same job (prepare) on all 
   nodes?* We must be missing something very fundamental here because this 
   is what I would consider core functionality in any build system. We want to 
   be able to say "Run this job (*prepare*) on all nodes that match label 
   XYZ", but it seems like Jenkins only runs the job once, on some random node 
   in that label group. To execute the same job simultaneously 
   across all nodes, we've had to resort to some pretty ugly hacks or weird 
   combinations of plugins, none of which are what I would call 
   production-ready.
   2. *How do we tell a Jenkins node to begin Stage 2 immediately after it 
   finishes Stage 1?* Once our fastest nodes finish Stage 1, they're ready 
   to begin building libraries, and do not need to wait for the slower nodes 
   to finish preparing themselves. Our 64-core machines are often ready in 
   about 50 seconds, whereas many of our 2-core and 4-core machines take 8 to 
   10 minutes to be ready. Our builds can take several hours to complete, and 
   if we have our entire farm sitting idle while we wait on our slowest nodes 
   to finish preparing themselves, we're wasting massive amounts of resources.
   3. *How do we tell a Jenkins node to begin Stage 3 immediately after 
   all tasks in Stage 2 have completed?* Once tasks in Stage 2 complete, we 
   should be clear to begin the tasks in Stage 3. How do we ensure we're 
   properly gated here? There are some plugins out there that appear to do 
   just that, but they're either marked as deprecated or no longer supported.
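
To put a rough number on the waste described in issue 2, here is a small Python calculation of how much node time sits idle if every node has to wait for the slowest *prepare* before Stage 2 can start. The function name `idle_node_seconds` is made up for illustration, and the split of fast and slow machines in the usage note is an assumption, not a measurement; only the 50-second and 10-minute figures come from our farm.

```python
def idle_node_seconds(prepare_seconds):
    """Total node-seconds spent idle if all nodes wait for the slowest prepare.

    prepare_seconds: one prepare duration (in seconds) per node.
    """
    slowest = max(prepare_seconds)
    # Each node idles for the gap between its own prepare time and the slowest one.
    return sum(slowest - seconds for seconds in prepare_seconds)
```

For a hypothetical farm of 10 fast nodes (50 seconds each) and 50 slow nodes (600 seconds each), `idle_node_seconds([50] * 10 + [600] * 50)` gives 5500 node-seconds, all of it on the fastest machines.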

So how does everyone here use Jenkins? I consider our use case to be 
ridiculously common among large projects, but it seems almost like Jenkins 
wasn't built for the above kind of workflow. We've now started looking at 
plugins to provide the above functionality, but installing a large number 
of different plugins - all with different documentation standards and 
development histories, some of which seem to be abandoned - just to get 
things working seems so... fragile.

Are we doing this all wrong? Does Jenkins not support this kind of 
functionality out of the box?

I welcome your input and feedback,
-JJ
