To start, our jobs are 99% the same. Most of our codebase can be built with one standard job that runs a well-defined list of Maven commands. Realistically, only the job name, SCM URL, and maybe the timer or downstream job list vary. So the first step in mitigating risk is to group like build processes onto separate Jenkins masters. Then you can better test how a change will affect the instance as a whole.

As for what to test, over time I've found a list of 5-10 very stubborn projects which require significantly more babysitting than the other thousands of jobs I manage. Identifying similar jobs on your side and cloning them to a test region is a good next step. Don't forget to pin the projects to a known good SCM revision to create a valid test environment.

I would highly recommend taking nightly backups of the main config.xml, and probably every other config file in the Jenkins home directory. It doesn't matter how you do it. VM snapshots are great, as Mike points out. If you're a CloudBees customer, their backup plugin is great. The Backup and thinBackup plugins are fine, but make sure you're archiving the backups to a shared drive. You don't want to rely on the server's filesystem as your recovery source. Personally, I use a Jenkins job to tar up the Jenkins home directory (with zero depth) and automatically deploy it into Nexus every night. It's free and works fine.

We have 14 Jenkins master installations (probably 10K jobs), so it's pretty big. Of course, not all masters are created equal. Again, if you can split up your jobs into logically distinct Jenkins masters, then you can always roll out your changes first to low-volume, low-risk instances and finally to the high-volume, high-importance instances.

Finally, we don't use a package manager. We simply have a task to grab a specified Jenkins war from the web, deploy it to all 14 Jenkins master installations, and symlink the actual war name back to a version-agnostic name. A basic init.d script handles the rest. The cool thing is that, assuming an instance is idle, we can back out a bad Jenkins upgrade in about two minutes, and that's measuring from the time we receive an emergency page to the time we've restarted Jenkins using the old war. It's worth thinking about.

Please let me know if this helps, or if it spawns more questions. I love discussing this topic.
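
For what it's worth, here is a minimal Python sketch of the nightly backup and the war swap described above. It is not our actual job or task: the function names, the `.xml`-only filter, and the `jenkins.war` link name are all assumptions made for illustration, and the Nexus upload and the init.d restart are left out.

```python
import os
import tarfile
from pathlib import Path


def backup_top_level_configs(jenkins_home: Path, archive: Path) -> list:
    """Tar up only the XML files sitting directly in jenkins_home (depth zero).

    Assumption: "config files" means top-level *.xml; adjust the filter to taste.
    """
    names = []
    with tarfile.open(archive, "w:gz") as tar:
        for entry in sorted(jenkins_home.iterdir()):
            # Skip directories (jobs/, plugins/, ...) and non-XML files.
            if entry.is_file() and entry.suffix == ".xml":
                tar.add(entry, arcname=entry.name)
                names.append(entry.name)
    return names


def activate_war(install_dir: Path, war_name: str,
                 link_name: str = "jenkins.war") -> Path:
    """Point the version-agnostic link at war_name, replacing any old link."""
    target = install_dir / war_name
    if not target.is_file():
        raise FileNotFoundError(target)
    link = install_dir / link_name
    tmp_link = install_dir / (link_name + ".tmp")
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()
    # Relative target, so the link still resolves if install_dir is moved.
    os.symlink(war_name, tmp_link)
    # Renaming over the existing link swaps it in a single step on POSIX.
    os.replace(tmp_link, link)
    return link
```

Backing out an upgrade is then just `activate_war(install_dir, "<old war file name>")` followed by a restart, which is why keeping the previous war next to the new one matters.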

On Tuesday, March 4, 2014 11:53:59 PM UTC-5, JOHNSTON, Rob wrote:
> Hi list
>
> My group run a large Jenkins instance that many other teams in our
> organisation depend on. We want to automate the Jenkins upgrade process
> which will include a measure of verification that the upgrade hasn’t broken
> anything.
>
> I’m thinking of only using LTS releases, backing up the main config.xml
> and the configs for each job, and running test jobs that use the plugins
> we’re supporting.
>
> My question is, what else are you doing to verify a Jenkins upgrade has
> been successful? What checks do you perform before letting an upgrade go
> into production?
>
> Thanks
>
> Rob Johnston