Thanks for the details, Christian. What is it about your 5-10 stubborn projects that makes them unique? I’ve been trying to think of the upgrade bugs I’ve come across. The biggest one I can recall was a problem a few years ago where, if you saved a job that used Ant, it would forget which Ant configuration was selected and the job could no longer build.
The platform we’re using will allow us to snapshot and restore disk images, so that should do for backups. I’ll also keep a copy of job configurations in source control, for cases where we want to restore an individual job instead of the entire disk.

I’ve also come around to Mike’s way of thinking: there’s not much need to limit ourselves to LTS releases if we can roll back a version upgrade easily. What I’m thinking of doing differently is running a new release plus the latest plugins through some comprehensive testing and internally blessing the releases that pass. We’ll upgrade our central platform (which is used by a lot of smaller teams and by teams that only need a basic setup), and allow teams that have their own masters on our service to upgrade when they want to. After a blessed release is installed we’ll run some shakedown tests, rather than the comprehensive suite, to make sure everything is working.

Any thoughts on that approach?

From: Christian Willman [mailto:cewi...@gmail.com]
Sent: Thursday, 6 March 2014 6:47 PM
To: jenkinsci-users@googlegroups.com
Cc: JOHNSTON, Rob
Subject: Re: Strategies to disk-risk a Jenkins upgrade

To start, our jobs are 99% the same. Most of our codebase can be built using one standard job, running a well-defined list of Maven commands. Realistically, only the job name, SCM URL, and maybe the timer or downstream job list vary.

So the first step in mitigating risk is to group similar build processes onto separate Jenkins masters. Then you can better test how a change will affect the instance as a whole.

As far as what to test, over time I've found a list of 5-10 very stubborn projects which require significantly more babysitting than the other thousands of jobs I manage. Identifying some similar jobs and cloning them to a test region is a good next step. Don't forget to pin the projects to a known good SCM revision to create a valid test environment.
I would highly recommend taking nightly backups of the main config.xml, and probably every other config file in the Jenkins home directory. It doesn't matter how you do it. VM snapshots are great, as Mike points out. If you're a CloudBees customer, their backup plugin is great. The Backup and thinBackup plugins are fine, but make sure you're archiving the backups to a shared drive. You don't want to rely on the server's filesystem as your recovery source. Personally, I use a Jenkins job to tar up the Jenkins home directory (with zero depth) and automatically deploy it into Nexus every night. It's free and works fine.

We have 14 Jenkins master installations (probably 10K jobs) -- it's pretty big. Of course, not all masters are created equal. Again, if you can split up your jobs into logically distinct Jenkins masters, then you can always roll out your changes first to low-volume, low-risk instances and finally to the high-volume, high-importance instances.

Finally, we don't use a package manager. We simply have a task to grab a specified Jenkins war from the WWW, deploy it to all 14 Jenkins master installations, and symlink the actual war name back to a version-agnostic name. A basic init.d script handles the rest. The cool thing is that, assuming an instance is idle, we can back out a bad Jenkins upgrade in about two minutes, and that's measuring from the time we receive an emergency page to the time we've restarted Jenkins using the old war. It's worth thinking about.

Please let me know if this helps, or if it spawns more questions. I love discussing this topic.

On Tuesday, March 4, 2014 11:53:59 PM UTC-5, JOHNSTON, Rob wrote:

Hi list

My group run a large Jenkins instance that many other teams in our organisation depend on. We want to automate the Jenkins upgrade process, which will include a measure of verification that the upgrade hasn’t broken anything.
I’m thinking of only using LTS releases, backing up the main config.xml and the configs for each job, and running test jobs that use the plugins we’re supporting.

My question is, what else are you doing to verify a Jenkins upgrade has been successful? What checks do you perform before letting an upgrade go into production?

Thanks
Rob Johnston

________________________________
This e-mail is sent by Suncorp Group Limited ABN 66 145 290 124 or one of its related entities "Suncorp". Suncorp may be contacted at Level 18, 36 Wickham Terrace, Brisbane or on 13 11 55 or at suncorp.com.au. The content of this e-mail is the view of the sender or stated author and does not necessarily reflect the view of Suncorp. The content, including attachments, is a confidential communication between Suncorp and the intended recipient. If you are not the intended recipient, any use, interference with, disclosure or copying of this e-mail, including attachments, is unauthorised and expressly prohibited. If you have received this e-mail in error please contact the sender immediately and delete the e-mail and any attachments from your system.
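For the config backups discussed in this thread (the main config.xml plus the config for each job), a minimal Python sketch might look like the following. It is an illustration only and not the actual nightly job Christian runs: the function name backup_configs and the choice to include every top-level *.xml file are assumptions, and copying the archive to a shared drive or Nexus is left out.

```python
from __future__ import annotations

import tarfile
from pathlib import Path


def backup_configs(jenkins_home: Path, archive: Path) -> list[str]:
    """Archive top-level *.xml files and each job's config.xml.

    Hypothetical helper: build history, workspaces and plugins are
    deliberately skipped so the archive stays small.
    Returns the archived member names relative to jenkins_home.
    """
    # Top-level config files (config.xml and tool/plugin settings),
    # followed by jobs/<name>/config.xml for every job.
    members = sorted(jenkins_home.glob("*.xml")) + sorted(
        jenkins_home.glob("jobs/*/config.xml")
    )
    names = [p.relative_to(jenkins_home).as_posix() for p in members]

    # Write the archive outside jenkins_home so it does not depend on
    # the filesystem being backed up.
    with tarfile.open(archive, "w:gz") as tar:
        for path, name in zip(members, names):
            tar.add(path, arcname=name, recursive=False)
    return names
```

Because the member names are relative to the Jenkins home directory, a single job can be restored by extracting only its jobs/<name>/config.xml entry.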
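The war swap Christian describes (versioned war files with a version-agnostic symlink pointing at the active one) could be sketched roughly as below. This is a guess at the general shape, not his actual task: the function name point_symlink, the default link name jenkins.war and the temporary link name are all assumptions, and downloading the war and restarting Jenkins are not shown.

```python
import os
from pathlib import Path


def point_symlink(war_dir: Path, versioned_war: str,
                  link_name: str = "jenkins.war") -> Path:
    """Repoint the version-agnostic link at a versioned war file.

    Hypothetical helper: both names are relative to war_dir. Calling it
    again with the previous war's file name backs out an upgrade.
    """
    target = war_dir / versioned_war
    if not target.is_file():
        raise FileNotFoundError(target)

    link = war_dir / link_name
    tmp = war_dir / (link_name + ".tmp")
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()

    # Create the new link under a temporary name, then rename it over
    # the old one. On POSIX the rename is atomic, so the link never
    # goes missing while it is being switched.
    os.symlink(versioned_war, tmp)
    os.replace(tmp, link)
    return link
```

Keeping the old war file on disk is what makes the back-out quick: rolling back is one more call with the old file name, followed by a restart through the init script.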