A colleague of mine has it scripted out quite well, so I can't speak to *all* of the details. However, we have a dedicated user that we submit the upgrade jobs as, and those jobs run the upgrade steps (yum, dnf, etc.). The jobs are whole-node/exclusive, so nothing else can run on a node while it is being upgraded, and a few other steps might be taken (node reboots, etc.). I think we also have some level of reservation in there so nodes can drain, which helps expedite things a bit, though how long it takes still depends on your longest-running job.
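To make the idea concrete, here is a rough sketch of how one exclusive upgrade job per node could be generated. It is not our actual script: the node names and the `upgrade_node.sh` script are made-up placeholders, and it only prints the `sbatch` command lines rather than submitting anything.

```python
import shlex


def build_upgrade_jobs(nodes, script="upgrade_node.sh", job_name="slurm-upgrade"):
    """Return one sbatch argv list per node; nothing is submitted here.

    `script` is a hypothetical batch script that would run the package
    upgrade (yum/dnf, etc.) on the node it lands on.
    """
    jobs = []
    for node in nodes:
        jobs.append([
            "sbatch",
            "--exclusive",                    # whole node, nothing else runs alongside
            "--nodes=1",
            f"--nodelist={node}",             # pin the job to this specific node
            f"--job-name={job_name}-{node}",
            script,
        ])
    return jobs


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    for argv in build_upgrade_jobs(["node001", "node002"]):
        print(shlex.join(argv))
```

Each job waits in the queue until its node is free, so the nodes get upgraded one by one as running work finishes, which is where the rolling behaviour comes from.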
This has worked well for point releases/patches and effectively behaves like a rolling upgrade. Yours might even be easier/quicker since it's symlinks (which is SchedMD's preferred method, iirc). Speaking of which, I believe one of the SchedMD folks gave some pointers on that in the past, perhaps in a presentation at SLUG, so you could peruse those as well.

On Thu, Sep 28, 2023 at 12:04 PM Groner, Rob <rug...@psu.edu> wrote:

> There's 14 steps to upgrading slurm listed on their website, including
> shutting down and backing up the database. So far we've only updated slurm
> during a downtime, and it's been a major version change, so we've taken all
> the steps indicated.
>
> We now want to upgrade from 23.02.4 to 23.02.5.
>
> Our slurm builds end up in version named directories, and we tell
> production which one to use via symlink. Changing the symlink will
> automatically change it on our slurm controller node and all slurmd nodes.
>
> Is there an expedited, simple, slimmed down upgrade path to follow if
> we're looking at just a . level upgrade?
>
> Rob

--
David Rhey
---------------
Advanced Research Computing
University of Michigan