On 03-02-2022 16:37, Nathan Smith wrote:
> Yes, we are running slurmdbd. We could arrange enough downtime to do an
> incremental upgrade of major versions as Brian Andrus suggested, at
> least on the slurmctld and slurmdbd systems. The slurmds I would just do
> a direct upgrade once the scheduler work was completed.
As Brian Andrus said, you must upgrade Slurm by at most 2 major
versions at a time, and that includes the slurmds as well! Don't do a
"direct upgrade" of slurmd across more than 2 major versions!
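To make the 2-version rule concrete, here is a minimal sketch in Python that works out the intermediate stops between two releases. The release list is my own assumption from the Slurm release history (please check it against SchedMD's announcements), and the function name `upgrade_path` is purely illustrative; it is not part of Slurm.

```python
# Slurm major releases from 17.02 through 21.08, oldest first.
# Assumption: this list reflects the published release history; verify it.
RELEASES = ["17.02", "17.11", "18.08", "19.05", "20.02", "20.11", "21.08"]

# Slurm supports upgrading from at most this many major releases back.
MAX_JUMP = 2


def upgrade_path(start, target, releases=RELEASES, max_jump=MAX_JUMP):
    """Return the releases to install, in order, going from start to target.

    Each hop covers at most max_jump major releases.
    """
    i = releases.index(start)
    j = releases.index(target)
    if j < i:
        raise ValueError("target release is older than start release")
    path = [start]
    while i < j:
        # Jump as far as allowed, but never past the target.
        i = min(i + max_jump, j)
        path.append(releases[i])
    return path


if __name__ == "__main__":
    print(" -> ".join(upgrade_path("17.02", "21.08")))
    # prints: 17.02 -> 18.08 -> 20.02 -> 21.08
```

Under these assumptions the move from 17.02 to 21.08 takes three hops, and every daemon (slurmdbd, slurmctld and the slurmds) has to go through each of them.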
I recommend separate physical servers for slurmdbd and slurmctld. Then
you can upgrade slurmdbd without taking the cluster offline. It's OK
for slurmdbd to be down for many hours, since slurmctld caches the state
information in the meantime.
I've described the Slurm upgrade process in detail in my Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm
Since you start from 17.02, you have to be extremely cautious when
upgrading the database! See the Wiki page for details. Make sure to
test the database upgrade on a test server, using a database dump
instead of the real slurmdbd server.
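A rough sketch of such a rehearsal, written as a small Python helper that only assembles the shell commands and does not run them. The database name `slurm_acct_db` is the slurmdbd default and may differ at your site; the dump file name and the helper `rehearsal_plan` are hypothetical. Treat this as an outline under those assumptions, not as the exact procedure from the Wiki page.

```python
import shlex


def rehearsal_plan(db_name="slurm_acct_db", dump_file="slurm_acct_db.sql"):
    """Return the shell commands for a database upgrade rehearsal, in order."""
    q = shlex.quote
    return [
        # On the production slurmdbd host: dump the accounting database.
        # --databases makes the dump include CREATE DATABASE / USE statements.
        f"mysqldump --single-transaction --databases {q(db_name)} > {q(dump_file)}",
        # On the test server: load the dump into a scratch MySQL/MariaDB instance.
        f"mysql < {q(dump_file)}",
        # On the test server, with the next Slurm version installed: run
        # slurmdbd in the foreground with verbose logging so that the
        # conversion progress and any errors are visible.
        "slurmdbd -D -vvv",
    ]


if __name__ == "__main__":
    for step, command in enumerate(rehearsal_plan(), start=1):
        print(f"{step}. {command}")
```

Repeat the last step for every intermediate Slurm version on the upgrade path, and note how long each conversion takes so you know how much downtime to schedule for the real slurmdbd.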
I hope this helps.
/Ole
*From:* slurm-users <slurm-users-boun...@lists.schedmd.com> *On Behalf
Of* Brian Haymore
*Sent:* Wednesday, February 2, 2022 1:51 PM
*To:* slurm-us...@schedmd.com; Slurm User Community List
<slurm-users@lists.schedmd.com>
*Subject:* [EXTERNAL] Re: [slurm-users] Upgrade from 17.02.11 to 21.08.2
and state information
Are you running slurmdbd in your current setup? If you are then the
upgrade path there might have additional considerations moving this far
in versions.
--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C
------------------------------------------------------------------------
*From:* slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of
Nathan Smith <smina...@ohsu.edu>
*Sent:* Wednesday, February 2, 2022 2:38 PM
*To:* slurm-us...@schedmd.com
*Subject:* [slurm-users] Upgrade from 17.02.11 to 21.08.2 and state
information
The "Upgrades" section of the quick-start guide [0] warns:
> Slurm permits upgrades to a new major release from the past two major
> releases, which happen every nine months (e.g. 20.02.x or 20.11.x to
> 21.08.x) without loss of jobs or other state information. State
> information from older versions will not be recognized and will be
> discarded, resulting in loss of all running and pending jobs.
We are planning for an upgrade from 17.02.11 to 21.08.2. As a part of
our upgrade procedure we'd be bringing the scheduler to full stop, so
the loss of running and pending jobs would not be a concern. Is there
anything more to state information than running and pending jobs? For
example, would the JobID count revert to 1 in the case of such an
upgrade?
[0] https://slurm.schedmd.com/quickstart_admin.html#upgrade
--
Nathan Smith
Research Systems Engineer
Advanced Computing Center
Oregon Health & Science University