I use RPMs for our installs here. I usually pause all the jobs prior
to the upgrade, then I follow the guide here:
https://slurm.schedmd.com/quickstart_admin.html
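To pause the jobs I do something along these lines (my own rough
illustration, not a one-size-fits-all recipe): hold anything still
pending and suspend what's running, e.g.

squeue -h -t PD -o %i | xargs -r scontrol hold
squeue -h -t R -o %i | xargs -r scontrol suspend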
I haven't done the upgrade to 18.08 yet, though, so I haven't had to
contend with the automatic restart that seems to be the case with the
new rpm spec script (we went to 17.11 prior to the rpm spec reorg).
Frankly, I wish they didn't do the automatic restart, as I like to
manage that myself.
As Chris said, though, you definitely want to do the slurmdbd upgrade
from the command line. I've had it happen that, when just restarting the
service, it times out and the database only gets partially updated. In
that case I had to restore from the mysqldump I had made and try again.
I also highly recommend doing mysqldumps prior to major version updates.
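Something like this, assuming the default accounting database name of
slurm_acct_db (substitute your own DB name and credentials):

mysqldump --single-transaction slurm_acct_db > slurm_acct_db_$(date +%F).sql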
-Paul Edmon-
On 09/25/2018 09:54 AM, Baker D.J. wrote:
Thank you for your comments. I could potentially force the upgrade of
the slurm and slurm-slurmdbd rpms using something like:
rpm -Uvh --noscripts --nodeps --force slurm-18.08.0-1.el7.x86_64.rpm \
    slurm-slurmdbd-18.08.0-1.el7.x86_64.rpm
That will certainly work; however, the slurmctld (or, in the case of my
test node, the slurmd) will be killed. The logic is that at v17.02 the
slurm rpm provides slurmctld and slurmd, so upgrading that rpm will
destroy/kill the existing slurmctld or slurmd processes. That is...
# rpm -q --whatprovides /usr/sbin/slurmctld
slurm-17.02.8-1.el7.x86_64
So if I force the upgrade of that rpm then I delete and kill
/usr/sbin/slurmctld. In the new rpm structure slurmctld is now
provided by its own rpm.
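After a successful upgrade I'd expect that same query to point at the
new daemon package instead, i.e. something like (the exact version
string here is just my guess):

# rpm -q --whatprovides /usr/sbin/slurmctld
slurm-slurmctld-18.08.0-1.el7.x86_64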
I would have thought that someone would have crossed this bridge
before, but maybe most admins don't use RPMs...
Best regards,
David
------------------------------------------------------------------------
*From:* slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf
of Chris Samuel <ch...@csamuel.org>
*Sent:* 25 September 2018 13:00
*To:* slurm-users@lists.schedmd.com
*Subject:* Re: [slurm-users] Upgrading a slurm on a cluster, 17.02 -->
18.08
On Tuesday, 25 September 2018 9:41:10 PM AEST Baker D. J. wrote:
> I guess that the only solution is to upgrade all the slurm at once. That
> means that the slurmctld will be killed (unless it has been stopped first).
We don't use RPMs from Slurm [1], but the rpm command does have a
--noscripts option to (allegedly, I've never used it) suppress the
execution of pre/post install scripts.
A big warning: do not use systemctl to start the new slurmdbd for the
first time when upgrading!
Stop the older one first (and then take a database dump), then run the
new slurmdbd process with the "-Dvvv" options (inside screen, just in
case) so that you can watch its progress and systemd won't decide it's
been taking too long to start and try to kill it part way through the
database upgrades.
Once that's completed successfully you can ^C it and start it up via
systemctl once more.
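In other words, something along these lines (the dump command and
database name are illustrative, adjust for your site):

systemctl stop slurmdbd
mysqldump --single-transaction slurm_acct_db > slurm_acct_db_backup.sql
(install the new version, then run the daemon in the foreground)
slurmdbd -Dvvv
(watch the schema conversion finish, then ^C and hand back to systemd)
systemctl start slurmdbd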
Hope that's useful!
All the best,
Chris
[1] - I've always installed into ${shared_local_area}/slurm/${version}
and had a symlink called "latest" that points at the currently blessed
version of Slurm. Then I stop slurmdbd, upgrade that as above, then I
can do slurmctld (with partitions marked down, just in case). Once those
are done I can restart the slurmd's around the cluster.
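The switch itself is then just a symlink flip, something like (the
18.08.0 directory name is only an example):

./configure --prefix=${shared_local_area}/slurm/18.08.0 && make && make install
ln -sfn ${shared_local_area}/slurm/18.08.0 ${shared_local_area}/slurm/latest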
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC