I use rpms for our installs here.  I usually pause all the jobs prior to the upgrade, then I follow the guide here:

https://slurm.schedmd.com/quickstart_admin.html


I haven't done the upgrade to 18.08 yet, though, so I haven't had to contend with the automatic daemon restart that the new rpm spec script apparently does (we went to 17.11 prior to the rpm spec reorg).  Frankly, I wish they didn't do the automatic restart, as I like to manage that myself.


As Chris said, though, you definitely want to do the slurmdbd upgrade from the command line.  I've had cases where, when just restarting the service, it times out and the database only gets partially updated, in which case I had to restore from the mysqldump I had made and try again.  I also highly recommend doing mysqldumps prior to major version updates.
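A minimal sketch of that backup step, assuming the default accounting database name "slurm_acct_db" and a "slurm" DB user (check StorageLoc/StorageUser in your slurmdbd.conf).  By default the function below only prints the commands rather than running them:

```shell
#!/bin/bash
# Sketch only: "slurm_acct_db", the "slurm" DB user, and the dump path
# are assumptions; adjust to match your slurmdbd.conf.
DB=slurm_acct_db
DUMP=/root/${DB}-$(date +%Y%m%d).sql

plan() {
    # Stop slurmdbd first so the dump is consistent with the daemon's state.
    echo "systemctl stop slurmdbd"
    # --single-transaction avoids locking the InnoDB tables during the dump.
    echo "mysqldump --single-transaction -u slurm -p ${DB} > ${DUMP}"
}

plan    # prints the commands; drop the echoes to actually execute them
```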


-Paul Edmon-


On 09/25/2018 09:54 AM, Baker D.J. wrote:

Thank you for your comments. I could potentially force the upgrade of the slurm and slurm-slurmdbd rpms using something like:


rpm -Uvh --noscripts --nodeps --force slurm-18.08.0-1.el7.x86_64.rpm slurm-slurmdbd-18.08.0-1.el7.x86_64.rpm

That will certainly work; however, the slurmctld (or, in the case of my test node, the slurmd) will be killed. The logic is that at v17.02 the slurm rpm provides both slurmctld and slurmd, so upgrading that rpm will remove the binaries and kill the existing slurmctld or slurmd processes. That is:

# rpm -q --whatprovides /usr/sbin/slurmctld
slurm-17.02.8-1.el7.x86_64

So if I force the upgrade of that rpm, then I delete /usr/sbin/slurmctld and kill the running daemon. In the new rpm structure, slurmctld is now provided by its own rpm.
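For what it's worth, under the reorganised spec each daemon gets its own sub-package, so the 18.08 upgrade would presumably pull in the new slurm-slurmctld (or slurm-slurmd) rpm alongside the base package. A sketch, with package file names assumed from a stock rpmbuild of 18.08.0 on EL7 (the function only prints the commands):

```shell
#!/bin/bash
# Sketch: file names assume a stock rpmbuild of 18.08.0 on EL7.
plan() {
    # Controller node: base package plus the new slurmctld sub-package.
    echo "rpm -Uvh slurm-18.08.0-1.el7.x86_64.rpm slurm-slurmctld-18.08.0-1.el7.x86_64.rpm"
    # Compute node: base package plus the new slurmd sub-package.
    echo "rpm -Uvh slurm-18.08.0-1.el7.x86_64.rpm slurm-slurmd-18.08.0-1.el7.x86_64.rpm"
}
plan    # prints the commands; drop the echoes to actually execute them
```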

I would have thought that someone would have crossed this bridge, but maybe most admins don't use rpms...

Best regards,
David

------------------------------------------------------------------------
*From:* slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Chris Samuel <ch...@csamuel.org>
*Sent:* 25 September 2018 13:00
*To:* slurm-users@lists.schedmd.com
*Subject:* Re: [slurm-users] Upgrading a slurm on a cluster, 17.02 --> 18.08
On Tuesday, 25 September 2018 9:41:10 PM AEST Baker D. J.  wrote:

> I guess that the only solution is to upgrade all the slurm at once. That
> means that the slurmctld will be killed (unless it has been stopped first).

We don't use the RPMs from Slurm [1], but the rpm command does have a --noscripts option to (allegedly; I've never used it) suppress the execution of the pre/post install scripts.

A big warning: do not use systemctl to start the new slurmdbd for the first time when upgrading!

Stop the older one first (and then take a database dump), then run the new slurmdbd process with the "-Dvvv" options (inside screen, just in case) so that you can watch its progress and systemd won't decide it's been taking too long to start and try to kill it part way through the database upgrades.

Once that's completed successfully, you can ^C it and start it up via systemctl once more.

Hope that's useful!

All the best,
Chris

[1] - I've always installed into ${shared_local_area}/slurm/${version} and had
a symlink called "latest" that points at the currently blessed version of
Slurm.  Then I stop slurmdbd, upgrade that as above, then I can do slurmctld
(with partitions marked down, just in case).  Once those are done I can
restart the slurmds around the cluster.
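A sketch of that switch-over, with /shared/slurm standing in for ${shared_local_area}/slurm and "batch" as a placeholder partition name (again, only printed, not executed):

```shell
#!/bin/bash
# Sketch: /shared/slurm and the "batch" partition are placeholders.
ROOT=/shared/slurm
NEW=18.08.0

plan() {
    echo "scontrol update PartitionName=batch State=DOWN"   # stop new jobs starting
    echo "ln -sfn ${ROOT}/${NEW} ${ROOT}/latest"            # repoint 'latest' at the new version
    echo "scontrol update PartitionName=batch State=UP"     # once the daemons are restarted
}
plan    # prints the commands; drop the echoes to actually execute them
```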

--
 Chris Samuel  : http://www.csamuel.org/ :  Melbourne, VIC




