Hi Bjørn-Helge,
On 3/10/25 08:50, Bjørn-Helge Mevik via slurm-users wrote:
>> The slurmctld can be restarted immediately after upgrading without
>> slurmdbd being available, and thereby your cluster will keep running
>> without any interruption of service. A little later you can enable
>> and start slurmdbd, and the delay of slurmdbd doesn't cause any
>> problems for slurmctld or the users. I emphasize that we're
>> discussing *minor release* upgrades only!
>> @Bjørn-Helge: Do you think there is good reason to start slurmdbd
>> before slurmctld when doing minor release upgrades?
>
> Not any more, it appears. But (unless my memory fails me again)
> earlier, slurmctld would refuse to start unless slurmdbd was running
> (when it was configured to use slurmdbd). Slurmctld would be fine with
> slurmdbd stopping while slurmctld was running, but upon start, it would
> require slurmdbd to be already running.

That's an interesting observation! I haven't tried this, and it would be
worth testing.
My guess is that recent versions of Slurm would have no problem starting
slurmctld when slurmdbd is down. The logic of "scontrol reconfig" was
changed in 23.11 (or was it in 23.02?) so that it restarts slurmctld. See
"Fixing 'scontrol reconfigure'" in
https://slurm.schedmd.com/SLUG23/roadmap-slug23.pdf
Since our site is already at 24.11.2, I wonder if someone can run this
test on older Slurm releases?
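
For anyone willing to try, the test could be scripted roughly as below. This is only a sketch under assumptions: the systemd unit names "slurmctld" and "slurmdbd" are the usual ones but may differ at your site, the function names are made up for illustration, and it should only be run on a test cluster. It defaults to a dry run that just prints the commands.

```python
import subprocess

# Assumed systemd unit names; adjust to your site's packaging.
SLURMDBD = "slurmdbd"
SLURMCTLD = "slurmctld"


def test_plan():
    """Return the commands for the test, in the order they should run."""
    return [
        ["systemctl", "stop", SLURMCTLD],
        ["systemctl", "stop", SLURMDBD],
        # The actual test: start slurmctld while slurmdbd is down.
        ["systemctl", "start", SLURMCTLD],
        ["systemctl", "is-active", SLURMCTLD],
        ["scontrol", "ping"],
        # Bring slurmdbd back afterwards.
        ["systemctl", "start", SLURMDBD],
    ]


def run_plan(plan, dry_run=True):
    """Run each command, or only print it when dry_run is True.

    Returns a list of (command, returncode) pairs; returncode is None
    for commands that were not executed.
    """
    results = []
    for cmd in plan:
        if dry_run:
            print("would run:", " ".join(cmd))
            results.append((cmd, None))
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results.append((cmd, proc.returncode))
    return results


if __name__ == "__main__":
    run_plan(test_plan(), dry_run=True)
```

A non-zero return code from "systemctl is-active slurmctld" or "scontrol ping" would suggest that slurmctld refused to start without slurmdbd. Since slurmctld might start and then exit a little later, it is probably worth repeating those two checks after a short wait and looking at the slurmctld log as well.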
Thanks,
Ole