That should work. For us, though, upgrading that way will have to wait
until the dbd is actually on a different server. We run the ctld and dbd
on the same machine here for the sake of performance. Before the RPM
reorg we used to upgrade only the dbd first and then the ctld, but with
the reorg I'm taking a downtime for the dbd upgrade. That's not too bad,
though, as we pause all our jobs out of paranoia for upgrades.
-Paul Edmon-
On 3/1/19 8:10 AM, Ole Holm Nielsen wrote:
We're one of the many Slurm sites which run the slurmdbd database
daemon on the same server as the slurmctld daemon. This works without
problems at our site given our modest load; however, SchedMD
recommends running the daemons on separate servers.
Contemplating how to upgrade our cluster from Slurm 17.11 to 18.08,
I've come to appreciate the advantage of running the daemons on
separate servers: One can upgrade slurmdbd to 18.08 while keeping
slurmctld at 17.11 (for a while at least). This enables us to upgrade
to 18.08 in the recommended order without any interruption to our
running jobs and without any cluster downtime.
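The ordering constraint above (slurmdbd is upgraded first, so it may be
newer than slurmctld but not older) can be written down as a small
check. This is only a sketch: the function names are hypothetical, the
version strings are assumed to look like "18.08" or "18.08.5", and it
checks nothing beyond the relative order of the two releases.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, int]:
    """Turn a version string such as '18.08' or '18.08.5' into (18, 8)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def dbd_first_order_ok(slurmdbd: str, slurmctld: str) -> bool:
    """True if slurmdbd is at the same release as slurmctld or a newer one."""
    return parse_release(slurmdbd) >= parse_release(slurmctld)


if __name__ == "__main__":
    # The situation described above: slurmdbd already at 18.08,
    # slurmctld still at 17.11.
    print(dbd_first_order_ok("18.08", "17.11"))
    # The reverse combination is the one to avoid.
    print(dbd_first_order_ok("17.11", "18.08"))
```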
I've been collecting various pieces of information about Slurm
upgrades and I've come up with a tested procedure for migrating the
slurmdbd service (on a CentOS/RHEL 7 system) to a new server:
https://wiki.fysik.dtu.dk/niflheim/Slurm_database#migrate-the-slurmdbd-service-to-another-server
The basic idea is that slurmctld continues happily while slurmdbd is
down, so you can migrate the MySQL database and slurmdbd behind the
scenes. When the new slurmdbd server is up and running, you
reconfigure slurm.conf on the cluster.
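As an illustration of that last reconfiguration step, here is a minimal
sketch that rewrites the AccountingStorageHost line in the text of a
slurm.conf. It assumes your slurm.conf names the slurmdbd server with
that parameter; the function name, the host names and the sample
contents are made up for the example, and the full tested procedure is
the one on the Wiki page.

```python
import re


def set_accounting_storage_host(conf_text: str, new_host: str) -> str:
    """Return conf_text with AccountingStorageHost set to new_host.

    Commented-out lines (starting with '#') are left untouched.
    """
    pattern = re.compile(r"^([ \t]*AccountingStorageHost[ \t]*=[ \t]*)\S+",
                         re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + new_host, conf_text)
    if count == 0:
        raise ValueError("no active AccountingStorageHost line found")
    return new_text


if __name__ == "__main__":
    # Hypothetical slurm.conf fragment; 'oldserver' and 'newdbd' are made up.
    sample = (
        "#AccountingStorageHost=example\n"
        "AccountingStorageType=accounting_storage/slurmdbd\n"
        "AccountingStorageHost=oldserver\n"
    )
    print(set_accounting_storage_host(sample, "newdbd"))
```

The sketch only edits text. Distributing the changed slurm.conf to the
cluster and making the daemons pick it up are separate steps.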
Upgrading slurmctld and slurmd is another topic, which is discussed
on my Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm
I'd appreciate comments and suggestions about my procedure.
/Ole