On 5/5/22 16:08, Mark Dixon wrote:
On Thu, 5 May 2022, Ole Holm Nielsen wrote:
...
That is correct.  Just do "scontrol reconfig" on the slurmctld server.  If
all your slurmd's are truly running Configless[1], they will pick up the
new config and reconfigure without restarting.

Details are summarized in
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#reconfiguration-of-slurm-conf.
Beware that you can't add or remove nodes without restarting.  Also,
changing certain slurm.conf parameters requires restarting.
...

However...

Given that the normal recommendation for adding/removing nodes safely is to:

* stop slurmctld
* edit slurm.conf etc.
* restart the slurmd nodes to pick up new slurm.conf
* start slurmctld

I'm confused how this is supposed to be achieved in a configless setting, as slurmctld isn't running to distribute the updated files to slurmd.

You're right, the correct order for Configless is probably:

* stop slurmctld
* edit slurm.conf etc.
* start slurmctld
* restart the slurmd nodes to pick up new slurm.conf
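
As a rough illustration of that ordering, here is a minimal Python sketch that only builds and prints the sequence of commands. It assumes the daemons are managed by systemd under the unit names slurmctld and slurmd, that the script runs on the slurmctld host, and that the compute nodes are reachable over ssh. The function names and the node names (node001, node002) are hypothetical, and nothing is executed unless dry_run is set to False.

```python
import shlex
import subprocess


def configless_restart_plan(nodes, ssh="ssh"):
    """Return ordered (description, command) steps for the Configless case.

    A command of None marks a manual step (editing slurm.conf by hand).
    The ssh transport and the systemd unit names are assumptions.
    """
    plan = [
        ("stop slurmctld", ["systemctl", "stop", "slurmctld"]),
        ("edit slurm.conf etc.", None),
        ("start slurmctld", ["systemctl", "start", "slurmctld"]),
    ]
    # slurmd is restarted only after slurmctld is back up, so that each
    # node can fetch the new configuration from the controller.
    for node in nodes:
        plan.append(
            (f"restart slurmd on {node}",
             [ssh, node, "systemctl", "restart", "slurmd"])
        )
    return plan


def run_plan(plan, dry_run=True):
    """Print each step; execute the commands only when dry_run is False."""
    for description, cmd in plan:
        if cmd is None:
            print(f"# manual step: {description}")
            if not dry_run:
                input("Press Enter when this step is done... ")
            continue
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical node names; dry run only prints the commands.
    run_plan(configless_restart_plan(["node001", "node002"]))
```

The point of keeping the plan as data is that the order can be checked before anything is run: the slurmd restarts always come after the slurmctld start.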

See also slides 29-34 in https://slurm.schedmd.com/SLUG21/Field_Notes_5.pdf from the Slurm publications site https://slurm.schedmd.com/publications.html

Less-Safe, but usually okay, procedure:
1. Change configs
2. Restart slurmctld
3. Restart all slurmd processes really quickly


/Ole

