On 5/5/22 16:08, Mark Dixon wrote:
On Thu, 5 May 2022, Ole Holm Nielsen wrote:
...
That is correct. Just do "scontrol reconfig" on the slurmctld server. If
all your slurmd's are truly running Configless[1], they will pick up the
new config and reconfigure without restarting.
Details are summarized in
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#reconfiguration-of-slurm-conf.
Beware that you can't add or remove nodes without restarting. Also,
changing certain slurm.conf parameters requires restarting.
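
For a parameter-only change, the reconfigure step could be sketched roughly as below. "scontrol reconfigure" is the full form of the "scontrol reconfig" command mentioned above. The DRY_RUN variable and the run and reconfigure_cluster helpers are hypothetical names introduced only for this illustration. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: push a parameter-only slurm.conf change in a Configless cluster.
# DRY_RUN is a hypothetical safety switch: 1 (default) prints, 0 executes.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reconfigure_cluster() {
    # Run on the slurmctld server after slurm.conf has been edited there.
    run scontrol reconfigure
}

reconfigure_cluster
```

Run it with DRY_RUN=0 on the slurmctld server to actually issue the command. This does not cover adding or removing nodes.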
...
However...
Given that the normal recommendation for adding/removing nodes safely is to:
* stop slurmctld
* edit slurm.conf etc.
* restart slurmd on the nodes to pick up the new slurm.conf
* start slurmctld
I'm confused how this is supposed to be achieved in a configless setting,
as slurmctld isn't running to distribute the updated files to slurmd.
You're right, the correct order for Configless is probably:
* stop slurmctld
* edit slurm.conf etc.
* start slurmctld
* restart slurmd on the nodes to pick up the new slurm.conf
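
The Configless order above might be scripted along these lines. This is only a sketch under several assumptions that are not from the thread: the daemons are managed by systemd under the unit names slurmctld and slurmd, the compute nodes are reachable over ssh, and the node names in NODES are placeholders. DRY_RUN, run and configless_node_change are hypothetical names.

```shell
#!/bin/sh
# Sketch: add/remove nodes in a Configless cluster, run on the slurmctld server.
# DRY_RUN is a hypothetical safety switch: 1 (default) prints, 0 executes.
DRY_RUN="${DRY_RUN:-1}"
# Placeholder node names; replace with your own compute nodes.
NODES="${NODES:-node001 node002}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

configless_node_change() {
    # 1. Stop the controller.
    run systemctl stop slurmctld

    # 2. Edit slurm.conf etc. by hand while the controller is down.
    echo "# edit slurm.conf etc. by hand"
    if [ "$DRY_RUN" != "1" ]; then
        printf 'Press Enter when the edit is finished: '
        read -r reply
    fi

    # 3. Start the controller so it can serve the new config.
    run systemctl start slurmctld

    # 4. Restart slurmd on each node so it fetches the new slurm.conf.
    for node in $NODES; do
        run ssh "$node" systemctl restart slurmd
    done
}

configless_node_change
```

The point of the ordering is that slurmctld is already running again when the slurmd daemons restart, so they have a server to fetch the updated files from.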
See also slides 29-34 in
https://slurm.schedmd.com/SLUG21/Field_Notes_5.pdf from the Slurm
publications site https://slurm.schedmd.com/publications.html
Less-Safe, but usually okay, procedure:
1. Change configs
2. Restart slurmctld
3. Restart all slurmd processes really quickly
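
One way to do step 3 "really quickly" is to restart the slurmd daemons in parallel rather than one node at a time. The sketch below assumes the configs have already been changed (step 1), that the daemons are systemd units named slurmctld and slurmd, and that the nodes are reachable over ssh. None of that is stated in the thread. NODES holds placeholder names, and DRY_RUN, run and quick_restart are hypothetical.

```shell
#!/bin/sh
# Sketch: the less-safe procedure, with slurmd restarts issued in parallel.
# DRY_RUN is a hypothetical safety switch: 1 (default) prints, 0 executes.
DRY_RUN="${DRY_RUN:-1}"
# Placeholder node names; replace with your own compute nodes.
NODES="${NODES:-node001 node002 node003}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

quick_restart() {
    # 2. Restart the controller with the already-edited config.
    run systemctl restart slurmctld

    # 3. Restart every slurmd at once; -n keeps ssh from reading stdin.
    for node in $NODES; do
        run ssh -n "$node" systemctl restart slurmd &
    done

    # Block until all background restarts have returned.
    wait
}

quick_restart
```

On a large cluster a parallel shell tool would do the same job. The background ssh loop is used here only to keep the example self-contained.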
/Ole