You cannot change the nodelist without draining the system of running jobs
(terminating all slurmstepd processes) and restarting every slurmd and
slurmctld. This is because Slurm uses a bitmask to represent the nodelist,
and it builds a hierarchical overlay network for daemon communication. If
the daemons disagree about the shape of that network, you can run into
communication problems that cause nodes to be marked DOWN, killing the jobs
running on them.

If you are not using message aggregation, I think you might be able to get
away with leaving jobs running and just restarting all slurmd and
slurmctld. The tricky part is that you'll need to quiesce most of the RPCs
on the system; marking the partitions DOWN gets you part of the way there,
but not all of it.
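
Roughly, the partition-DOWN part of that could look like the following
sketch. The partition name "batch" and the systemd unit names are
assumptions for illustration; adjust to your site. This only blocks new
job starts, not all RPC traffic:

```shell
# Stop new jobs from starting in the partition (running jobs continue)
scontrol update PartitionName=batch State=DOWN

# ... edit slurm.conf on all nodes, then restart the daemons ...
systemctl restart slurmctld        # on the controller
systemctl restart slurmd           # on every compute node

# Reopen the partition once everything is back up
scontrol update PartitionName=batch State=UP
```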

If you are thinking of adding nodes, I think you should look at the FUTURE
state that nodes can take. I haven't played with it, but I suspect it
might buy you some flexibility.
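
As a hypothetical slurm.conf sketch (node names and CPU counts are made
up), the idea is to pre-declare spare nodes with State=FUTURE so the
nodelist bitmask already has slots for them before the hardware exists:

```conf
# Nodes that exist today
NodeName=node[01-16] CPUs=16 State=UNKNOWN
# Placeholder slots for hardware you plan to add later
NodeName=node[17-32] CPUs=16 State=FUTURE
PartitionName=batch Nodes=node[01-32] Default=YES State=UP
```

FUTURE nodes are hidden from normal view and not scheduled; I believe they
can be brought into service later without the disruptive nodelist change
described above, but check the slurm.conf man page for your version.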

On Oct 22, 2017 11:43, "JinSung Kang" <[email protected]> wrote:

> Hello,
>
> I am having trouble adding new nodes to a Slurm cluster without killing
> the jobs that are currently running.
>
> Right now I
>
> 1. Update slurm.conf to add the new node
> 2. Copy the new slurm.conf to all the nodes
> 3. Restart slurmd on all nodes
> 4. Restart slurmctld
>
> But when I restart slurmctld, all of the jobs that were running are
> requeued, with (Begin Time) as the reason for not running. The newly
> added node works perfectly fine.
>
> I've included the slurm.conf, along with the slurmctld.log output from
> when I try to add the new node.
>
> Cheers,
>
> Jin
>