Hello Chris,

Thank you for your comments. The scontrol reboot command is now working as 
expected.

Best regards,
David

________________________________
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of 
Christopher Samuel <ch...@csamuel.org>
Sent: 16 June 2020 18:16
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Nodes do not return to service after scontrol reboot

On 6/16/20 8:16 am, David Baker wrote:

> We are running Slurm v19.05.5 and I am experimenting with the scontrol
> reboot command. I find that compute nodes reboot, but they are not
> returned to service. Rather, they remain down following the reboot.

How are you using "scontrol reboot" ?

We do:

scontrol reboot ASAP nextstate=resume reason=$REASON $NODE

Which works for us (and we have health checks in our epilog that can
trigger this for known issues like running low on unfragmented huge pages).
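
For anyone wanting to script this, here is a minimal sketch of a wrapper around
the command above. The function name reboot_node and the DRY_RUN variable are
hypothetical, made up for illustration; only the scontrol invocation itself
comes from the command shown above. With DRY_RUN=1 it prints the command
instead of running it, which may be handy for checking the arguments first.

```shell
#!/bin/sh
# Sketch only: reboot_node and DRY_RUN are hypothetical names, not part of Slurm.
# Usage: reboot_node NODE REASON
reboot_node() {
    node=$1
    reason=$2
    # Same form as the command above: ASAP drains the node so it reboots
    # once it is idle, and nextstate=resume asks for it to be returned to
    # service after the reboot.
    set -- scontrol reboot ASAP nextstate=resume "reason=$reason" "$node"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command rather than executing it.
        echo "$@"
    else
        "$@"
    fi
}
```

For example, with DRY_RUN=1 set, calling reboot_node node001 "low hugepages"
(node001 being a placeholder node name) only prints the scontrol command line.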

All the best,
Chris
--
   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
