[slurm-users] Can't get --reboot to work at all with slurm-23.02?

2023-06-07 Thread Heinz, Michael
Hey, all.

So I added slurmdbd to our slurm-23.02 install and made my account an admin, 
but when I try to do an srun with --reboot it literally just sits forever. No 
errors, nothing in the logs; the node just stays in "CF" state until I 
cancel the job and set the node to down and back to idle again.

I tried setting RebootProgram to a script that just writes to a file in /tmp 
but the program never runs.

Any suggestions?

Michael Heinz
End-to-End Network Software Engineer
michael.he...@intel.com



Re: [slurm-users] Can't get --reboot to work at all with slurm-23.02?

2023-06-07 Thread Brian Andrus
Make sure RebootProgram is configured in slurm.conf, and that the program 
exists on the nodes and is executable by the user.


This is usually /sbin/reboot
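
A rough way to check those three things on a compute node might look like the sketch below. The function name check_reboot_program and the default path /etc/slurm/slurm.conf are assumptions made up for illustration, not anything Slurm ships; point it at wherever your slurm.conf actually lives. Note that it only tests executability for whoever runs the script.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): checks that RebootProgram is set
# in a slurm.conf-style file and that the program exists and is executable
# on the node where this script is run.
check_reboot_program() {
    # Default config path is an assumption; pass your real one as $1.
    conf="${1:-/etc/slurm/slurm.conf}"

    # Take the last uncommented RebootProgram= line (case-insensitive),
    # then strip any trailing comment and surrounding whitespace.
    prog=$(grep -i '^[[:space:]]*RebootProgram[[:space:]]*=' "$conf" 2>/dev/null \
        | tail -n 1 \
        | cut -d= -f2- \
        | sed 's/[[:space:]]*#.*$//; s/^[[:space:]]*//; s/[[:space:]]*$//')

    if [ -z "$prog" ]; then
        echo "RebootProgram is not set in $conf"
        return 1
    fi

    # If the value carries arguments, only test the first word.
    exe=${prog%% *}
    if [ ! -x "$exe" ]; then
        echo "RebootProgram $exe is missing or not executable on this node"
        return 2
    fi

    echo "RebootProgram OK: $prog"
    return 0
}
```

Calling check_reboot_program with no argument on each node (or with the path to your slurm.conf) prints which of the three conditions fails, if any. It says nothing about whether the daemons have picked up the setting.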

Brian Andrus
