I should also note that scontrol reboot works fine, but srun/salloc/sbatch hang.

Michael Heinz
End-to-End Network Software Engineer
michael.he...@intel.com

From: Heinz, Michael
Sent: Thursday, June 8, 2023 9:00 AM
To: slurm-users@lists.schedmd.com
Subject: RE: [slurm-users] Can't get --reboot to work at all with slurm-23.02?

Yup. RebootProgram is set to /sbin/reboot on all machines. It still just sits 
there and does nothing. Nothing in the logs on the compute node, nothing in the 
slurmctld.log on the head node.

Michael Heinz
End-to-End Network Software Engineer
michael.he...@intel.com

From: Brian Andrus <toomuc...@gmail.com>
Sent: Wednesday, June 7, 2023 12:10 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Can't get --reboot to work at all with slurm-23.02?


Make sure you have configured the RebootProgram in slurm.conf, that it exists 
on the nodes and is executable by the user.

This is usually /sbin/reboot
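
The check above can be sketched roughly as follows. This is a minimal Python sketch, assuming slurm.conf uses plain Key=Value lines with # comments; the helper names parse_reboot_program and is_runnable are hypothetical and not part of Slurm.

```python
import os
import re


def parse_reboot_program(conf_text):
    """Return the RebootProgram value from slurm.conf-style text, or None."""
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"RebootProgram\s*=\s*(.+)$", line, re.IGNORECASE)
        if match:
            return match.group(1).strip().strip('"')
    return None


def is_runnable(command):
    """True if the first token of command is an existing, executable file."""
    tokens = command.split() if command else []
    if not tokens:
        return False
    program = tokens[0]  # the value may carry arguments; test only the program
    return os.path.isfile(program) and os.access(program, os.X_OK)
```

To use it, read the local slurm.conf on each compute node (the location varies by install; /etc/slurm/slurm.conf is an assumed common default), pass the text to parse_reboot_program, and pass the result to is_runnable. The check only reflects the permissions of the account that runs it.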

Brian Andrus

On 6/7/2023 7:50 AM, Heinz, Michael wrote:
Hey, all.

So I added slurmdbd to our slurm-23.02 install and made my account an admin, 
but when I run srun with --reboot it just sits forever. No errors, nothing in 
the logs; the node stays in the “CF” state until I cancel the job, set the 
node to down, and then back to idle again.

I tried setting RebootProgram to a script that just writes to a file in /tmp 
but the program never runs.
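
For illustration, a stand-in of the kind described above might look like the following. This is a minimal sketch, not the script actually used; the function name record_invocation and the log file name reboot_program_test.log are hypothetical.

```python
#!/usr/bin/env python3
import datetime
import os
import tempfile


def record_invocation(log_path):
    """Append one timestamped line so each run of this script leaves a trace."""
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a") as log:
        log.write(f"{stamp} reboot stand-in invoked, uid={os.getuid()}\n")
    return log_path


if __name__ == "__main__":
    # Writes to the system temp directory (normally /tmp) instead of rebooting.
    record_invocation(os.path.join(tempfile.gettempdir(), "reboot_program_test.log"))
```

If the log file never appears after a job is submitted with --reboot, the configured program was never invoked on that node, which matches the behavior reported here.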

Any suggestions?

Michael Heinz
End-to-End Network Software Engineer
michael.he...@intel.com
