Re: [slurm-users] Can't get --reboot to work at all with slurm-23.02?

2023-06-08 Thread Heinz, Michael
Yup. RebootProgram is set to /sbin/reboot on all machines. It still just sits 
there and does nothing. Nothing in the logs on the compute node, nothing in the 
slurmctld.log on the head node.

Michael Heinz
End-to-End Network Software Engineer
michael.he...@intel.com

From: Brian Andrus 
Sent: Wednesday, June 7, 2023 12:10 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Can't get --reboot to work at all with slurm-23.02?


Make sure you have configured the RebootProgram in slurm.conf, that it exists 
on the nodes and is executable by the user.

This is usually /sbin/reboot
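
A rough sketch of how one could check this on a node. The Python below is not from the thread, and the function names are made up for illustration. It pulls the RebootProgram value out of slurm.conf text and checks that the program it names is an executable file. It assumes a plain `Key=Value` line and ignores includes and other slurm.conf subtleties.

```python
import os


def find_reboot_program(conf_text):
    """Return the RebootProgram value from slurm.conf text, or None if unset.

    Hypothetical helper: handles simple Key=Value lines and '#' comments only.
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip().lower() == "rebootprogram":  # keys are case-insensitive
            value = val.strip().strip('"')
    return value


def is_runnable(value):
    """True if the first word of the value names an existing executable file."""
    if not value:
        return False
    program = value.split()[0]  # the value may carry arguments after the path
    return os.path.isfile(program) and os.access(program, os.X_OK)


# Example with an inline sample line; on a real node, read the text from
# wherever slurm.conf lives on that install (the location varies).
sample = "RebootProgram=/sbin/reboot\n"
prog = find_reboot_program(sample)
print(prog, is_runnable(prog))
```

Note that this only confirms the setting and the file permissions. It says nothing about whether slurmctld ever asks the node to run the program.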

Brian Andrus
On 6/7/2023 7:50 AM, Heinz, Michael wrote:
Hey, all.

So I added slurmdbd to our slurm-23.02 install and made my account an admin, 
but when I try to do an srun with --reboot it just sits forever. There are no 
errors and nothing in the logs; the node stays in “CF” state until I cancel 
the job, set the node to down and then back to idle again.

I tried setting RebootProgram to a script that just writes to a file in /tmp 
but the program never runs.

Any suggestions?

Michael Heinz
End-to-End Network Software Engineer
michael.he...@intel.com



Re: [slurm-users] Can't get --reboot to work at all with slurm-23.02?

2023-06-08 Thread Heinz, Michael
I should also note that scontrol reboot works fine, but srun/salloc/sbatch hang.

Michael Heinz
End-to-End Network Software Engineer
michael.he...@intel.com



[slurm-users] slurm restapi and multi-cluster

2023-06-08 Thread mohammed shambakey
Hi

Is it possible to connect slurm restapi queries to a
multi-cluster/federation? I guess each request uses one (and only one) JWT,
so it is not possible to do it, right?

Regards

-- 
Mohammed