[slurm-users] GraceTime is not working, But there is log.

2023-11-07 Thread 김형진
Hello, please help me. Total GPUs: 4. Large QoS: 3 (max 3 GPUs). Base QoS: 2 (max 2 GPUs). I have a total of four GPUs, and when a job with a large QoS is using three GPUs and a job with a base QoS is created, I want the large QoS job to wait for a certain period before the base QoS j

[slurm-users] Slurm release candidate version 23.11rc1 available for testing

2023-11-07 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate version 23.11.0rc1. To highlight some new features coming in 23.11: - Substantially overhauled the SlurmDBD association management code. For clusters updated to 23.11, account and user additions or removals are significant

Re: [slurm-users] Unable to submit job (ReqNodeNotAvail, UnavailableNodes)

2023-11-07 Thread JP Ebejer
On Tue, 7 Nov 2023 at 11:34, Diego Zuccato wrote: > On 07/11/2023 11:15, JP Ebejer wrote: > > but on running sinfo > > right after, the node is still "drained". > > That's not normal :( > Look at the slurmd log on the node for a reason. Probably the node > detects an error and sets itself to

Re: [slurm-users] Unable to submit job (ReqNodeNotAvail, UnavailableNodes)

2023-11-07 Thread Diego Zuccato
On 07/11/2023 11:15, JP Ebejer wrote: Hi there Diego, Thank you for your help. I had to use sudo to switch to the slurm user, as with myuser I got "slurm_update error: Invalid user id". Ok, that's normal. $ sudo -u slurm scontrol update nodename=compute-0 state=resume This works (

Re: [slurm-users] Unable to submit job (ReqNodeNotAvail, UnavailableNodes)

2023-11-07 Thread JP Ebejer
Hi there Diego, Thank you for your help. I had to use sudo to switch to the slurm user, as with myuser I got "slurm_update error: Invalid user id". $ sudo -u slurm scontrol update nodename=compute-0 state=resume This works (I think, as it returns no visual cue), but on running sinfo right af

Re: [slurm-users] Unable to submit job (ReqNodeNotAvail, UnavailableNodes)

2023-11-07 Thread Diego Zuccato
On 07/11/2023 10:12, JP Ebejer wrote: sinfo shows that the node is drained (but this node is idle and has no processing) $ sinfo --Node --long Tue Nov 07 08:29:51 2023 NODELIST   NODES  PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON compute-0        1 all_node

[slurm-users] Unable to submit job (ReqNodeNotAvail, UnavailableNodes)

2023-11-07 Thread JP Ebejer
Hi there, First of all, apologies for the rather verbose email. Newbie here, wanting to set up a minimal slurm cluster on Debian 12. I installed slurm-wlm (22.05.8) on the head node and slurmd (also 22.05.8) on the compute node via apt. I have one head, one compute node, and one partition. I ha