[slurm-users] Re: Jobs not getting scheduled, no priority calculation, but still in queue?

2024-10-07 Thread Ole Holm Nielsen via slurm-users
On 10/7/24 12:28, Cutts, Tim wrote: I should be clear, the JobArrayTaskLimit isn’t the issue (the user’s submitted with %1, which is why we’re getting that). What I don’t understand is why the jobs remaining in the queue have no priority at all associated with them. It’s as though the schedul

[slurm-users] Re: Jobs not getting scheduled, no priority calculation, but still in queue?

2024-10-07 Thread Cutts, Tim via slurm-users
I should be clear, the JobArrayTaskLimit isn’t the issue (the user’s submitted with %1, which is why we’re getting that). What I don’t understand is why the jobs remaining in the queue have no priority at all associated with them. It’s as though the scheduler has forgotten the job array exists

[slurm-users] Randomly draining nodes

2024-10-07 Thread Nacereddine Laddaoui via slurm-users
Hello everyone, I’ve recently encountered an issue where some nodes in our cluster enter a drain state randomly, typically after completing long-running jobs. Below is the output from the sinfo command showing the reason “Prolog error”: root@controller-node:~# sinfo -R REASON USER TIME

[slurm-users] Re: Jobs not getting scheduled, no priority calculation, but still in queue?

2024-10-07 Thread Ole Holm Nielsen via slurm-users
Hi Tim, On 10/7/24 11:13, Cutts, Tim via slurm-users wrote: Something odd is going on on our cluster. User has a lot of pending jobs in a job array (a few thousand). squeue -u kmnx005 -r -t PD | head -5 JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)

[slurm-users] Jobs not getting scheduled, no priority calculation, but still in queue?

2024-10-07 Thread Cutts, Tim via slurm-users
Something odd is going on on our cluster. User has a lot of pending jobs in a job array (a few thousand). squeue -u kmnx005 -r -t PD | head -5 JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) 3045324_875 core run_scp_ kmnx005 PD 0:00 1

[slurm-users] Re: SLURM GRES reservation not working properly on 24.05.1

2024-10-07 Thread Minulakshmi S via slurm-users
Would appreciate any leads on the above query. Thanks in advance. On Fri, 20 Sept 2024 at 14:31, Minulakshmi S wrote: > Hello, > > Issue 1: > I am using slurm version 24.05.1, my slurmd has a single node where I > connect multiple gres by enabling the oversubscribe feature. > I am able to use th