Do you have backfill scheduling [1] enabled? If so, what settings are in place?

Lower-priority jobs are eligible for backfill only if they don’t delay the 
start of the higher-priority jobs.
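
To make that rule concrete, here is a toy sketch of the check in Python. It is 
not Slurm's actual algorithm, only a simplified model of it: the Job class, the 
can_backfill function, and all of its parameters are made up for illustration, 
and times are plain minutes.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes_needed: int  # number of nodes requested
    time_limit: int    # requested wall time, in minutes


def can_backfill(candidate, idle_nodes, now, reserved_start, reserved_idle_nodes):
    """Toy check: may `candidate` start now without delaying a reservation?

    reserved_start      -- planned start (minutes) of the higher-priority job
    reserved_idle_nodes -- how many of the currently idle nodes that job
                           is counting on at its planned start
    """
    # The candidate must fit on what is idle right now.
    if candidate.nodes_needed > idle_nodes:
        return False
    # A job that ends before the reservation begins cannot delay it.
    if now + candidate.time_limit <= reserved_start:
        return True
    # Otherwise it may only use idle nodes the reservation does not need.
    spare = idle_nodes - reserved_idle_nodes
    return candidate.nodes_needed <= spare
```

In this model a job with a long time limit is refused whenever it would still 
be running on nodes the higher-priority job expects to use, which is why the 
time request matters so much.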

So what resources and time limit does each array job request? Odds are, the 
time request conflicts with the scheduled start time of the high-priority 
jobs.

[1] https://slurm.schedmd.com/sched_config.html#backfill

From: Long, Daniel S. <daniel.l...@gtri.gatech.edu>
Date: Tuesday, September 24, 2024 at 1:20 PM
To: Renfro, Michael <ren...@tntech.edu>, slurm-us...@schedmd.com 
<slurm-us...@schedmd.com>
Subject: Re: Jobs pending with reason "priority" but nodes are idle

I experimented a bit and think I have figured out the problem but not the 
solution.

We use multifactor priority with the job account as the primary factor. Right 
now one project has much higher priority due to a deadline. Its jobs are the 
ones pending with “Resources”. They cannot run on the idle nodes because those 
nodes do not satisfy their resource requirements (the nodes don’t have GPUs). 
What I don’t understand is why Slurm doesn’t schedule the lower-priority jobs 
onto those nodes, since those jobs don’t require GPUs. That is very unexpected 
behavior to me. Is there an option somewhere I need to set?


From: "Renfro, Michael" <ren...@tntech.edu>
Date: Tuesday, September 24, 2024 at 1:54 PM
To: Daniel Long <daniel.l...@gtri.gatech.edu>, "slurm-us...@schedmd.com" 
<slurm-us...@schedmd.com>
Subject: Re: Jobs pending with reason "priority" but nodes are idle

In theory, if jobs are pending with “Priority”, one or more other jobs will be 
pending with “Resources”.

So a few questions:


  1.  What are the “Resources” jobs waiting on, resource-wise?
  2.  When are they scheduled to start?
  3.  Can your array jobs backfill into the idle resources and finish before 
the “Resources” jobs are scheduled to start?
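
One way to gather the answers to the first two questions is to list the pending 
jobs with their reason and expected start time. The sketch below assumes squeue 
is on the PATH; the helper names (parse_pending, pending_by_reason, 
fetch_pending) are made up for illustration. The start column may read N/A when 
the scheduler has no estimate for a job.

```python
import subprocess

# squeue options: -h (no header), -t PD (pending jobs only), -o (output format).
# Format fields: %i = job ID, %r = pending reason, %S = expected start time.
SQUEUE_CMD = ["squeue", "-h", "-t", "PD", "-o", "%i|%r|%S"]


def parse_pending(text):
    """Turn 'jobid|reason|start' lines into a list of dicts."""
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, reason, start = line.split("|", 2)
        jobs.append({"job_id": job_id, "reason": reason, "start": start})
    return jobs


def pending_by_reason(jobs):
    """Group parsed jobs by pending reason, e.g. 'Resources' or 'Priority'."""
    grouped = {}
    for job in jobs:
        grouped.setdefault(job["reason"], []).append(job)
    return grouped


def fetch_pending():
    """Run squeue and parse its output (requires a working Slurm client)."""
    result = subprocess.run(SQUEUE_CMD, capture_output=True, text=True, check=True)
    return parse_pending(result.stdout)
```

Calling pending_by_reason(fetch_pending()) on a login node would then show 
which jobs are waiting on “Resources”, when they expect to start, and which 
jobs are stuck behind them with “Priority”.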

From: Long, Daniel S. via slurm-users <slurm-users@lists.schedmd.com>
Date: Tuesday, September 24, 2024 at 11:47 AM
To: slurm-us...@schedmd.com <slurm-us...@schedmd.com>
Subject: [slurm-users] Jobs pending with reason "priority" but nodes are idle

Hi,

On our cluster we have some jobs that are queued even though there are 
available nodes to run on. The listed reason is “priority” but that doesn’t 
really make sense to me. Slurm isn’t picking another job to run on those nodes; 
it’s just not running anything at all. We do have a quite heterogeneous 
cluster, but as far as I can tell the queued jobs aren’t requesting anything 
that would preclude them from running on the idle nodes. They are array jobs, 
if that makes a difference.

Thanks for any help you all can provide.
