(sorry, kind of fell asleep on you there...) I wouldn't expect backfill to be a problem, since it shouldn't start jobs that won't complete before the priority reservations begin. We allow jobs to run over their time limit (OverTimeLimit), though, so in our case it can be a problem.
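For context, that's the OverTimeLimit option in slurm.conf: the number of minutes a job may run past its time limit before it gets killed. Ours is something like this (the value here is just illustrative, not our actual setting):

    # allow jobs to run up to 60 minutes past their time limit
    OverTimeLimit=60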
On one of our cloud clusters we had problems with large jobs getting starved, so we set "assoc_limit_stop" in the scheduler parameters. For your config I think that would mean removing "assoc_limit_continue" (we're on Slurm 18, where _continue is the default; you replace it with _stop if you want that behavior). However, there we use the builtin scheduler; I'd imagine assoc_limit_stop would play heck with a fairshare/backfill cluster like our on-campus one, though it is designed to prevent large-job starvation.

We'd also had some issues with fairshare hitting its limit pretty quickly (basically it stopped being a useful factor in calculating priority), so we set FairShareDampeningFactor to 5 to get a little more utility out of that.

I'd suggest looking at the output of sprio to see how your factors are working in situ, particularly when you've got a stuck large job. It may be that SMALL_RELATIVE_TO_TIME is washing out the job size factor if your larger jobs are also longer.
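For reference, the relevant pieces on that cloud cluster look roughly like this (illustrative lines, not a paste of our actual config):

    SchedulerParameters=assoc_limit_stop
    FairShareDampeningFactor=5

And to inspect a stuck job, something like

    # long format shows the weighted value of each priority factor
    sprio -l -j <jobid>

(with <jobid> being the stuck job's id) usually makes it obvious which factor is, or isn't, doing the work.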
HTH.

M

On Wed, Apr 10, 2019 at 2:46 AM David Baker <d.j.ba...@soton.ac.uk> wrote:

> Michael,
>
> Thank you for your reply and your thoughts. These are the priority weights
> that I have configured in the slurm.conf.
>
> PriorityWeightFairshare=1000000
> PriorityWeightAge=100000
> PriorityWeightPartition=1000
> PriorityWeightJobSize=10000000
> PriorityWeightQOS=10000
>
> I've made PWJobSize the highest factor, however I understand that it only
> provides a one-off kick to jobs and so is probably insignificant in the
> longer run. That's followed by PWFairshare.
>
> Should I really be looking at increasing the PWAge factor to help "push
> jobs" through the system?
>
> The other issue that might play a part is that we see a lot of single-node
> jobs (presumably backfilled) going into the system. Users aren't
> excessively bombing the cluster, but maybe some backfill throttling would
> be useful as well(?)
>
> What are your thoughts having seen the priority factors, please? I've
> attached a copy of the slurm.conf just in case you or anyone else wants to
> take a more complete overview.
>
> Best regards,
> David
>
> ------------------------------
> *From:* slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of
> Michael Gutteridge <michael.gutteri...@gmail.com>
> *Sent:* 09 April 2019 18:59
> *To:* Slurm User Community List
> *Subject:* Re: [slurm-users] Effect of PriorityMaxAge on job throughput
>
> It might be useful to include the various priority factors you've got
> configured. The fact that adjusting PriorityMaxAge had a dramatic effect
> suggests that the age factor is pretty high; it might be worth looking at
> that value relative to the other factors.
>
> Have you looked at PriorityWeightJobSize? It might have some utility if
> you're finding large jobs getting short shrift.
>
> - Michael
>
> On Tue, Apr 9, 2019 at 2:01 AM David Baker <d.j.ba...@soton.ac.uk> wrote:
>
> Hello,
>
> I've finally got the job throughput/turnaround to be reasonable on our
> cluster. Most of the time the job activity on the cluster sets the default
> QOS to 32 nodes (there are 464 nodes in the default queue). Jobs requesting
> node counts close to the QOS limit (for example 22 nodes) are scheduled
> within 24 hours, which is better than it has been. Still, I suspect there
> is room for improvement. I note that these large jobs still struggle to be
> given a start time; however, many jobs are now being given a start time
> following my SchedulerParameters makeover.
>
> I used advice from the mailing list and the Slurm high-throughput document
> to help me make changes to the scheduling parameters. They are now...
>
> SchedulerParameters=assoc_limit_continue,batch_sched_delay=20,bf_continue,bf_interval=300,bf_min_age_reserve=10800,bf_window=3600,bf_resolution=600,bf_yield_interval=1000000,partition_job_depth=500,sched_max_job_start=200,sched_min_interval=2000000
>
> Also..
> PriorityFavorSmall=NO
> PriorityFlags=SMALL_RELATIVE_TO_TIME,ACCRUE_ALWAYS,FAIR_TREE
> PriorityType=priority/multifactor
> PriorityDecayHalfLife=7-0
> PriorityMaxAge=1-0
>
> The most significant change was actually reducing "PriorityMaxAge" from
> 7-0 to 1-0. Before that change the larger jobs could hang around in the
> queue for days. Does it make sense therefore to further reduce
> PriorityMaxAge to less than 1 day? Your advice would be appreciated,
> please.
>
> Best regards,
> David
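P.S. On the PriorityMaxAge question: as I understand it, the age factor ramps up linearly and caps once a job has been eligible that long, roughly

    # illustrative pseudo-math, not the actual Slurm source
    age_factor = min(time_eligible / PriorityMaxAge, 1.0)
    age_contribution = PriorityWeightAge * age_factor

so with PriorityMaxAge=1-0 a job collects the full PriorityWeightAge (100000 in your config) after a day in the queue. Reducing it further only makes jobs hit that cap sooner; it doesn't raise the ceiling.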