Thanks for the suggestions, everyone. I will certainly be looking at SchedulerParameters to see if I can optimize that a little further with contemporary guidance on best practices.
It looks like what got us going well enough, for now, was just to bump the Priority associated with each QoS class this application was using to submit jobs by a factor of 10. Was using 1,000 previously, now using 10,000. After doing this we're seeing many more jobs get run concurrently and a much higher utilization rate.

Best,
Sean

On Wed, Oct 25, 2017 at 11:17 AM, Patrick Goetz <[email protected]> wrote:

> On 10/25/2017 06:58 AM, Ole Holm Nielsen wrote:
>
>> I agree that the backfill scheduler requires configuration beyond the
>> default settings! This surprised me as well. I wrote some notes in my
>> Wiki which could be used as a starting point:
>> https://wiki.fysik.dtu.dk/niflheim/Slurm_scheduler#backfill-scheduler
>
> It would be nice if all this user generated documentation could make its
> way into a centralized wiki.
