Hi David,

David Baker <d.j.ba...@soton.ac.uk> writes:

> Hello,
>
> Our SLURM cluster is relatively small. We have 350 standard compute
> nodes each with 40 cores. The largest job that users can run on the
> partition is one requesting 32 nodes. Our cluster is a general
> university research resource and so there are many different sizes of
> jobs ranging from single core jobs, that get routed to a serial
> partition via the job-submit.lua, through to jobs requesting 32
> nodes. When we first started the service, 32 node jobs were typically
> taking in the region of 2 days to schedule -- recently queuing times
> have started to get out of hand. Our setup is essentially...
>
> PriorityFavorSmall=NO
> FairShareDampeningFactor=5
> PriorityFlags=ACCRUE_ALWAYS,FAIR_TREE
> PriorityType=priority/multifactor
> PriorityDecayHalfLife=7-0
>
> PriorityWeightAge=400000
> PriorityWeightPartition=1000
> PriorityWeightJobSize=500000
> PriorityWeightQOS=1000000
> PriorityMaxAge=7-0
>
> To try to reduce the queuing times for our bigger jobs should we
> potentially increase the PriorityWeightJobSize factor in the first
> instance to bump up the priority of such jobs? Or should we
> potentially define a set of QOSs which we assign to jobs in our
> job_submit.lua depending on the size of the job. In other words, let's
> say there is large QOS that give the largest jobs a higher priority,
> and also limits how many of those jobs that a single user can submit?
>
> Your advice would be appreciated, please. At the moment these large
> jobs are not accruing a sufficiently high priority to rise above the
> other jobs in the cluster.
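
For a rough feel of what the quoted weights do, here is a minimal Python
sketch of the weighted sum used by priority/multifactor. It is only an
approximation under stated assumptions: the job-size factor is taken as
the fraction of the cluster's nodes a job requests (Slurm also folds in
the CPU count), the age factor is taken to grow linearly up to
PriorityMaxAge, and the fair-share term is left out because its weight
is not quoted above. The function and variable names are made up for
illustration.

```python
# Rough sketch of the multifactor priority sum, using the weights quoted above.
# ASSUMPTIONS (not from the original post):
#   - job-size factor ~= requested nodes / total nodes (with PriorityFavorSmall=NO)
#   - age factor grows linearly and saturates at PriorityMaxAge
#   - fair-share, association, TRES and nice terms are omitted
WEIGHTS = {
    "age": 400_000,        # PriorityWeightAge
    "partition": 1_000,    # PriorityWeightPartition
    "job_size": 500_000,   # PriorityWeightJobSize
    "qos": 1_000_000,      # PriorityWeightQOS
}
CLUSTER_NODES = 350        # standard compute nodes
MAX_AGE_DAYS = 7.0         # PriorityMaxAge=7-0


def job_size_factor(nodes, cluster_nodes=CLUSTER_NODES):
    """Approximate job-size factor in [0, 1]: share of the cluster requested."""
    return min(nodes / cluster_nodes, 1.0)


def age_factor(days_waiting, max_age_days=MAX_AGE_DAYS):
    """Approximate age factor in [0, 1]: saturates once max_age_days is reached."""
    return min(days_waiting / max_age_days, 1.0)


def priority_terms(nodes, days_waiting, qos_factor=0.0, partition_factor=0.0):
    """Return each weighted term separately so their sizes can be compared."""
    return {
        "age": WEIGHTS["age"] * age_factor(days_waiting),
        "job_size": WEIGHTS["job_size"] * job_size_factor(nodes),
        "qos": WEIGHTS["qos"] * qos_factor,
        "partition": WEIGHTS["partition"] * partition_factor,
    }


if __name__ == "__main__":
    for nodes in (1, 32):
        for days in (0, 2, 7):
            terms = priority_terms(nodes, days)
            total = sum(terms.values())
            print(f"{nodes:>2} nodes, {days} days queued: "
                  f"job_size={terms['job_size']:.0f} age={terms['age']:.0f} "
                  f"total={total:.0f}")
```

Under these assumptions a 32-node job asks for 32/350 of the cluster,
so its job-size term is well under a tenth of PriorityWeightJobSize,
while the age term alone can reach the full PriorityWeightAge after
seven days in the queue. The real factors are best checked with
sprio on the cluster itself.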

We have always gone for the weighting approach, rather than the QOS
routing one. I have always thought that QOS routing potentially takes
away some of the users' freedom unnecessarily. What if someone wants
to submit a large number of 32-node jobs and is perfectly happy to wait
a (long) while?

We do have QOSs with higher priorities, but they come with restricted
MaxWall, MaxJobs, MaxSubmit and MaxTRESPU, and users have to request
them explicitly.

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de