We've been using a backfill priority partition for people doing HTC
work. We have requeue preemption set so that jobs from the high-priority
partitions can take over.
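A minimal slurm.conf sketch of that kind of setup (the partition names, node list, and priority values are illustrative assumptions, not the poster's actual config):

```ini
# slurm.conf fragment -- hypothetical names and values
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE

# Low-priority backfill partition for HTC work; jobs here are requeued
# when a job in the higher-tier partition needs the nodes.
PartitionName=backfill Nodes=node[001-100] PriorityTier=1  PreemptMode=REQUEUE Default=NO
# Normal partition on the same hardware takes precedence.
PartitionName=normal   Nodes=node[001-100] PriorityTier=10 Default=YES
```

For the requeue to work, the HTC jobs themselves must be requeueable (submitted with `--requeue`, or with `JobRequeue=1` set cluster-wide).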
You can do this for your interactive nodes as well if you want. We
dedicate hardware to interactive work and use partition-based QOSs to …
That’s the first limit I placed on our cluster, and it has generally worked out
well (never used a job limit). A single account can get 1000 CPU-days in
whatever distribution they want. I’ve just added a root-only ‘expedited’ QOS
for times when the cluster is mostly idle, but a few users have jobs …
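One common way to express a "1000 CPU-days in whatever distribution they want" limit is `GrpTRESRunMins` on the account's association; this is a hedged sketch, not necessarily the poster's exact method, and the account and QOS names are placeholders:

```shell
# 1000 CPU-days = 1000 * 24 * 60 = 1,440,000 CPU-minutes.
# Cap the CPU-minutes an account can have in flight across running jobs:
sacctmgr modify account myaccount set GrpTRESRunMins=cpu=1440000

# A high-priority 'expedited' QOS, granted only to root's association
# so ordinary users cannot select it:
sacctmgr add qos expedited
sacctmgr modify qos expedited set Priority=100000
sacctmgr modify user root set qos+=expedited
```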
On 05/08/2018 09:49 AM, John Hearns wrote:
Actually what IS bad is users not putting cluster resources to good use.
You can often see jobs which are 'stalled', i.e. the nodes are reserved
for the job, but the internal logic of the job has failed and the
executables have not launched. Or maybe s…
"Otherwise a user can have a single job that takes the entire cluster,
and inside split it up the way he wants to."
Yair, I agree. That is what I was referring to regarding interactive jobs.
Perhaps not a user reserving the entire cluster,
but a user reserving a lot of compute nodes and not making s…
> Eventually the job aging makes the jobs so high-priority,
Guess I should look in the manual, but could you increase the job ageing
time parameters?
I guess it is also worth saying that this is the scheduler doing its job -
it is supposed to keep jobs ready and waiting to go, to keep the cluster
busy.
Hi,
This is what we did, not sure those are the best solutions :)
## Queue stuffing
We have set PriorityWeightAge several magnitudes lower than
PriorityWeightFairshare, and we also have PriorityMaxAge set to cap the
age factor of older jobs. As I see it, the fairshare is far more important
than age. Besides t…
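The settings described above might look roughly like this slurm.conf fragment; the specific weights are illustrative assumptions, not the poster's values:

```ini
# slurm.conf sketch -- values are assumptions for illustration
PriorityType=priority/multifactor
PriorityWeightFairshare=100000   # dominant factor
PriorityWeightAge=100            # several magnitudes lower than fairshare
PriorityMaxAge=7-0               # age factor saturates after 7 days
```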
On 05/08/2018 08:44 AM, Bjørn-Helge Mevik wrote:
Jonathon A Anderson writes:
> ## Queue stuffing
There is the bf_max_job_user SchedulerParameter, which is sort of the
"poor man's MAXIJOB"; it limits the number of jobs from each user the
backfiller will try to start on each run. It doesn't do exactly what
you want, but at least the backfiller …
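Setting `bf_max_job_user` is a one-line change in slurm.conf; the value here is an illustrative assumption:

```ini
# slurm.conf sketch -- cap how many jobs per user the backfill
# scheduler will consider starting on each run (value is an assumption)
SchedulerParameters=bf_max_job_user=10,bf_continue
```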
One of these TRES-related ones in a QOS ought to do it:
https://slurm.schedmd.com/resource_limits.html
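As a hedged sketch of the kind of TRES limit meant here (the QOS name and cap are placeholders, not a recommendation):

```shell
# Cap the CPUs any single user can hold at once under this QOS:
sacctmgr modify qos normal set MaxTRESPerUser=cpu=512
# Optionally also cap concurrent running jobs per user:
sacctmgr modify qos normal set MaxJobsPerUser=50
```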
Your problem there, though, is you will eventually have stuff waiting to run
even when the system is idle. We had the same circumstance and the same
eventual outcome.
We have two main issues with our scheduling policy right now. The first is an
issue that we call "queue stuffing." The second is an issue with interactive
job availability. We aren't confused about why these issues exist, but we
aren't sure the best way to address them.
I'd love to hear any suggestions.