[slurm-users] Limit nodes of a partition without managing users

2020-08-17 Thread Gerhard Strangar
Hello, I'm wondering if it's possible to have Slurm 19 run two partitions (low and high prio) that share all the nodes, and to limit the number of nodes the high prio partition uses simultaneously, without having to manage the users in the database. Any ideas? Regards, Gerhard

Re: [slurm-users] Limit nodes of a partition without managing users

2020-08-17 Thread Brian Andrus
Most likely, but the specific approach depends on how you define what you want. For example, what if there are no jobs in high pri queue but many in low? Should all the low ones run? What should happen if they get started and use all the nodes and a high-pri request comes in (preemption policy

[slurm-users] How to throttle sinfo/squeue/scontrol show so they don't throttle slurmctld

2020-08-17 Thread Ransom, Geoffrey M.
Hello, we are having performance issues with slurmctld (delayed sinfo/squeue results, socket timeouts for multiple sbatch calls, jobs/nodes sitting in COMP state for an extended period of time). We just fully switched to Slurm from Univa and I think our problem is users putting a lot of "sco

Re: [slurm-users] How to throttle sinfo/squeue/scontrol show so they don't throttle slurmctld

2020-08-17 Thread Paul Edmon
We've seen this in our shop. Our solutions have been: 1. Use defer or max_rpc_cnt to slow down the scheduler so it can catch up with RPCs. 2. Target specific chatty users and tell them to knock it off. sdiag is your friend for this. We also repeatedly tell users not to ping the scheduler
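
The two scheduler options named in this reply are set through SchedulerParameters in slurm.conf. A minimal sketch, assuming a threshold of 150 active RPC threads (an illustrative value, not one taken from this thread):

```
# slurm.conf (fragment)
# defer:       do not try to schedule each job individually at submit time
# max_rpc_cnt: hold off scheduling while slurmctld has this many or more
#              active threads (150 is a hypothetical value; tune per site)
SchedulerParameters=defer,max_rpc_cnt=150
```

Running sdiag then shows RPC counts broken down by user, which is one way to spot the chatty users mentioned above.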

Re: [slurm-users] [External] Limit nodes of a partition without managing users

2020-08-17 Thread Prentice Bisbal
Yes, you can do this using Slurm's QOS facility to limit the number of nodes used simultaneously. For the high-priority partition, you can use the GrpTRES setting to limit how many nodes or CPUs a QOS can use. -- Prentice On 8/17/20 1:13 PM, Gerhard Strangar wrote: Hello, I'm wondering if it'
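
The QOS approach described in this reply could look roughly like the fragment below. This is a sketch under assumptions: the QOS name highprio, the partition names, the node range node[01-16], and the cap of 4 nodes are all hypothetical, and it presumes that slurmdbd accounting is already running.

```
# One-time QOS setup, run as a Slurm administrator (names are hypothetical):
#   sacctmgr add qos highprio
#   sacctmgr modify qos highprio set GrpTRES=node=4

# slurm.conf (fragment): both partitions share the same nodes
AccountingStorageEnforce=limits
PartitionName=low  Nodes=node[01-16] PriorityTier=1 Default=YES
PartitionName=high Nodes=node[01-16] PriorityTier=2 QOS=highprio
```

One caveat, as far as the slurm.conf documentation indicates: setting AccountingStorageEnforce=limits also enables association enforcement, so this sketch may not by itself meet the "without managing users" requirement from the original question.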

Re: [slurm-users] How to throttle sinfo/squeue/scontrol show so they don't throttle slurmctld

2020-08-17 Thread Steven Senator (slurm-dev-list)
The slurm scheduler only locks out user requests when specific data structures are locked due to modification, or potential modification. So, the most effective technique is to limit the time window when that will be happening by a combination of efficient traversal of the main scheduling loop (whe

Re: [slurm-users] [External] Limit nodes of a partition without managing users

2020-08-17 Thread Gerhard Strangar
Prentice Bisbal wrote: >> I'm wondering if it's possible to have slurm 19 run two partitions (low >> and high prio) that share all the nodes and limit the high prio >> partition in number of nodes used simultaneously without requiring to >> manage the users in the database. > Yes, you can do this u

Re: [slurm-users] Limit nodes of a partition without managing users

2020-08-17 Thread Gerhard Strangar
Brian Andrus wrote: > Most likely, but the specific approach depends on how you define what > you want. My idea was "high prio job is next unless there are too many of them". > For example, what if there are no jobs in high pri queue but many in > low? Should all the low ones run? Yes. > What s