Re: [slurm-users] Longer queuing times for larger jobs

2020-02-12 Thread Chris Samuel
On 5/2/20 1:44 pm, Antony Cleave wrote: Hi, from what you are describing it sounds like jobs are backfilling in front and stopping the large jobs from starting. We use a feature that SchedMD implemented for us called "bf_min_prio_reserve" which lets you set a priority threshold below which Slurm…
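
For context, a minimal slurm.conf sketch of the setting mentioned above; the threshold value is purely illustrative and has to match your site's priority scale:

    # slurm.conf (illustrative values)
    SchedulerType=sched/backfill
    # Jobs whose priority falls below this threshold get no forward
    # reservation from the backfill scheduler, so they can only start
    # if they do not delay higher-priority (e.g. large) jobs.
    SchedulerParameters=bf_min_prio_reserve=100000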

Re: [slurm-users] Longer queuing times for larger jobs

2020-02-12 Thread Loris Bennett
Loris Bennett writes: > Hello David, > > David Baker writes: > >> Hello, >> >> I've taken a very good look at our cluster, however as yet I have not made >> any significant changes. The one change that I did make was to >> increase the "jobsizeweight". That's now our dominant parameter and it >> does e…
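
The "jobsizeweight" referred to here corresponds to PriorityWeightJobSize when the multifactor priority plugin is in use; a hedged slurm.conf sketch with purely illustrative weights, not the poster's actual values:

    # slurm.conf (illustrative weights)
    PriorityType=priority/multifactor
    # Make job size the dominant factor so larger jobs accrue priority faster
    PriorityWeightJobSize=100000
    PriorityWeightFairshare=10000
    PriorityWeightAge=1000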

[slurm-users] Slurm version 20.02.0rc1 is now available

2020-02-12 Thread Tim Wickberg
We are pleased to announce the availability of Slurm release candidate 20.02.0rc1. This is the first release candidate for the upcoming 20.02 release series, and marks the finalization of the RPC and state file formats. This rc1 also includes the first version of the Slurm REST API, as implemented…

[slurm-users] Advice on using GrpTRESRunMin=cpu=

2020-02-12 Thread David Baker
Hello, Before implementing "GrpTRESRunMin=cpu=limit" on our production cluster I'm doing some tests on the development cluster. I've only got a handful of compute nodes to play with, so I have set the limit sensibly low. That is, I've set the limit to be 576,000, which is equivalent to 400 cores for one day (400 × 1,440 CPU-minutes)…
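
As a point of reference, a hedged sketch of how such a limit is typically applied with sacctmgr; the account name is hypothetical, and 576,000 CPU-minutes works out to 400 cores running for one day (400 × 1,440 minutes):

    # Cap the CPU-minutes of running jobs for an account (name is hypothetical)
    sacctmgr modify account test_acct set GrpTRESRunMin=cpu=576000
    # Check what the association now carries
    sacctmgr show assoc where account=test_acct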

Re: [slurm-users] Using "Nodes" on script - file ????

2020-02-12 Thread Renfro, Michael
Hey, Matthias. I’m having to translate a bit, so if I get a meaning wrong, please correct me. You should be able to set the minimum and maximum number of nodes used for jobs on a per-partition basis, or to set a default for all partitions. My most commonly used partition has: PartitionName=b…
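
A hedged slurm.conf sketch of such per-partition size limits; the partition names, node ranges and bounds are illustrative, not the actual configuration quoted above:

    # slurm.conf (illustrative)
    PartitionName=batch Nodes=node[001-040] MinNodes=1 MaxNodes=4  Default=YES
    PartitionName=large Nodes=node[001-040] MinNodes=5 MaxNodes=40 Default=NO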

[slurm-users] Using "Nodes" on script - file ????

2020-02-12 Thread Matthias Krawutschke
Hello everyone, I have a specific question regarding the options #SBATCH --nodes=2 and srun -N…. Some users of the HPC set these values very high and allocate compute nodes that they do not actually need. My question is now the following: is it really neces…
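
A minimal sbatch sketch of requesting whole nodes explicitly; the program name and numbers are illustrative:

    #!/bin/bash
    #SBATCH --job-name=nodes-example
    # Request two whole node allocations, with four tasks on each node
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=4
    # Launch 8 tasks across the two nodes (program name is hypothetical)
    srun ./my_program

Often it is enough to request only tasks (e.g. #SBATCH --ntasks=8) and let Slurm decide how many nodes are needed, which avoids allocating nodes the job cannot make use of.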

[slurm-users] Increasing the OpenFile under SLURM ....

2020-02-12 Thread Matthias Krawutschke
Hello everyone, I have a specific question about raising the limit on the maximum number of open files under Linux. This concerns the soft limit, not the hard limit (see: ulimit -Sn). Some users of the HPC use a large number of files for their processing in…
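
A hedged sketch of checking and raising the soft limit inside a job script, assuming the hard limit on the node is already high enough; the program name and value are illustrative:

    #!/bin/bash
    #SBATCH --job-name=many-files
    # Show the soft and hard open-file limits the job inherited
    ulimit -Sn
    ulimit -Hn
    # Raise the soft limit (only possible up to the hard limit)
    ulimit -Sn 65536
    ./my_program

Note that Slurm normally propagates resource limits from the submission host to the job (see PropagateResourceLimits in slurm.conf), and in practice the ceiling on the compute node also depends on the slurmd daemon's own limits, e.g. LimitNOFILE in its systemd unit.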

Re: [slurm-users] Node appears to have a different slurm.conf than the slurmctld; update_node: node reason set to: Kill task failed

2020-02-12 Thread Taras Shapovalov
Hey Robert, Ask Bright support, they will help you to figure out what is going on there. Best regards, Taras On Tue, Feb 11, 2020 at 8:26 PM Robert Kudyba wrote: > This is still happening. Nodes are being drained after a kill task failed. > Could this be related to https://bugs.schedmd.com/sho
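
For anyone following up on the same symptom, a hedged sketch of the usual checks; the node name is hypothetical:

    # Confirm which slurm.conf the daemons are reading
    scontrol show config | grep -i SLURM_CONF
    # See why the node was drained
    scontrol show node node001 | grep -i Reason
    # Return it to service once the underlying cause is fixed
    scontrol update NodeName=node001 State=RESUME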