Re: [slurm-users] derived counters

2021-04-13 Thread Heckes, Frank
Hi all, many thanks for all the hints. The link in the latest reply points to an impressive switch-board. Cheers, -Frank From: slurm-users On Behalf Of Renfro, Michael Sent: Tuesday, 13 April 2021 19:25 To: Slurm User Community List Subject: Re: [slurm-users] derived counters I'll neve

Re: [slurm-users] [EXT] [Beginner, SLURM 20.11.2] Unable to allocate resources when specifying gres in srun or sbatch

2021-04-13 Thread Cristóbal Navarro
Hi Sean, Sorry for the delay. The problem got solved accidentally by restarting the Slurm services on the head node. Maybe it was an unfortunate combination of changes, which I had assumed "scontrol reconfigure" would apply properly. Anyway, I will follow your advice and try ch

Re: [slurm-users] derived counters

2021-04-13 Thread Renfro, Michael
I'll never miss an opportunity to plug XDMoD for anyone who doesn't want to write custom analytics for every metric. I've managed to get a little bit into its API to extract current values for number of jobs completed and the number of CPU-hours provided, and insert those into a single slide pre

Re: [slurm-users] derived counters

2021-04-13 Thread Juergen Salk
* Heckes, Frank [210413 12:04]: > This results from a mgmt. question: how long jobs have to wait (in s, min, h, days) before they get executed, and how many jobs are waiting (are queued) for each partition in a certain time interval. > The first one is easy to find with sacct and sub
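
The wait-time part of the question above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by anyone in the thread: it assumes the accounting data was exported beforehand with something like `sacct -a -X -n -P -S 2021-04-01 -E 2021-04-13 -o JobID,Partition,Submit,Start`, so that each line has four pipe-separated fields and timestamps in sacct's default ISO format. The function name `mean_wait_by_partition` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Assumed timestamp layout of sacct's default output, e.g. 2021-04-13T08:00:00.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_time(value):
    """Return a datetime, or None for placeholders such as 'Unknown'."""
    try:
        return datetime.strptime(value, TIME_FORMAT)
    except ValueError:
        return None


def mean_wait_by_partition(sacct_text):
    """Average queue wait in seconds per partition.

    Expects lines of the form JobID|Partition|Submit|Start.
    Jobs without a usable start time (still pending) are skipped.
    """
    waits = defaultdict(list)
    for line in sacct_text.splitlines():
        fields = line.strip().split("|")
        if len(fields) != 4:
            continue  # ignore blank or malformed lines
        _job_id, partition, submit, start = fields
        t_submit, t_start = parse_time(submit), parse_time(start)
        if t_submit is None or t_start is None:
            continue
        waits[partition].append((t_start - t_submit).total_seconds())
    return {partition: mean(values) for partition, values in waits.items()}


# Hypothetical sample records, made up for illustration only.
SAMPLE = """\
101|batch|2021-04-13T08:00:00|2021-04-13T08:10:00
102|batch|2021-04-13T08:05:00|2021-04-13T08:35:00
103|gpu|2021-04-13T09:00:00|2021-04-13T09:01:00
104|gpu|2021-04-13T09:30:00|Unknown
"""

if __name__ == "__main__":
    for partition, seconds in sorted(mean_wait_by_partition(SAMPLE).items()):
        print(f"{partition}: {seconds / 60:.1f} min average wait")
```

Averaging start minus submit only covers jobs that have already started. Counting how many jobs were queued per partition during a given interval needs a different pass over the same Submit and Start columns.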

Re: [slurm-users] derived counters

2021-04-13 Thread Hadrian Djohari
Hi Frank, One way to get "how long jobs wait in the queue" is to import the data into XDMoD (https://open.xdmod.org/9.0/index.html). This nifty reporting tool has many features that make it easier for us to report on cluster usage. Hadrian On Tue, Apr 13, 2021 at 8:08 AM Heckes, Frank wrote: > H

Re: [slurm-users] derived counters

2021-04-13 Thread Heckes, Frank
Hello Ole, > >> -Original Message- > >>>* (average) queue length for a certain partition > > I wonder what exactly does your question mean? Maybe the number of jobs or > CPUs in the Pending state? Maybe relative to the number of CPUs in the > partition? > This results from a mgmt. -