[slurm-users] Re: Print Slurm Stats on Login

2024-08-27 Thread Simon Andrews via slurm-users
theus going already a little less so): https://github.com/rivosinc/prometheus-slurm-exporter On Tue, Aug 20, 2024 at 12:40 AM Simon Andrews via slurm-users <slurm-users@lists.schedmd.com> wrote: Possibly a bit more elaborate than you want, but I wrote a web-based monitoring system for

[slurm-users] Re: Print Slurm Stats on Login

2024-08-20 Thread Simon Andrews via slurm-users
Possibly a bit more elaborate than you want, but I wrote a web-based monitoring system for our cluster. It mostly uses standard Slurm commands for job monitoring, but I've also added storage monitoring, which requires a separate cron job to run every night. It was written for our cluster, but pr

[slurm-users] Jobs being denied for GrpCpuLimit despite having enough resource

2024-03-14 Thread Simon Andrews via slurm-users
Our cluster has developed a strange intermittent behaviour where jobs are put into a pending state with the reason AssocGrpCpuLimit, even though the submitting user has enough CPUs available for the job to run. For example: $ squeue -o "%.6i %.9P %.8j %.8u %.2t %.10M %.7m %.7c %.20R