Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Brian Andrus
Second that. Prometheus+slurm exporter+grafana works great.

Brian Andrus

On 6/12/2023 8:20 AM, Josef Dvoracek wrote:
> But I'd be interested to see what other places do. we installed this: https://github.com/vpenso/prometheus-slurm-exporter and scrape this exporter with "inputs.prometheus" T

Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Josef Dvoracek
> But I'd be interested to see what other places do.

we installed this: https://github.com/vpenso/prometheus-slurm-exporter and scrape this exporter with "inputs.prometheus" Telegraf input and it's sent to influx (and shown by Grafana)

-- josef

On 12. 06. 23 1:43, Andrew Elwell wrote: ...

Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Reed Dier
Hey Andrew, I don’t have any specific examples I can share right this second (I’ll look into making it shareable), but my solution was to throw some basic bash scripts into cron to scrape and ship into influx. I have one script that looks at sinfo, parsing out A/I/O/T state for nodes and CPUs, and
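
For readers who want a starting point, here is a minimal Python sketch of the sinfo-to-Influx idea described above. It is not Reed's script (his is bash and was not posted). It assumes the input comes from `sinfo -h -o "%P,%F,%C"`, where `%F` and `%C` are the node and CPU counts in allocated/idle/other/total form. The measurement names `slurm_nodes` and `slurm_cpus` and all function names are made up for illustration; writing the records to InfluxDB is left out.

```python
def parse_aiot(field: str) -> dict:
    """Split an sinfo 'A/I/O/T' field into named integer counts."""
    allocated, idle, other, total = (int(x) for x in field.strip().split("/"))
    return {"allocated": allocated, "idle": idle, "other": other, "total": total}


def to_line_protocol(measurement: str, partition: str, counts: dict) -> str:
    """Render counts as one InfluxDB line-protocol record (no timestamp)."""
    # Integer field values carry an 'i' suffix in line protocol.
    fields = ",".join(f"{key}={value}i" for key, value in counts.items())
    return f"{measurement},partition={partition} {fields}"


def sinfo_to_lines(text: str) -> list:
    """Convert output of `sinfo -h -o "%P,%F,%C"` into line-protocol records."""
    records = []
    for row in text.splitlines():
        row = row.strip()
        # With -M, sinfo prints a "CLUSTER: name" line per cluster; skip it.
        if not row or row.startswith("CLUSTER:"):
            continue
        partition, nodes, cpus = row.split(",")
        partition = partition.rstrip("*")  # '*' marks the default partition
        # Measurement names below are hypothetical, chosen for this sketch.
        records.append(to_line_protocol("slurm_nodes", partition, parse_aiot(nodes)))
        records.append(to_line_protocol("slurm_cpus", partition, parse_aiot(cpus)))
    return records
```

Run from cron, the resulting strings could be posted to an InfluxDB write endpoint or printed for Telegraf to pick up; which of those fits depends on the site.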

Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Ole Holm Nielsen
Hi Andrew,

On 6/12/23 01:43, Andrew Elwell wrote:
> Are your slurm to influx scripts publicly available anywhere? I do something similar for squeue via python subprocess to call
> squeue -M all -a -o "%P,%a,%u,%D,%q,%T,%r"
> And some sinfo calls for node/cpu usage:
> sinfo -M {} -o "%P,%a,%F"
> sinfo
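
The squeue call quoted above could be wrapped in Python roughly as follows. This is a sketch under assumptions, not Andrew's actual code: the `-h` (no header) flag is added here to simplify parsing, the function names are hypothetical, and the per-partition/state count is just one example of what might be shipped to a time-series database.

```python
import subprocess
from collections import Counter

# Format string taken from the message above:
# partition, account, user, node count, QOS, state, reason.
SQUEUE_FORMAT = "%P,%a,%u,%D,%q,%T,%r"
FIELDS = ("partition", "account", "user", "nodes", "qos", "state", "reason")


def run_squeue() -> str:
    """Call squeue across all clusters; -h (no header) is an addition here."""
    cmd = ["squeue", "-M", "all", "-a", "-h", "-o", SQUEUE_FORMAT]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_squeue(text: str) -> list:
    """Turn squeue output into a list of dicts, one per output row."""
    jobs = []
    for row in text.splitlines():
        row = row.strip()
        # With -M, squeue prints a "CLUSTER: name" line per cluster; skip it.
        if not row or row.startswith("CLUSTER:"):
            continue
        # Limit the split so a comma inside the reason field stays intact.
        parts = row.split(",", len(FIELDS) - 1)
        if len(parts) != len(FIELDS):
            continue  # ignore rows that do not match the expected layout
        jobs.append(dict(zip(FIELDS, parts)))
    return jobs


def count_by_partition_state(jobs: list) -> Counter:
    """Count rows per (partition, state) pair."""
    return Counter((job["partition"], job["state"]) for job in jobs)
```

On a host with Slurm client tools installed, `count_by_partition_state(parse_squeue(run_squeue()))` would give the counts; the parsing functions can be tried on saved squeue output without a cluster.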