Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Brian Andrus
Second that. Prometheus+slurm exporter+grafana works great. Brian Andrus On 6/12/2023 8:20 AM, Josef Dvoracek wrote: > But I'd be interested to see what other places do. we installed this: https://github.com/vpenso/prometheus-slurm-exporter and scrape this exporter with "inputs.prometheus" T

Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Josef Dvoracek
> But I'd be interested to see what other places do. we installed this: https://github.com/vpenso/prometheus-slurm-exporter and scrape this exporter with "inputs.prometheus" Telegraf input and it's sent to influx (and shown by Grafana) -- josef On 12. 06. 23 1:43, Andrew Elwell wrote: ...

Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Reed Dier
Hey Andrew, I don’t have any specific examples I can share right this second, I’ll look into making it shareable, but my solution was to throw some basic bash scripts into cron to scrape and ship into influx. I have one script that looks at sinfo, parsing out AIOT state for nodes and CPUs, and
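A cron-driven collector of the kind described above might look roughly like the following. This is a Python sketch, not the bash scripts mentioned in the message: the measurement name `slurm_nodes`, the tag and field names, and the helper names are all assumptions made for illustration. It relies on the `sinfo` format field `%F`, which reports node counts as allocated/idle/other/total, and turns each partition row into an InfluxDB line-protocol string.

```python
import subprocess


def parse_sinfo_aiot(text):
    """Parse the output of `sinfo -h -o "%P,%a,%F"` into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        partition, avail, aiot = line.split(",")
        # %F is "allocated/idle/other/total"; plain integers are assumed here
        allocated, idle, other, total = (int(n) for n in aiot.split("/"))
        rows.append({
            "partition": partition.rstrip("*"),  # sinfo marks the default partition with "*"
            "avail": avail,
            "allocated": allocated,
            "idle": idle,
            "other": other,
            "total": total,
        })
    return rows


def to_line_protocol(row, measurement="slurm_nodes"):
    """Format one row as InfluxDB line protocol (hypothetical measurement name)."""
    tags = f"partition={row['partition']},avail={row['avail']}"
    fields = ",".join(
        f"{key}={row[key]}i"  # the "i" suffix marks an integer field
        for key in ("allocated", "idle", "other", "total")
    )
    return f"{measurement},{tags} {fields}"


def collect():
    """Run sinfo (requires a Slurm client on the host) and return line-protocol strings."""
    result = subprocess.run(
        ["sinfo", "-h", "-o", "%P,%a,%F"],
        capture_output=True, text=True, check=True,
    )
    return [to_line_protocol(row) for row in parse_sinfo_aiot(result.stdout)]
```

The parsing is kept separate from the `sinfo` call so it can be checked against saved output without a cluster. How the resulting lines are shipped to InfluxDB (HTTP write, Telegraf, a client library) is left open, since the thread does not say.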

Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Ole Holm Nielsen
Hi Andrew, On 6/12/23 01:43, Andrew Elwell wrote: Are your slurm to influx scripts publicly available anywhere? I do something similar for squeue via python subprocess to call squeue -M all -a -o "%P,%a,%u,%D,%q,%T,%r" And some sinfo calls for node/cpu usage: sinfo -M {} -o "%P,%a,%F" sinfo
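The quoted `squeue` call can be sketched in Python as below. This is not the poster's script: the function names, the field labels, and the added `-h` (no header) flag are assumptions for illustration. The format string is the one quoted in the message (`%P` partition, `%a` account, `%u` user, `%D` node count, `%q` QOS, `%T` state, `%r` reason). The parser skips the `CLUSTER:` separator lines that multi-cluster output is assumed to contain, and splits at most six times because the reason field can itself contain commas.

```python
import subprocess
from collections import Counter

SQUEUE_FORMAT = "%P,%a,%u,%D,%q,%T,%r"
FIELDS = ("partition", "account", "user", "nodes", "qos", "state", "reason")


def parse_squeue(text):
    """Parse comma-separated squeue output into a list of job dicts."""
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and per-cluster separator lines
        if not line or line.startswith("CLUSTER:"):
            continue
        # Limit the split so commas inside the trailing reason field survive
        values = line.split(",", len(FIELDS) - 1)
        if len(values) != len(FIELDS):
            continue  # skip malformed rows rather than guess at their meaning
        jobs.append(dict(zip(FIELDS, values)))
    return jobs


def count_by_partition_and_state(jobs):
    """Count jobs per (partition, state) pair, e.g. for a dashboard panel."""
    return Counter((job["partition"], job["state"]) for job in jobs)


def run_squeue():
    """Call squeue across all clusters and partitions (requires a Slurm client)."""
    cmd = ["squeue", "-M", "all", "-a", "-h", "-o", SQUEUE_FORMAT]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

A typical use would be `count_by_partition_and_state(parse_squeue(run_squeue()))`, with the counts then written to whichever time-series store is in use.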

Re: [slurm-users] monitoring and accounting

2023-06-11 Thread Andrew Elwell
On Fri, 2 June 2023, 22:03 Jörg Striewski, wrote: > Hi, we use grafana with influx, it is easy to install and works fine > Hi Jörg, Are your slurm to influx scripts publicly available anywhere? I do something similar for squeue via python subprocess to call squeue -M all -a -o "%P,%a,%u,%D,%q,

Re: [slurm-users] monitoring and accounting

2023-06-02 Thread Jörg Striewski
t is not very responsive. > > > > Is there any more recent feedback on any accounting tool? > > > > Thanks in advance, > > Christine > > > > *From:* slurm-users *On behalf of* > Davide DelVento > *Sent:* Friday, 5 May 2023 15:19 > *To:* Slurm U

Re: [slurm-users] monitoring and accounting

2023-06-02 Thread LEROY Christine 208562
: Friday, 5 May 2023 15:19 To: Slurm User Community List Subject: Re: [slurm-users] monitoring and accounting At a place I worked before, we used XDMOD several years ago. It was a bit tricky to set up correctly and not exactly intuitive to get started with data collection as a user (managers

Re: [slurm-users] monitoring and accounting

2023-05-05 Thread Brian Andrus
Something I have been impressed with is Netdata. It is in the standard repositories and will auto-detect quite a few things on a node. It is great for real-time monitoring of a node/job. I also use Prometheus and Grafana for historical data (anything over 5 minutes). Brian Andrus On 5/5/20

Re: [slurm-users] monitoring and accounting

2023-05-05 Thread Davide DelVento
At a place I worked before, we used XDMOD several years ago. It was a bit tricky to set up correctly and not exactly intuitive to get started with data collection as a user (managers, allocation specialists and other not-super-technical people were most of our users). But when familiarized with it,

[slurm-users] monitoring and accounting

2023-05-05 Thread LEROY Christine 208562
Hello Everyone, We would like to improve the visibility of our cluster usage. We currently have Ganglia and use sacct, but I was wondering whether there is a recommended web tool that provides both monitoring and accounting (user- and admin-friendly)? Thanks in advance Christine