Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Ole Holm Nielsen

Hi Andrew,

On 6/12/23 01:43, Andrew Elwell wrote:
> Are your slurm to influx scripts publicly available anywhere? I do
> something similar for squeue via python subprocess to call
>
> squeue -M all -a -o "%P,%a,%u,%D,%q,%T,%r"
>
> And some sinfo calls for node/cpu usage:
>
> sinfo -M {} -o "%P,%a,%F"
> sinfo -M {} -o "%%R,%a,%C,%B,%z"
>
> But I'd be interested to see what other places do. Perhaps some examples
> could be gathered for Ole's wiki?


I'd be happy to add examples and links to documentation to the Wiki. I 
guess this section would be the best place:


https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_accounting/#other-accounting-report-tools

/Ole



Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Reed Dier
Hey Andrew,

I don’t have any specific examples I can share right this second, but I’ll 
look into making them shareable. My solution was to throw some basic bash 
scripts into cron to scrape the data and ship it into influx.

I have one script that parses sinfo output for the A/I/O/T 
(allocated/idle/other/total) state of nodes and CPUs, plus a very ugly, hacky 
sed/cut/awk pipeline to scrape GPU usage. It also runs squeue to count jobs 
per state. Both are broken down per partition and cluster.
I have another script that does basic sreport parsing for the TRES/GRES I care 
about, so that I can get a rough bird's-eye trend of utilization over time.
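
As a rough illustration of the sinfo/squeue part of that approach (not the actual scripts, which are bash), here is a minimal Python sketch. It assumes the input comes from something like `sinfo -h -o "%R|%C"` and `squeue -h -o "%P|%T"`, that the CPU counts are plain integers, and that tag values need no escaping. The measurement name `slurm_cpus`, the cluster name `c1`, and the function names are hypothetical.

```python
from collections import Counter


def parse_sinfo_cpus(text):
    """Parse 'partition|A/I/O/T' lines, e.g. from: sinfo -h -o "%R|%C"."""
    usage = {}
    for line in text.strip().splitlines():
        partition, aiot = line.split("|")
        alloc, idle, other, total = (int(n) for n in aiot.split("/"))
        usage[partition] = {
            "alloc": alloc, "idle": idle, "other": other, "total": total,
        }
    return usage


def count_job_states(text):
    """Count jobs per (partition, state), e.g. from: squeue -h -o "%P|%T"."""
    counts = Counter()
    for line in text.strip().splitlines():
        partition, state = line.split("|")
        counts[(partition, state)] += 1
    return counts


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one InfluxDB line-protocol point with integer fields.

    Assumes tag keys and values contain no spaces, commas or '=' signs,
    so no escaping is done here.
    """
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}i" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {timestamp_ns}"


# Made-up sample output standing in for real sinfo/squeue calls.
SINFO_SAMPLE = "batch|12/4/0/16\ngpu|8/0/8/16\n"
SQUEUE_SAMPLE = "batch|RUNNING\nbatch|PENDING\nbatch|RUNNING\ngpu|RUNNING\n"

if __name__ == "__main__":
    for partition, fields in parse_sinfo_cpus(SINFO_SAMPLE).items():
        tags = {"cluster": "c1", "partition": partition}
        print(to_line_protocol("slurm_cpus", tags, fields, 1686528000000000000))
    print(dict(count_job_states(SQUEUE_SAMPLE)))
```

In a cron job the sample strings would be replaced by the captured output of the real commands (for example via `subprocess.run(..., capture_output=True, text=True)`), and the resulting lines would be posted to the InfluxDB write endpoint. The `|` separator is used because squeue's `%P` can itself contain commas when a job is submitted to several partitions.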

There’s likely to be something far, far better for this, but it was a quick and 
dirty solution to get something visible with existing tooling (Grafana/influx).

Reed

> On Jun 11, 2023, at 6:43 PM, Andrew Elwell wrote:
> 
> On Fri, 2 June 2023, 22:03 Jörg Striewski wrote:
>> Hi, we use grafana with influx, it is easy to install and works fine
> 
> Hi Jörg,
> 
> Are your slurm to influx scripts publicly available anywhere? I do something 
> similar for squeue via python subprocess to call
> 
> squeue -M all -a -o "%P,%a,%u,%D,%q,%T,%r"
> 
> And some sinfo calls for node/cpu usage:
> 
> sinfo -M {} -o "%P,%a,%F"
> sinfo -M {} -o "%%R,%a,%C,%B,%z"
> 
> But I'd be interested to see what other places do. Perhaps some examples 
> could be gathered for Ole's wiki?
> 
> Andrew
> 





Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Josef Dvoracek

> But I'd be interested to see what other places do.

We installed this: https://github.com/vpenso/prometheus-slurm-exporter

We scrape this exporter with Telegraf's "inputs.prometheus" input plugin, 
and the data is sent to influx (and shown by Grafana).
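
To give a feel for what such a scrape reads, here is a minimal Python sketch that parses Prometheus text-format samples of the kind an exporter serves on its metrics endpoint. It is only an illustration, not what Telegraf does internally: it assumes one sample per line with no trailing timestamp, and the metric names in the sample are illustrative and should be checked against the exporter's real output.

```python
def parse_exposition(text):
    """Parse Prometheus text-format samples into {series: value}.

    Skips blank lines and '# HELP' / '# TYPE' comment lines. Assumes each
    sample line is 'name{labels} value' with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


# Made-up sample; metric names are assumptions for illustration only.
SAMPLE = """\
# HELP slurm_cpus_alloc Allocated CPUs
# TYPE slurm_cpus_alloc gauge
slurm_cpus_alloc 12
slurm_cpus_idle 4
slurm_cpus_total 16
"""

if __name__ == "__main__":
    for series, value in parse_exposition(SAMPLE).items():
        print(series, value)
```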


--

josef

On 12. 06. 23 1:43, Andrew Elwell wrote:
...





Re: [slurm-users] monitoring and accounting

2023-06-12 Thread Brian Andrus

Second that.

Prometheus+slurm exporter+grafana works great.

Brian Andrus
