Hey Andrew, I don’t have any specific examples I can share right this second (I’ll look into making them shareable), but my solution was to throw some basic bash scripts into cron to scrape Slurm and ship the results into influx.
I have one script that looks at sinfo, parsing out the A/I/O/T (allocated/idle/other/total) state for nodes and CPUs, plus a very ugly, hacky sed/cut/awk pipeline to scrape GPU usage. It also runs squeue to see jobs per state. Both are broken down per partition and cluster.

I have another script that does basic sreport parsing for the TRES/GRES I care about, so that I can get a rough bird's-eye trend of utilization over time.

There’s likely something far, far better for this, but it was a quick and dirty solution to get something visible with existing tooling (Grafana/influx).

Reed

> On Jun 11, 2023, at 6:43 PM, Andrew Elwell <andrew.elw...@gmail.com> wrote:
>
> On Fri, 2 June 2023, 22:03 Jörg Striewski, <striew...@ismll.de> wrote:
> Hi, we use grafana with influx, it is easy to install and works fine
>
> Hi Jörg,
>
> Are your slurm to influx scripts publicly available anywhere? I do something
> similar for squeue via python subprocess to call
>
> squeue -M all -a -o "%P,%a,%u,%D,%q,%T,%r"
>
> And some sinfo calls for node/cpu usage:
>
> sinfo -M {} -o "%P,%a,%F"
> sinfo -M {} -o "%%R,%a,%C,%B,%z"
>
> But I'd be interested to see what other places do. Perhaps some examples
> could be gathered for Ole's wiki?
>
> Andrew
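
For illustration only (this is not the script described above, which has not been shared), here is a minimal sketch of how the `sinfo -o "%P,%a,%F"` output quoted in the thread could be turned into InfluxDB line protocol from a shell script. The function name, the measurement name `slurm_nodes`, and the tag and field names are all hypothetical choices, not anything from the original scripts.

```bash
#!/usr/bin/env bash
# Sketch only: turn `sinfo -h -o "%P,%a,%F"` output into InfluxDB line protocol.
# "slurm_nodes" and the tag/field names below are made up for this example.

# Reads lines such as "batch*,up,10/5/1/16" on stdin; $1 is the cluster name.
# The body runs in a subshell ( ... ) so its variables do not leak to the caller.
sinfo_to_lineproto() (
    cluster="$1"
    while IFS=, read -r partition avail aiot; do
        # Skip anything without three comma-separated fields, e.g. the
        # "CLUSTER: name" banner that sinfo may print when -M is used.
        [ -n "$aiot" ] || continue
        # sinfo marks the default partition with a trailing '*'; drop it.
        partition="${partition%\*}"
        # Split the allocated/idle/other/total field on '/'.
        alloc="${aiot%%/*}"; rest="${aiot#*/}"
        idle="${rest%%/*}";  rest="${rest#*/}"
        other="${rest%%/*}"; total="${rest#*/}"
        # Integer fields carry an 'i' suffix in line protocol.
        printf 'slurm_nodes,cluster=%s,partition=%s,avail=%s alloc=%si,idle=%si,other=%si,total=%si\n' \
            "$cluster" "$partition" "$avail" "$alloc" "$idle" "$other" "$total"
    done
)

# Example wiring, left commented out because the cluster name, host and
# database are placeholders and the endpoint assumes an InfluxDB 1.x server:
#   sinfo -h -M mycluster -o "%P,%a,%F" | sinfo_to_lineproto mycluster |
#       curl -s -XPOST 'http://localhost:8086/write?db=slurm' --data-binary @-
```

A script like this could be run from cron at whatever interval suits the dashboard. No timestamp is written, so the server would assign one on arrival.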