I am experimenting with getting information from a Slurm cluster on how many resources each user has been consuming. I would like to get the accumulated amount of CPU and GPU time per user over specified periods. GPU usage reported by type of GPU would be extra helpful. I am currently looking at sacct, where I have tried options like:

    sacct -a --starttime=2023-03-21T00:00 --format="user,totalcpu,tresusageintot%100"

"tresusageintot" shows me:

    cpu=00:00:20,energy=0,fs/disk=0,mem=0,pages=3465,vmem=285140K

so GPU information does not seem to be included, and I have found no other option that reports it. The output also lists individual job steps, whereas I would really just like to aggregate per user and ignore individual jobs and steps.

I have also tried `sreport`, but I cannot get anything useful out of it at the user level. For example:

    sreport user TopUsage
    --------------------------------------------------------------------------------
    Top 10 Users 2023-03-21T00:00:00 - 2023-03-21T23:59:59 (86400 secs)
    Usage reported in CPU Minutes
    --------------------------------------------------------------------------------
      Cluster     Login     Proper Name         Account      Used   Energy
    --------- --------- --------------- --------------- --------- --------

It just gives me an empty table with no user information. I am guessing something is not configured correctly for that data to be stored. I have "AccountingStorageTRES=gres/gpu" in slurm.conf, and I am not sure what else I should add there.

I hope someone can advise on what I am missing and how I can best get the usage stats I am after.

Best regards,
Thomas

--
Special Consultant | CLAAUDIA
Phone: (+45) 9940 9844 | Email: t...@its.aau.dk | Web: https://www.claaudia.aau.dk/
Aalborg University | Fredrik Bajers Vej 1, A1.65, 9220 Aalborg Ø, Denmark
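
P.S. To make the aggregation I have in mind concrete, here is a rough Python sketch. It rests on assumptions I have not confirmed on our cluster: that per-job allocations can be listed with something like `sacct -a -X -n -P --starttime=... --format=User,CPUTimeRAW,ElapsedRaw,AllocTRES`, and that AllocTRES carries entries such as `gres/gpu=2` (and `gres/gpu:<type>=2` if typed GPUs are tracked). GPU time is approximated as allocated GPUs times elapsed seconds. The function name `aggregate_usage` and the sample lines are made up for illustration.

```python
from collections import defaultdict


def aggregate_usage(sacct_output):
    """Sum CPU and GPU seconds per user.

    Expects lines of the assumed form 'User|CPUTimeRAW|ElapsedRaw|AllocTRES',
    one line per job allocation (no header, no job steps).
    """
    usage = defaultdict(
        lambda: {"cpu_seconds": 0, "gpu_seconds": defaultdict(int)}
    )
    for line in sacct_output.splitlines():
        line = line.strip()
        if not line:
            continue
        user, cpu_raw, elapsed_raw, alloc_tres = line.split("|", 3)
        entry = usage[user]
        entry["cpu_seconds"] += int(cpu_raw or 0)
        elapsed = int(elapsed_raw or 0)
        for item in alloc_tres.split(","):
            name, _, count = item.partition("=")
            # Keep the untyped total ('gres/gpu') and typed entries
            # ('gres/gpu:<type>') under separate keys so they are never
            # added together.
            if name == "gres/gpu" or name.startswith("gres/gpu:"):
                entry["gpu_seconds"][name] += int(count) * elapsed
    return usage


# Made-up sample input, not real cluster data.
SAMPLE = """\
alice|7200|3600|billing=2,cpu=2,gres/gpu=1,gres/gpu:a100=1,mem=8G,node=1
alice|400|100|billing=4,cpu=4,mem=4G,node=1
bob|1200|600|billing=2,cpu=2,gres/gpu=2,mem=8G,node=1
"""

if __name__ == "__main__":
    for user, entry in sorted(aggregate_usage(SAMPLE).items()):
        print(user, entry["cpu_seconds"], dict(entry["gpu_seconds"]))
```

Something equivalent built into sacct or sreport would of course be preferable to post-processing like this.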