Hi Steen,

Thanks a lot, that certainly sorted out most of the discrepancies!

I'm still seeing some differences, though, between the sreport and sacct output 
for certain accounts, so I was wondering if there's anything else I'm missing in 
how sreport calculates it (for sacct I sum CPUTimeRAW and convert to hours).
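
Roughly, the sacct side of my calculation looks something like this (just a 
sketch; the account and date range are the example from my first message, and 
CPUTimeRAW is in seconds, so I divide by 3600):

sacct -n -X --allusers --accounts=project1234 --start=2025-04-01 --end=2025-04-05 \
  -o cputimeraw | awk '{sum+=$1} END {print sum/3600 " CPU-hours"}'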

All the best,
Passant
________________________________
From: Steen Lysgaard <s...@dtu.dk>
Sent: Thursday, May 22, 2025 9:15 AM
To: 'slurm-us...@schedmd.com' <slurm-us...@schedmd.com>; Passant Hafez 
<passant.ha...@glasgow.ac.uk>
Subject: Re: Slurm Reporting Difference between sreport and sacct

Hi Passant,

I've found that when using sacct to track resource usage over specific time 
periods, it helps to include the --truncate option. Without it, jobs that 
started before the specified start time have their entire runtime counted, 
including time outside the requested range. With --truncate, only the time 
within the defined period is included. Maybe this explains some of the 
discrepancy you're seeing.
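
For example, roughly something like this (your sacct command from below, with 
--truncate added):

sacct -n -X --truncate --allusers --accounts=project1234 --start=2025-04-01 \
  --end=2025-04-05 -o elapsedraw,AllocTRES%80,user,partition

With --truncate, ElapsedRaw only counts the part of each job that falls inside 
the start/end window.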

Best regards,
Steen


________________________________
From: Passant Hafez via slurm-users <slurm-users@lists.schedmd.com>
Sent: Wednesday, May 21, 2025 18:48
To: 'slurm-us...@schedmd.com' <slurm-us...@schedmd.com>
Subject: [slurm-users] Slurm Reporting Difference between sreport and sacct

Hi all,

I was wondering if someone could help explain this discrepancy.

I get different values for a project's GPU consumption using sreport vs. sacct 
(plus some calculations).

This is an example that shows this:

sreport -t hours -T gres/gpu cluster AccountUtilizationByuser start=2025-04-01 
end=2025-04-05 | grep project1234
gives 178
while
sacct -n -X --allusers --accounts=project1234 --start=2025-04-01 
--end=2025-04-05 -o elapsedraw,AllocTRES%80,user,partition

gives
    213480  billing=128,cpu=128,gres/gpu=8,mem=1000G,node=2    gpuplus
    249507  billing=128,cpu=128,gres/gpu=8,mem=1000G,node=2    gpuplus
     13908  billing=64,cpu=64,gres/gpu=4,mem=500G,node=1       gpuplus
      9552  billing=64,cpu=64,gres/gpu=4,mem=500G,node=1       gpuplus
         4  billing=16,cpu=16,gres/gpu=1,mem=200G,node=1       gpu
        11  billing=16,cpu=16,gres/gpu=1,mem=200G,node=1       gpu
...



I won't bore you with the full output and the calculation, but the first job 
alone consumed 213480 seconds / 3600 * 8 GPUs ≈ 474.4 GPU-hours, which on its 
own is far more than the 178 hours reported by sreport.


Any clue why these are inconsistent, or how sreport calculated the 178 value?



All the best,
Passant
-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
