Hello, I'm hoping someone can offer some suggestions.

I went ahead, started the database from scratch, and reinitialized it, both to see if that would help and to try to understand how RawUsage is calculated. I ran two jobs with

sbatch --account=luchko_group --wrap="sleep 60" -p cpu -n 100

with the priority flags and partition defined as

PriorityFlags=MAX_TRES
PartitionName=cpu Nodes=node[1-7] MaxCPUsPerNode=182 MaxTime=7-0:00:00 State=UP TRESBillingWeights="CPU=1.0,MEM=0.125G,GRES/gpu=9.6"

I expected each job to contribute 6000 to the RawUsage (100 CPUs x 60 seconds). However, one job contributed 3100 and the other 2800, and TRESRunMins stayed at 0 for all categories.

I'm at a loss as to what is going on.

Thank you,

Tyler

Sent with [Proton Mail](https://proton.me/mail/home) secure email.

On Tuesday, September 10th, 2024 at 9:03 PM, tluchko <tluc...@protonmail.com> wrote:

> Hello,
>
> We have a new cluster and I'm trying to set up fairshare accounting. I'm
> trying to track CPU, MEM and GPU. It seems that billing for individual jobs
> is correct, but billing isn't being accumulated (TRESRunMins is always 0).
>
> In my slurm.conf, I think the relevant lines are
>
> AccountingStorageType=accounting_storage/slurmdbd
> AccountingStorageTRES=gres/gpu
> PriorityFlags=MAX_TRES
>
> PartitionName=gpu Nodes=node[1-7] MaxCPUsPerNode=384 MaxTime=7-0:00:00 State=UP TRESBillingWeights="CPU=1.0,MEM=0.125G,GRES/gpu=9.6"
> PartitionName=cpu Nodes=node[1-7] MaxCPUsPerNode=182 MaxTime=7-0:00:00 State=UP TRESBillingWeights="CPU=1.0,MEM=0.125G,GRES/gpu=9.6"
>
> I currently have one recently finished job and one running job.
> sacct gives
>
> $ sacct --format=JobID,JobName,ReqTRES%50,AllocTRES%50,TRESUsageInAve%50,TRESUsageInMax%50
> JobID        JobName    ReqTRES                                  AllocTRES                                TRESUsageInAve                                     TRESUsageInMax
> ------------ ---------- ---------------------------------------- ---------------------------------------- -------------------------------------------------- --------------------------------------------------
> 154          interacti+ billing=9,cpu=1,gres/gpu=1,mem=1G,node=1 billing=9,cpu=2,gres/gpu=1,mem=2G,node=1
> 154.interac+ interacti+                                          cpu=2,gres/gpu=1,mem=2G,node=1           cpu=00:00:00,energy=0,fs/disk=2480503,mem=3M,page+ cpu=00:00:00,energy=0,fs/disk=2480503,mem=3M,page+
> 155          interacti+ billing=9,cpu=1,gres/gpu=1,mem=1G,node=1 billing=9,cpu=2,gres/gpu=1,mem=2G,node=1
> 155.interac+ interacti+                                          cpu=2,gres/gpu=1,mem=2G,node=1
>
> billing=9 seems correct to me, since I have 1 GPU allocated, which has the
> largest score of 9.6. However, sshare doesn't show anything in TRESRunMins
>
> $ sshare --format=Account,User,RawShares,FairShare,RawUsage,EffectvUsage,TRESRunMins%110
> Account              User       RawShares  FairShare  RawUsage    EffectvUsage  TRESRunMins
> -------------------- ---------- ---------- ---------- ----------- ------------- --------------------------------------------------------------------------------------------------------------
> root                                                  21589714    1.000000      cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0,gres/gpumem=0,gres/gpuutil=0
> abrol_group                     2000                  0           0.000000      cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0,gres/gpumem=0,gres/gpuutil=0
> luchko_group                    2000                  21589714    1.000000      cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0,gres/gpumem=0,gres/gpuutil=0
> luchko_group         tluchko    1          0.333333   21589714    1.000000      cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0,gres/gpu=0,gres/gpumem=0,gres/gpuutil=0
>
> Why is TRESRunMins all 0 but RawUsage is not for tluchko?
> I have checked and slurmdbd is running.
>
> Thank you,
>
> Tyler
>
> Sent with [Proton Mail](https://proton.me/) secure email.
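
For anyone checking the numbers in this thread, here is a minimal Python sketch of the arithmetic behind billing=9 and the expected 6000. It rests on three assumptions: that with PriorityFlags=MAX_TRES a job is billed on its largest weighted TRES, that sacct shows that value truncated to an integer, and that undecayed usage is billing multiplied by elapsed seconds. The function names are made up for illustration, and the memory request of the 100-CPU jobs is not stated above, so it is set to 0 here.

```python
# Weights copied from TRESBillingWeights="CPU=1.0,MEM=0.125G,GRES/gpu=9.6".
WEIGHTS = {"cpu": 1.0, "mem_gb": 0.125, "gpu": 9.6}


def max_tres_billing(cpus, mem_gb, gpus, weights=WEIGHTS):
    """Billing under the MAX_TRES assumption: the largest weighted TRES wins."""
    weighted = {
        "cpu": cpus * weights["cpu"],
        "mem_gb": mem_gb * weights["mem_gb"],
        "gpu": gpus * weights["gpu"],
    }
    return max(weighted.values())


def expected_raw_usage(billing, runtime_seconds):
    """Undecayed estimate (an assumption): billing units times elapsed seconds."""
    return billing * runtime_seconds


# Interactive GPU job from sacct: AllocTRES cpu=2, mem=2G, gres/gpu=1.
gpu_job = max_tres_billing(cpus=2, mem_gb=2, gpus=1)

# 100-CPU "sleep 60" job; memory is unknown, so 0 is a placeholder.
cpu_job = max_tres_billing(cpus=100, mem_gb=0, gpus=0)

print(int(gpu_job))                     # 9
print(expected_raw_usage(cpu_job, 60))  # 6000.0
```

This only reproduces the expected values; it does not account for the observed 3100 and 2800.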

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com