Hi Alastair,
I was holding off replying in the hope someone would have a good answer. In
lieu of that, here’s my partial answer:
When I looked into reporting per-user and per-group qos values a few months
ago, I discovered that SLURM reports the information via this command:
scontrol -o show assoc_mgr flags=qos
I haven’t found any documentation explaining the format of that output. It
seems to be parsable, but I’m not sure whether the format will change in later
versions of SLURM. I’m using Perl regexps for the reporting I’m doing, but
here’s a grep-based example to extract per-group CPU limits, which works on
my setup:
scontrol -o show assoc_mgr flags=qos | grep QOS=dept1 | grep -o 'GrpTRES=[^ ]*' | grep -o 'cpu=[0-9]*'
That information is available to all SLURM users. But given the different
contexts a qos can be used in, I’m not sure how you might be able to limit
reporting only to users who are permitted to use a specific qos.
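Building on the grep pipeline above, here is a rough sketch that also reports how much of a per-group CPU limit is in use. It assumes that each GrpTRES entry in the one-line output has the form cpu=LIMIT(USED) and that each record starts with QOS=name(id). Both are assumptions about an undocumented format, and the function name qos_cpu_free is made up for illustration.

```shell
# Sketch only: reads `scontrol -o show assoc_mgr flags=qos` output on stdin.
# Assumption: GrpTRES entries look like cpu=LIMIT(USED); entries with no
# numeric limit (e.g. cpu=N(0)) do not match and are silently skipped.
qos_cpu_free() {
    # $1 = QOS name; the trailing "(" stops dept1 from also matching dept10
    grep "QOS=$1(" |
        grep -o 'GrpTRES=[^ ]*' |
        grep -o 'cpu=[0-9][0-9]*([0-9]*)' |
        sed 's/^cpu=\([0-9]*\)(\([0-9]*\))$/\1 \2/' |
        while read -r limit used; do
            echo "limit=$limit used=$used free=$((limit - used))"
        done
}
```

It would be used as: scontrol -o show assoc_mgr flags=qos | qos_cpu_free dept1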
And for completeness, here’s a similar method for extracting per-user gpu
limits:
scontrol -o show assoc_mgr flags=qos | grep QOS=myqosname | grep -o 'myusername([0-9]*)={[^}]*}' | grep -o 'MaxTRESPU=[^ ]*' | grep -o 'gres/gpu=[0-9]*([0-9]*)'
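The per-user pipeline above could be wrapped in a small function so that the QOS and user name become arguments. This is a sketch under the same assumptions about the undocumented output format: per-user blocks look like username(uid)={...} and contain MaxTRESPU=...,gres/gpu=LIMIT(USED). The function name qos_user_gpu is hypothetical. A user name that is a suffix of another user name would also match, so treat the result with care.

```shell
# Sketch only: reads `scontrol -o show assoc_mgr flags=qos` output on stdin.
# Assumption: per-user entries look like user(uid)={... MaxTRESPU=...,gres/gpu=LIMIT(USED)}
qos_user_gpu() {
    # $1 = QOS name, $2 = user name
    grep "QOS=$1(" |
        grep -o "$2([0-9]*)={[^}]*}" |
        grep -o 'MaxTRESPU=[^ ]*' |
        grep -o 'gres/gpu=[0-9][0-9]*([0-9]*)' |
        sed 's|^gres/gpu=\([0-9]*\)(\([0-9]*\))$|limit=\1 used=\2|'
}
```

It would be used as: scontrol -o show assoc_mgr flags=qos | qos_user_gpu myqosname myusername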
Regards,
Mike
From: Alastair Neil via slurm-users <[email protected]>
Sent: Tuesday, February 6, 2024 11:30 PM
To: [email protected]
Subject: [External] [slurm-users] Is there a way to list allocated/unallocated
resources defined in a QoS?
Slurm version 23.02.07
If I have a QoS defined with a set number of, say, GPU devices in its
GrpTRES, is there an easy way to generate a list of how much of the defined
quota is allocated or, conversely, unallocated?
e.g.:
Name|Priority|GraceTime|Preempt|PreemptExemptTime|PreemptMode|Flags|UsageThres|UsageFactor|GrpTRES|GrpTRESMins|GrpTRESRunMins|GrpJobs|GrpSubmit|GrpWall|MaxTRES|MaxTRESPerNode|MaxTRESMins|MaxWall|MaxTRESPU|MaxJobsPU|MaxSubmitPU|MaxTRESPA|MaxJobsPA|MaxSubmitPA|MinTRES|
normal|0|00:00:00|||cluster|||1.000000|||||||||||cpu=3000,gres/gpu=20|||||||
dept1|1|00:00:00|||cluster|||1.000000|cpu=256,gres/gpu:1g.10gb=16,gres/gpu:2g.20gb=8,gres/gpu:3g.40gb=8,gres/gpu:a100.80gb=8|||||||||||||||||
dept2|1|00:00:00|||cluster|||1.000000|cpu=256,gres/gpu:1g.10gb=0,gres/gpu:2g.20gb=0,gres/gpu:3g.40gb=0,gres/gpu:a100.80gb=16|||||||||||||||||
So the dept1 and dept2 QoSes are set on the same partition. How can a user with
access to one or the other see whether there are resources available in the
partition?