...and you shouldn't be able to do this with a QoS (not in the way I think you want, anyway), as "grptresrunmins" applies to the aggregate of everything using the QoS.
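
To make the per-association route concrete, here is a minimal sketch rather than a tested recipe. It assumes the placeholder partition "foo" and account "bar", that each user already has an association for that partition and account, and that the usernames sit one per line in a file. The function name and the SACCTMGR variable are made up for illustration. By default it only prints the commands it would run.

```sh
#!/bin/sh
# Sketch: put a GrpTRESRunMins cap on each listed user's own association,
# so the limit is counted per user instead of across a shared QoS.
# Hypothetical names: set_user_runmins, SACCTMGR, partition "foo", account "bar".

set_user_runmins() {
    users_file=$1    # file with one username per line
    cpu_minutes=$2   # cap on running CPU-minutes for each association

    while IFS= read -r user; do
        # Skip blank lines in the user list
        [ -n "$user" ] || continue
        # Prints the command by default; set SACCTMGR=sacctmgr to apply it.
        # -i commits the change without the interactive confirmation.
        ${SACCTMGR:-echo sacctmgr} -i modify user where name="$user" \
            partition=foo account=bar \
            set GrpTRESRunMins=cpu="$cpu_minutes"
    done < "$users_file"
}
```

Calling `set_user_runmins limit_users 1440` prints one `sacctmgr` line per user for review; running it again with `SACCTMGR=sacctmgr` in the environment would apply the changes.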

On Thu, Dec 16, 2021 at 6:12 PM Fulcomer, Samuel <samuel_fulco...@brown.edu> wrote:

> I've not parsed your message very far, but...
>
> for i in `cat limit_users` ; do
>   sacctmgr modify user where user=$i partition=foo account=bar set grptresrunmins=cpu=Nlimit
> done
>
> On Thu, Dec 16, 2021 at 6:01 PM Ross Dickson <ross.dick...@ace-net.ca> wrote:
>
>> I would like to impose a time limit stricter than the partition limit on
>> a certain subset of users. I should be able to do this with a QOS, but I
>> can't get it to work. What am I missing?
>>
>> At https://slurm.schedmd.com/resource_limits.html it says,
>> "Slurm's hierarchical limits are enforced in the following order ...:
>>
>>   1. Partition QOS limit
>>   2. Job QOS limit
>>   3. User association
>>   4. Account association(s), ascending the hierarchy
>>   5. Root/Cluster association
>>   6. Partition limit
>>   7. None
>>
>> Note: If limits are defined at multiple points in this hierarchy, the
>> point in this list where the limit is first defined will be used."
>>
>> And there's a little more later about the Partition limit being an upper
>> bound on everything.
>>
>> This says to me that if:
>>   * there is a large time limit on a partition,
>>   * there is a smaller time limit on the job QOS, and
>>   * the partition has no associated QOS,
>> then the MaxWall on the Job QOS should have effect.
>>
>> But that's not what I observe. I've created a QOS 'nonpaying' with
>> MaxWall=1-0:0:0, and set MaxTime=7-0:0:0 on partition 'general'. I set the
>> association on user1 so that their job will get QOS 'nonpaying', then
>> submit a job with --time=7-0:0:0, and it runs:
>>
>> $ scontrol show partition general | egrep 'QoS|MaxTime'
>>    AllocNodes=ALL Default=YES QoS=N/A
>>    MaxNodes=UNLIMITED MaxTime=7-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
>> $ sacctmgr show qos nonpaying format=name,flags,maxwall
>>       Name                Flags     MaxWall
>> ---------- -------------------- -----------
>>  nonpaying                       1-00:00:00
>> $ scontrol show job 33 | egrep 'QOS|JobState|TimeLimit'
>>    Priority=4294901728 Nice=0 Account=acad1 QOS=nonpaying
>>    JobState=RUNNING Reason=None Dependency=(null)
>>    RunTime=00:00:40 TimeLimit=7-00:00:00 TimeMin=N/A
>> $ scontrol show config | grep AccountingStorageEnforce
>> AccountingStorageEnforce = associations,limits,qos
>>
>> Help!?
>>
>> --
>> Ross Dickson, Computational Research Consultant
>> ACENET -- Compute Canada -- Dalhousie University
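
For reference, the "first defined wins" rule quoted from resource_limits.html can be sketched in a few lines of shell. This models only the documented rule as quoted, not Slurm's actual implementation, and it leaves out the "partition limit is an upper bound" remark. The function name and the empty-string-means-undefined convention are assumptions made for the example.

```sh
#!/bin/sh
# Sketch of the documented precedence rule only; not Slurm's code.
# Arguments are the limits in hierarchy order: partition QOS, job QOS,
# user association, account association, root association, partition.
# An empty string stands for "not defined at this level" (our convention).

first_defined_limit() {
    for limit in "$@"; do
        if [ -n "$limit" ]; then
            printf '%s\n' "$limit"
            return 0
        fi
    done
    # Level 7 in the quoted list: no limit defined anywhere
    printf 'none\n'
}

# The case from the message: no partition QOS, MaxWall=1-00:00:00 on the
# job QOS, nothing on the associations, MaxTime=7-00:00:00 on the partition.
first_defined_limit "" "1-00:00:00" "" "" "" "7-00:00:00"
```

Under that reading the call prints 1-00:00:00, which is what the documentation leads one to expect and not what job 33 shows.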