Hi Gérard,

Let's see if I understood this right. You have a user on the account dci, and you have put a GrpTRESMins limit (cpu=4100) on it. From the output it looks like the usage is accumulated on the QoS support. However, the limit is set on the association and not on the QoS:

> GrpTRESMins=cpu=N(4227)

You need to remove the limit from the association and put it on the QoS.

Hope that helps,
MAO
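Concretely, that would be something along these lines (a sketch only; the account and QoS names are taken from your output, and in sacctmgr a value of -1 clears a TRES limit):

    # Clear the GrpTRESMins limit on the association (account dci):
    sacctmgr modify account dci set GrpTRESMins=cpu=-1

    # Set the same limit on the QoS instead:
    sacctmgr modify qos support set GrpTRESMins=cpu=4100

After the change, scontrol show assoc_mgr should report the limit on the QOS line itself, i.e. GrpTRESMins=cpu=4100(4227) in the limit(usage) notation shown below.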
> On 30 Jun 2022, at 19:12, gerard....@cines.fr wrote:
>
> Hi Miguel,
>
> I finally found the time to test the QOS NoDecay configuration against the GrpTRESMins account limit.
>
> Here is my benchmark:
>
> 1) Initialize the benchmark configuration
>    - reset all RawUsage (on the QOS and on the account)
>    - set a limit on the account with GrpTRESMins
>    - run several jobs with a controlled elapsed CPU time on one QOS
>    - reset the account RawUsage
>    - set the account GrpTRESMins limit below the QOS RawUsage
>
> Here is the initial state before running the benchmark:
>
> toto@login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,rawusage
>   Account  User  GrpTRESRaw                                                       GrpTRESMins  RawUsage
>   -------  ----  ---------------------------------------------------------------  -----------  --------
>   dci            cpu=0,mem=0,energy=0,node=0,billing=0,fs/disk=0,vmem=0,pages=0   cpu=4100            0
>
> Account RawUsage = 0, GrpTRESMins cpu=4100.
>
> toto@login1:~/TEST$ scontrol -o show assoc_mgr | grep "^QOS" | grep support
> QOS=support(8) UsageRaw=253632.000000 GrpJobs=N(0) GrpJobsAccrue=N(0) GrpSubmitJobs=N(0) GrpWall=N(132.10)
> GrpTRES=cpu=N(0),mem=N(0),energy=N(0),node=2106(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0)
> GrpTRESMins=cpu=N(4227),mem=N(7926000),energy=N(0),node=N(132),billing=N(4227),fs/disk=N(0),vmem=N(0),pages=N(0)
> GrpTRESRunMins=cpu=N(0),mem=N(0),energy=N(0),node=N(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0)
> MaxWallPJ=1440 MaxTRESPJ=node=700 MaxTRESPN= MaxTRESMinsPJ= MinPrioThresh= MinTRESPJ= PreemptMode=OFF Priority=10
> Account Limits= dci={MaxJobsPA=N(0) MaxJobsAccruePA=N(0) MaxSubmitJobsPA=N(0) MaxTRESPA=cpu=N(0),mem=N(0),energy=N(0),node=N(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0)}
> User Limits= 1145={MaxJobsPU=N(0) MaxJobsAccruePU=N(0) MaxSubmitJobsPU=N(0) MaxTRESPU=cpu=N(0),mem=N(0),energy=N(0),node=2106(0),billing=N(0),fs/disk=N(0),vmem=N(0),pages=N(0)}
>
> QOS support RawUsage = 253632 s, i.e. 4227 min.
>
> QOS support RawUsage > GrpTRESMins, so if it works as expected Slurm should prevent any job from starting for this account.
>
> 2) Run the benchmark to check whether the GrpTRESMins limit is enforced against the QOS RawUsage
>
> toto@login1:~/TEST$ sbatch TRESMIN.slurm
> Submitted batch job 3687
>
> toto@login1:~/TEST$ squeue
>   JOBID  ADMIN_COMM  MIN_MEMOR  SUBMIT_TIME          PRIORITY  PARTITION  QOS      USER  STATE    TIME_LIMIT  TIME  NODES  REASON  START_TIME
>   3687   BDW28       60000M     2022-06-30T19:36:42  1100000   bdw28      support  toto  RUNNING  5:00        0:02  1      None    2022-06-30T19:36:42
>
> The job is running even though GrpTRESMins is below the QOS support RawUsage.
>
> Is there anything wrong with my control process that invalidates the result?
>
> Thanks,
> Gérard
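For step 1 above, the resets can be scripted with sacctmgr. A minimal sketch, assuming the account dci and the QOS support from this thread, and a sacctmgr recent enough to accept a RawUsage reset on a QOS (0 is the only value accepted for a usage reset):

    # Reset the accumulated usage on the association and on the QOS:
    sacctmgr modify account dci set RawUsage=0
    sacctmgr modify qos support set RawUsage=0

    # Then set the account-level limit below the QOS usage measured beforehand:
    sacctmgr modify account dci set GrpTRESMins=cpu=4100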
> From: "gerard gil" <gerard....@cines.fr>
> To: "Slurm-users" <slurm-users@lists.schedmd.com>
> Sent: Wednesday, 29 June 2022 19:13:56
> Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage
>
> Hi Miguel,
>
> > If I understood you correctly your goal was to limit the number of minutes each project can run. By associating each project to a Slurm account with a NoDecay QoS then you will have achieved your goal.
>
> Here is what I want to do:
>
> "All jobs submitted to an account, regardless of the QOS they use, have to be constrained to a number of minutes set by the limit associated with that account (and not with a QOS)."
>
> > Try a project with a very small limit and you will see that it won't run.
>
> I already tested the GrpTRESMins limit and confirmed that it works as expected.
> Then I saw the decay effect on GrpTRESRaw (which I first thought was the right metric to look at) and tried to find a way around it.
>
> It is really very important for me to trust it, so I need a deterministic test to prove it.
>
> I'm testing this GrpTRESMins limit with NoDecay set on the QOS, resetting all RawUsage (account and QOS) to be sure it works as I expect.
> I print the account GrpTRESRaw (in minutes) at the end of my test jobs, then set a new GrpTRESMins limit and see how it behaves.
>
> I'll report back with the results. I hope it works.
>
> > You don't have to add anything. Each QoS will accumulate its respective usage, i.e. the usage of all users on that account. Users can even be on different accounts (projects) and charge the respective project with the parameter --account on sbatch.
>
> If Slurm manages the limit this way, I would also like to obtain the current RawUsage for an account.
> Do you know how to get it?
>
> > The GrpTRESMins is always changed on the QoS with a command like:
> >
> > sacctmgr update qos where qos=... set GrpTRESMins=cpu=...
>
> That's right if you want to set a limit on a QOS.
> But I don't think the same limit value will also apply to all the other QOSes. And if I apply the same limit to every QOS, is my account limit then the sum of all the QOS limits?
>
> Actually I'm setting the limit on the account, using the command:
>
> sacctmgr modify account myaccount set grptresmins=cpu=60000 qos=...
>
> With this setting I saw that the limit is set on the account and not on the QOS:
> the sacctmgr show QOS command shows an empty GrpTRESMins field for all QOSes.
>
> Thanks again for your help.
> I hope I'm close to getting the answer to my issue.
>
> Best,
> Gérard
>
> From: "Miguel Oliveira" <miguel.olive...@uc.pt>
> To: "Slurm-users" <slurm-users@lists.schedmd.com>
> Sent: Wednesday, 29 June 2022 01:28:58
> Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage
>
> Hi Gérard,
>
> If I understood you correctly your goal was to limit the number of minutes each project can run. By associating each project to a Slurm account with a NoDecay QoS then you will have achieved your goal.
> Try a project with a very small limit and you will see that it won't run.
>
> You don't have to add anything. Each QoS will accumulate its respective usage, i.e. the usage of all users on that account. Users can even be on different accounts (projects) and charge the respective project with the parameter --account on sbatch.
> The GrpTRESMins is always changed on the QoS with a command like:
>
> sacctmgr update qos where qos=... set GrpTRESMins=cpu=...
>
> Hope that makes sense!
>
> Best,
>
> MAO
>
> On 28 Jun 2022, at 18:30, gerard....@cines.fr wrote:
>
> Hi Miguel,
>
> OK, I didn't know this command.
>
> I'm not sure I understand how it works with respect to my goal.
> I used the following command, inspired by the one you gave me, and I obtain a UsageRaw for each QOS:
>
> scontrol -o show assoc_mgr accounts=myaccount users=" "
>
> Do I have to sum up all the QOS RawUsage values to obtain the RawUsage of myaccount with NoDecay?
> If I set GrpTRESMins for an account and not for a QOS, does Slurm sum up these QOS RawUsage values to check whether the GrpTRESMins account limit is reached?
>
> Thanks again for your precious help.
>
> Gérard
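On the question of summing the per-QOS counters: a throwaway sketch that totals UsageRaw (in seconds) across all QOS lines of the assoc manager output, assuming the one-line format shown earlier in this thread:

    scontrol -o show assoc_mgr | awk '
        /^QOS/ { for (i = 1; i <= NF; i++)      # scan each field of a QOS line
                     if ($i ~ /^UsageRaw=/) {   # pick out the UsageRaw=... field
                         split($i, a, "=");
                         total += a[2]
                     } }
        END { print total }'

Note that this totals every QOS on the cluster; whether Slurm itself checks an account GrpTRESMins against such a sum, or only against the association counter, is exactly what the benchmark above set out to verify.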
> From: "Miguel Oliveira" <miguel.olive...@uc.pt>
> To: "Slurm-users" <slurm-users@lists.schedmd.com>
> Sent: Tuesday, 28 June 2022 17:23:18
> Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage
>
> Hi Gérard,
>
> The way you are checking is against the association, and as such it ought to be decreasing in order to be used appropriately by fairshare.
> The counter that does not decrease is on the QoS, not on the association. You can check it with:
>
> scontrol -o show assoc_mgr | grep "^QOS='+account+'"
>
> That ought to give you two numbers. The first is the limit, or N for no limit, and the second, in parentheses, is the usage.
>
> Hope that helps.
>
> Best,
>
> Miguel Afonso Oliveira
>
> On 28 Jun 2022, at 08:58, gerard....@cines.fr wrote:
>
> Hi Miguel,
>
> I modified my test configuration to evaluate the effect of NoDecay:
> I modified all QOSes, adding the NoDecay flag.
>
> toto@login1:~/TEST$ sacctmgr show QOS
> (empty columns omitted)
>
>   Name        Priority  PreemptMode  Flags    UsageFactor  GrpTRES    MaxTRES    MaxWall     MaxTRESPU
>   ----------  --------  -----------  -------  -----------  ---------  ---------  ----------  ---------
>   normal             0  cluster      NoDecay  1.000000
>   interactif        10  cluster      NoDecay  1.000000     node=50    node=22    1-00:00:00  node=50
>   petit              4  cluster      NoDecay  1.000000     node=1500  node=22    1-00:00:00  node=300
>   gros               6  cluster      NoDecay  1.000000     node=2106  node=700   1-00:00:00  node=700
>   court              8  cluster      NoDecay  1.000000     node=1100  node=100   02:00:00    node=300
>   long               4  cluster      NoDecay  1.000000     node=500   node=200   5-00:00:00  node=200
>   special           10  cluster      NoDecay  1.000000     node=2106  node=2106  5-00:00:00  node=2106
>   support           10  cluster      NoDecay  1.000000     node=2106  node=700   1-00:00:00  node=2106
>   visu              10  cluster      NoDecay  1.000000     node=4     node=700   06:00:00    node=4
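For completeness, the flag change shown in that table would have been done with something like this (a sketch; += appends to an existing flag list in sacctmgr):

    # Add the NoDecay flag to every QOS in one command:
    sacctmgr modify qos normal,interactif,petit,gros,court,long,special,support,visu set Flags+=NoDecay

Note that NoDecay exempts only the QOS UsageRaw; the association counters that sshare reports keep decaying according to PriorityDecayHalfLife, which is what the output below shows.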
> I submitted a bunch of jobs to check the effect of NoDecay, and I noticed that RawUsage, as well as GrpTRESRaw cpu, is still decreasing:
>
> toto@login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
>   Account  User  GrpTRESRaw                                                                      GrpTRESMins  RawUsage
>   -------  ----  ------------------------------------------------------------------------------  -----------  --------
>   dci            cpu=6932,mem=12998963,energy=0,node=216,billing=6932,fs/disk=0,vmem=0,pages=0   cpu=17150      415966
>
> toto@login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
>   Account  User  GrpTRESRaw                                                                      GrpTRESMins  RawUsage
>   -------  ----  ------------------------------------------------------------------------------  -----------  --------
>   dci            cpu=6931,mem=12995835,energy=0,node=216,billing=6931,fs/disk=0,vmem=0,pages=0   cpu=17150      415866
>
> toto@login1:~/TEST$ sshare -A dci -u " " -o account,user,GrpTRESRaw%80,GrpTRESMins,RawUsage
>   Account  User  GrpTRESRaw                                                                      GrpTRESMins  RawUsage
>   -------  ----  ------------------------------------------------------------------------------  -----------  --------
>   dci            cpu=6929,mem=12992708,energy=0,node=216,billing=6929,fs/disk=0,vmem=0,pages=0   cpu=17150      415766
>
> Is there something I forgot to do?
>
> Best,
> Gérard
>
> Regards,
> Gérard Gil
>
> Département Calcul Intensif
> Centre Informatique National de l'Enseignement Supérieur
> 950, rue de Saint Priest
> 34097 Montpellier CEDEX 5
> FRANCE
>
> tel: (+33) 4 67 14 14 14
> fax: (+33) 4 67 52 37 63
> web: http://www.cines.fr
>
> From: "Gérard Gil" <gerard....@cines.fr>
> To: "Slurm-users" <slurm-users@lists.schedmd.com>
> Cc: "slurm-users" <slurm-us...@schedmd.com>
> Sent: Friday, 24 June 2022 14:52:12
> Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage
>
> Hi Miguel,
>
> Good!
>
> I'll try these options on all existing QOSes and see if everything works as expected.
> I'll let you know the results.
>
> Thanks a lot.
>
> Best,
> Gérard
>
> ----- Original Message -----
> From: "Miguel Oliveira" <miguel.olive...@uc.pt>
> To: "Slurm-users" <slurm-users@lists.schedmd.com>
> Cc: "slurm-users" <slurm-us...@schedmd.com>
> Sent: Friday, 24 June 2022 14:07:16
> Subject: Re: [slurm-users] GrpTRESMins and GrpTRESRaw usage
>
> Hi Gérard,
>
> I believe so. All our accounts correspond to one project, and all have an associated QoS with NoDecay and DenyOnLimit. This is enough to restrict usage on each individual project.
> You only need these flags on the QoS. The association will carry on as usual, and fairshare will not be impacted.
>
> Hope that helps,
>
> Miguel Oliveira
>
> On 24 Jun 2022, at 12:56, gerard....@cines.fr wrote:
>
> Hi Miguel,
>
> > Why not? You can have multiple QoSs, and you have other techniques to change priorities according to your policies.
>
> Does this answer my question?
>
> "If all configured QOSes use NoDecay, can we take advantage of the FairShare priority with Decay while all jobs' GrpTRESRaw uses NoDecay?"
>
> Thanks
>
> Best,
>
> Gérard
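Background for that last question: the decay applied to association usage (and hence to fairshare) is governed by the priority settings in slurm.conf, along these lines (illustrative values, not taken from this site's configuration):

    PriorityType=priority/multifactor
    PriorityDecayHalfLife=7-0        # association usage halves every 7 days
    PriorityUsageResetPeriod=NONE    # never hard-reset the accumulated usage

A QOS flagged NoDecay keeps its UsageRaw exempt from this half-life, so fairshare can keep decaying on the associations while the per-QOS counters remain a stable basis for GrpTRESMins enforcement, which is exactly the combination asked about above.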