The trick, I think (and Guillaume can certainly correct me), is that the aim is 
to let the user run as many jobs of up to 200G of memory as they want, so long 
as they do not consume more than 200G on any single node.  So they could run 
ten 200G jobs on ten different nodes.  The memory limit isn't per user; it's 
per user per node.  I think the QOS limit you created below works as an OVERALL 
limit for the user, but doesn't provide per-node limiting.

Rob


________________________________
From: Carsten Beyer via slurm-users <slurm-users@lists.schedmd.com>
Sent: Wednesday, September 25, 2024 7:27 AM
To: Guillaume COCHARD <guillaume.coch...@cc.in2p3.fr>
Cc: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Re: Max TRES per user and node


Hi Guillaume,


As Rob already mentioned, this could be a way for you (the partition was just 
created temporarily online for testing). You could also add your MaxTRES=node=1 
for more restrictions. We do something similar with QOS to restrict the number 
of CPUs per user in certain partitions.


sacctmgr create qos name=maxtrespu200G maxtrespu=mem=200G flags=denyonlimit


scontrol create partition=testtres qos=maxtrespu200g maxtime=08:00:00 \
    nodes=lt[10000-10003] DefMemPerCPU=940 MaxMemPerCPU=940 OverSubscribe=NO


That results in:


4 jobs with 100G each:

---
[root@levantetest ~]# squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               862  testtres hostname  xxxxxxx PD       0:00      1 (QOSMaxMemoryPerUser)
               861  testtres hostname  xxxxxxx PD       0:00      1 (QOSMaxMemoryPerUser)
               860  testtres hostname  xxxxxxx  R       0:15      1 lt10000
               859  testtres hostname  xxxxxxx  R       0:22      1 lt10000


6 jobs with 50G each:

---
[k202068@levantetest ~]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
               876  testtres hostname  xxxxxxx PD       0:00      1 (QOSMaxMemoryPerUser)
               875  testtres hostname  xxxxxxx PD       0:00      1 (QOSMaxMemoryPerUser)
               874  testtres hostname  xxxxxxx  R       9:09      1 lt10000
               873  testtres hostname  xxxxxxx  R       9:15      1 lt10000
               872  testtres hostname  xxxxxxx  R       9:22      1 lt10000
               871  testtres hostname  xxxxxxx  R       9:26      1 lt10000
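
To illustrate why the counts above come out this way, here is a minimal Python 
sketch of an overall per-user memory cap. It is a toy model, not Slurm's 
scheduler: the function name start_under_user_cap and the greedy 
submission-order policy are assumptions made for the example. Only the 200G cap 
and the job sizes are taken from the test above.

```python
def start_under_user_cap(job_mem_gb, cap_gb):
    """Toy model of a QOS-wide per-user memory cap (hypothetical helper).

    Jobs are considered in submission order; a job starts only if the
    user's total running memory stays within cap_gb. Nodes are ignored
    entirely, which is the point: the cap is global for the user.
    """
    running, pending = [], []
    used_gb = 0
    for job_index, mem_gb in enumerate(job_mem_gb):
        if used_gb + mem_gb <= cap_gb:
            running.append(job_index)
            used_gb += mem_gb
        else:
            pending.append(job_index)
    return running, pending


if __name__ == "__main__":
    # 4 jobs of 100G under a 200G cap: 2 running, 2 pending
    running, pending = start_under_user_cap([100] * 4, 200)
    print(len(running), "running,", len(pending), "pending")

    # 6 jobs of 50G under a 200G cap: 4 running, 2 pending
    running, pending = start_under_user_cap([50] * 6, 200)
    print(len(running), "running,", len(pending), "pending")
```

In this model the number of running jobs depends only on the user's total 
memory, no matter how many nodes are free, which matches the 
QOSMaxMemoryPerUser pending reason shown above.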


Best regards,
Carsten


--
Carsten Beyer
Systems Department

Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45a * D-20146 Hamburg * Germany

Phone:  +49 40 460094-221
Fax:    +49 40 460094-270
Email:  be...@dkrz.de
URL:    http://www.dkrz.de

Managing Director: Prof. Dr. Thomas Ludwig
Registered office: Hamburg
Amtsgericht Hamburg HRB 39784




On 24.09.24 at 16:58, Guillaume COCHARD via slurm-users wrote:
> "So if they submit a 2nd job, that job can start but will have to go onto 
> another node, and will again be restricted to 200G?  So they can start as 
> many jobs as there are nodes, and each job will be restricted to using 1 node 
> and 200G of memory?"

Yes, that's it. We already have MaxNodes=1, so a job can't be spread across 
multiple nodes.

To be more precise, the limit should be per user and not per job. To illustrate, 
let's imagine we have 3 empty nodes and a 200G/user/node limit. If a user 
submits 10 jobs, each requesting 100G of memory, there should be 2 jobs running 
on each node and 4 jobs pending.
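
The desired behaviour in this example can be sketched in a few lines of Python. 
This is only a model of the requested policy for a single user, not something 
Slurm provides: the function name place_under_per_node_cap, the first-fit 
placement order, and the assumption that every node has enough physical memory 
are all illustrative choices.

```python
def place_under_per_node_cap(job_mem_gb, n_nodes, cap_gb):
    """Toy model of a per-user, per-node memory cap (hypothetical helper).

    Each job runs on exactly one node (as with MaxNodes=1). A job is
    placed on the first node where this user's memory would stay within
    cap_gb; if no such node exists, the job stays pending.
    """
    used_gb = [0] * n_nodes   # this user's memory per node
    placement = {}            # job index -> node index
    pending = []
    for job_index, mem_gb in enumerate(job_mem_gb):
        for node in range(n_nodes):
            if used_gb[node] + mem_gb <= cap_gb:
                used_gb[node] += mem_gb
                placement[job_index] = node
                break
        else:
            # No node can take this job without exceeding the cap.
            pending.append(job_index)
    return placement, pending


if __name__ == "__main__":
    # 10 jobs of 100G, 3 empty nodes, 200G per user per node
    placement, pending = place_under_per_node_cap([100] * 10, 3, 200)
    for node in range(3):
        jobs_on_node = [j for j, n in placement.items() if n == node]
        print("node", node, "runs", len(jobs_on_node), "jobs")
    print(len(pending), "jobs pending")
```

With these inputs the model places 2 jobs on each of the 3 nodes and leaves 4 
pending, which is the outcome described above.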

Guillaume

________________________________
From: "Groner, Rob" <rug...@psu.edu>
To: "Guillaume COCHARD" <guillaume.coch...@cc.in2p3.fr>
Cc: slurm-users@lists.schedmd.com
Sent: Tuesday, September 24, 2024 16:37:34
Subject: Re: Max TRES per user and node

Ah, sorry, I didn't catch that from your first post (though you did say it).

So, you are trying to limit the user to no more than 200G of memory on a single 
node?  So if they submit a 2nd job, that job can start but will have to go onto 
another node, and will again be restricted to 200G?  So they can start as many 
jobs as there are nodes, and each job will be restricted to using 1 node and 
200G of memory? Or can they submit a job asking for 4 nodes, where they are 
limited to 200G on each node?  Or are they limited to a single node, no matter 
how many jobs?

Rob

________________________________
From: Guillaume COCHARD <guillaume.coch...@cc.in2p3.fr>
Sent: Tuesday, September 24, 2024 10:09 AM
To: Groner, Rob <rug...@psu.edu>
Cc: slurm-users@lists.schedmd.com
Subject: Re: Max TRES per user and node

Thank you for your answer.

To test it I tried:
sacctmgr update qos normal set maxtresperuser=cpu=2
# Then in slurm.conf
PartitionName=test […] qos=normal

But then, if I submit several 1-cpu jobs, only two start and the others stay 
pending, even though I have several nodes available. So it seems that 
MaxTRESPerUser is a QoS-wide limit: it doesn't limit TRES per user and per 
node, but rather per user and per QoS (or rather per partition, since I applied 
the QoS to the partition). Did I miss something?

Thanks again,
Guillaume

________________________________
From: "Groner, Rob" <rug...@psu.edu>
To: slurm-users@lists.schedmd.com, "Guillaume COCHARD" <guillaume.coch...@cc.in2p3.fr>
Sent: Tuesday, September 24, 2024 15:45:08
Subject: Re: Max TRES per user and node

You have the right idea.

On that same page, you'll find MaxTRESPerUser, as a QOS parameter.

You can create a QOS with the restrictions you'd like, and then in the 
partition definition, you give it that QOS.  The QOS will then apply its 
restrictions to any jobs that use that partition.

Rob
________________________________
From: Guillaume COCHARD via slurm-users <slurm-users@lists.schedmd.com>
Sent: Tuesday, September 24, 2024 9:30 AM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Max TRES per user and node

Hello,

We are looking for a method to limit the TRES used by each user on a per-node 
basis. For example, we would like to limit the total memory allocation of jobs 
from a user to 200G per node.

There is MaxTRESPerNode 
(https://slurm.schedmd.com/sacctmgr.html#OPT_MaxTRESPerNode), 
but unfortunately, this is a per-job limit, not a per-user one.

Ideally, we would like to apply this limit on partitions and/or QoS. Does 
anyone know if this is possible and how to achieve it?

Thank you,

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


