Slurm will allocate more CPUs to cover the memory requirement rather than reject the job. Use sacct's query fields to compare the requested resources (ReqTRES) against what was actually allocated (AllocTRES); there is also a sketch at the bottom of this message for running the same check against your own configuration.
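As a rough sanity check of the arithmetic (this is just my reading of the man page excerpt quoted further down, applied to the numbers from the example session that follows):

$ echo $(( (4000 + 1920 - 1) / 1920 ))   # CPUs needed: ceil(4000M requested / 1920M MaxMemPerCPU)
3
$ echo $(( (4000 + 3 - 1) / 3 ))         # adjusted per-CPU memory: ceil(4000M / 3 CPUs)
1334
$ echo $(( 3 * 1334 ))                   # total memory Slurm ends up allocating
4002

which matches the cpu=3,mem=4002M that sacct reports below: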
$ scontrol show part normal_q | grep MaxMem
   DefMemPerCPU=1920 MaxMemPerCPU=1920

$ srun -n 1 --mem-per-cpu=4000 --partition=normal_q --account=arcadm hostname
srun: job 1577313 queued and waiting for resources
srun: job 1577313 has been allocated resources
tc095

$ sacct -j 1577313 -o jobid,reqtres%35,alloctres%35
JobID                                    ReqTRES                           AllocTRES
------------ ----------------------------------- -----------------------------------
1577313         billing=1,cpu=1,mem=4000M,node=1    billing=3,cpu=3,mem=4002M,node=1
1577313.ext+                                         billing=3,cpu=3,mem=4002M,node=1
1577313.0                                                     cpu=3,mem=4002M,node=1

From the Slurm manuals (e.g. man srun):

       --mem-per-cpu=<size>[units]
              Minimum memory required per allocated CPU. ... Note that if the
              job's --mem-per-cpu value exceeds the configured MaxMemPerCPU,
              then the user's limit will be treated as a memory limit per task

On Mon, Jul 24, 2023 at 9:32 AM Groner, Rob <rug...@psu.edu> wrote:

> I'm not sure I can help with the rest, but the EnforcePartLimits setting
> will only reject a job at submission time that exceeds *partition*
> limits, not overall cluster limits. I don't see anything, offhand, in the
> interactive partition definition that is exceeded by your request for 4
> GB/CPU.
>
> Rob
>
> ------------------------------
> *From:* slurm-users on behalf of Angel de Vicente
> *Sent:* Monday, July 24, 2023 7:20 AM
> *To:* Slurm User Community List
> *Subject:* [slurm-users] MaxMemPerCPU not enforced?
>
> Hello,
>
> I'm trying to get Slurm to control the memory used per CPU, but it does
> not seem to enforce the MaxMemPerCPU option in slurm.conf
>
> This is running in Ubuntu 22.04 (cgroups v2), Slurm 23.02.3.
>
> Relevant configuration options:
>
> ,----cgroup.conf
> | AllowedRAMSpace=100
> | ConstrainCores=yes
> | ConstrainRAMSpace=yes
> | ConstrainSwapSpace=yes
> | AllowedSwapSpace=0
> `----
>
> ,----slurm.conf
> | TaskPlugin=task/affinity,task/cgroup
> | PrologFlags=X11
> |
> | SelectType=select/cons_res
> | SelectTypeParameters=CR_CPU_Memory,CR_CORE_DEFAULT_DIST_BLOCK
> | MaxMemPerCPU=500
> | DefMemPerCPU=200
> |
> | JobAcctGatherType=jobacct_gather/linux
> |
> | EnforcePartLimits=ALL
> |
> | NodeName=xxx RealMemory=257756 Sockets=4 CoresPerSocket=8 ThreadsPerCore=1 Weight=1
> |
> | PartitionName=batch Nodes=duna State=UP Default=YES MaxTime=2-00:00:00 MaxCPUsPerNode=32 OverSubscribe=FORCE:1
> | PartitionName=interactive Nodes=duna State=UP Default=NO MaxTime=08:00:00 MaxCPUsPerNode=32 OverSubscribe=FORCE:2
> `----
>
> I can ask for an interactive session with 4GB/CPU (I would have thought
> that "EnforcePartLimits=ALL" would stop me from doing that), and once
> I'm in the interactive session I can execute a 3GB test code without any
> issues (I can see with htop that the process does indeed use a RES size
> of 3GB at 100% CPU use). Any idea what could be the problem or how to
> start debugging this?
>
> ,----
> | [angelv@xxx test]$ sinter -n 1 --mem-per-cpu=4000
> | salloc: Granted job allocation 127544
> | salloc: Nodes xxx are ready for job
> |
> | (sinter) [angelv@xxx test]$ stress -m 1 -t 600 --vm-keep --vm-bytes 3G
> | stress -m 1 -t 600 --vm-keep --vm-bytes 3G
> | stress: info: [1772392] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> `----
>
> Many thanks,
> --
> Ángel de Vicente
> Research Software Engineer (Supercomputing and BigData)
> Tel.: +34 922-605-747
> Web.: http://research.iac.es/proyecto/polmag/
>
> GPG: 0x8BDC390B69033F52
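PS: if the same mechanism is at work on your cluster, then with MaxMemPerCPU=500 your --mem-per-cpu=4000 request should simply be turned into an 8-CPU allocation (4000 / 500) rather than being rejected. A quick way to check, reusing your sinter wrapper and the sacct call from above (substitute whatever job id salloc reports):

$ sinter -n 1 --mem-per-cpu=4000
$ sacct -j <jobid> -o jobid,reqtres%35,alloctres%35

If AllocTRES comes back with cpu=8, the limit is being applied after all; it just shows up as extra CPUs instead of a submission error.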