Re: [slurm-users] Jobs can grow in RAM usage surpassing MaxMemPerNode

2023-01-13 Thread Cristóbal Navarro
Many thanks Rodrigo and Daniel. Indeed I had misunderstood that part of Slurm, so thanks for clarifying this aspect; now it makes a lot of sense. Regarding the approach, I went with the cgroup.conf approach suggested by both of you. I will start doing some synthetic tests to make sure the job gets killed once it exceeds its memory limit.
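
A minimal sketch of such a synthetic test, assuming python3 is available on the compute node and that cgroup memory enforcement (ConstrainRAMSpace=yes) is in place; the job name and sizes are arbitrary placeholders:

    #!/bin/bash
    #SBATCH --job-name=mem-limit-test   # placeholder name, for illustration only
    #SBATCH --mem=1G                    # request far less than the test allocates
    #SBATCH --time=00:05:00

    # Allocate roughly 4 GB inside a job limited to 1 GB; with cgroup memory
    # enforcement the step should be OOM-killed almost immediately.
    python3 -c "x = bytearray(4 * 1024**3); import time; time.sleep(60)"

If enforcement is working, the job ends in an OUT_OF_MEMORY state instead of growing past its request.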

Re: [slurm-users] Jobs can grow in RAM usage surpassing MaxMemPerNode

2023-01-12 Thread Daniel Letai
Hello Cristóbal, I think you might have a slight misunderstanding of how Slurm works, which can cause this difference in expectation. MaxMemPerNode is there to allow the scheduler to plan job placement according to resources. It does not enforce the limit while the job is running; runtime enforcement has to be configured separately (for example through cgroups).
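
For context, a sketch of the slurm.conf settings this distinction rests on (only memory-related lines shown; everything except the MaxMemPerNode value reported in the thread is an assumed example):

    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory   # memory is treated as a consumable resource
    TaskPlugin=task/cgroup                # hands runtime limits over to cgroups
    MaxMemPerNode=532000                  # scheduling bound only, value from the thread

MaxMemPerNode caps what the scheduler will allocate on a node, while the task/cgroup plugin (together with a cgroup.conf like the one below) is what actually stops a running job from exceeding its request.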

Re: [slurm-users] Jobs can grow in RAM usage surpassing MaxMemPerNode

2023-01-11 Thread Rodrigo Santibáñez
Hi Cristóbal, I would guess you need to set up a cgroup.conf file:

    ###
    # Slurm cgroup support configuration file
    ###
    ConstrainRAMSpace=yes
    ConstrainSwapSpace=yes
    AllowedRAMSpace=100
    AllowedSwapSpace=0
    MaxRAMPercent=100
    MaxSwapPercent=0
    #ConstrainDevices=yes
    MemorySwappiness=0
    TaskAffinity=no
    Cgrou
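
One quick way to confirm the enforcement-related settings were actually picked up after restarting the daemons (a sketch; output formatting varies with Slurm version):

    # Check the slurm.conf parameters that cgroup enforcement depends on
    scontrol show config | grep -i -E 'TaskPlugin|SelectTypeParameters|MaxMemPerNode'

Remember that cgroup.conf has to be present on the compute nodes and slurmd restarted there for the new constraints to take effect.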

[slurm-users] Jobs can grow in RAM usage surpassing MaxMemPerNode

2023-01-11 Thread Cristóbal Navarro
Hi Slurm community, recently we found a small problem triggered by one of our jobs. We have *MaxMemPerNode*=*532000* set for our compute node in the slurm.conf file; however, we found out that a job that started with mem=65536 was able, after hours of execution, to grow its memory usage during the run, surpassing MaxMemPerNode.
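
As a side note on spotting this while the job is still running (a sketch; <jobid> is a placeholder, and job accounting via JobAcctGather must be enabled for sstat to report anything):

    # Poll the memory high-water mark of the running job's steps; MaxRSS
    # climbing past the --mem request is the behaviour described above.
    sstat --format=JobID,MaxRSS,MaxVMSize -j <jobid>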