
The problem arises when the login nodes (or submission hosts) have different 
ulimits from the compute nodes, for example because the submission hosts are 
VMs rather than physical servers.  By default Slurm propagates the ulimits 
from the submission host to the job's compute node, which can result in 
different settings being applied.  If the login nodes have the same ulimit 
settings as the compute nodes then you may not see a difference.
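
A quick way to check whether this affects a given cluster is to compare the 
soft limits on the submission host with those seen inside a job step.  The 
following is only a rough sketch: show_limits is a hypothetical helper name, 
the three limits are just the ones discussed in this thread, and the srun 
part is skipped when srun is not installed.

```shell
#!/usr/bin/env bash
# show_limits is a hypothetical helper: print a few soft limits of this shell.
show_limits() {
    printf 'nofile=%s\n'  "$(ulimit -S -n)"   # max open file descriptors
    printf 'memlock=%s\n' "$(ulimit -S -l)"   # max locked memory (KiB)
    printf 'stack=%s\n'   "$(ulimit -S -s)"   # max stack size (KiB)
}

echo "== submission host =="
show_limits

# Repeat the same checks inside a single-task job step, if Slurm is available.
if command -v srun >/dev/null 2>&1; then
    echo "== compute node (via srun) =="
    srun --ntasks=1 bash -c \
        'printf "nofile=%s\nmemlock=%s\nstack=%s\n" "$(ulimit -S -n)" "$(ulimit -S -l)" "$(ulimit -S -s)"'
fi
```

If the two sets of values match the submission host rather than what is 
configured on the compute node, the limits are most likely being propagated.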

We happened to see a difference due to moving to a virtualised login node 
infrastructure which has slightly different settings applied.

Does that make sense?

I also missed that setting in slurm.conf so good to know it is possible to 
change the default behaviour.
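
For reference, a minimal sketch of what that could look like in slurm.conf.  
The choice of limits to exclude here is an assumption, picked to match the 
ones mentioned in this thread, and as far as I understand only one of 
PropagateResourceLimits and PropagateResourceLimitsExcept may be set:

```
# slurm.conf (fragment, illustrative only)
# Propagate everything from the submission host except these limits,
# so the compute node's own values apply for them.
PropagateResourceLimitsExcept=MEMLOCK,NOFILE,STACK
```

The daemons need to pick up the changed configuration before it takes effect.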


From: Patryk Bełzak via slurm-users <slurm-users@lists.schedmd.com>
Date: Friday, 17 May 2024 at 10:15
To: Dj Merrill <sl...@deej.net>
Cc: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Re: srun weirdness


I wonder where this problem comes from. Perhaps I am missing something, but 
we have never had such issues with limits, since we set them on the worker 
nodes in /etc/security/limits.d/99-cluster.conf:

*       soft    memlock 4086160 #Allow more Memory Locks for MPI
*       hard    memlock 4086160 #Allow more Memory Locks for MPI
*       soft    nofile  1048576 #Increase the Number of File Descriptors
*       hard    nofile  1048576 #Increase the Number of File Descriptors
*       soft    stack   unlimited       #Set soft to hard limit
*       soft    core    4194304 #Allow Core Files

This sets all the limits we want without any problems, and there is no need to 
pass extra arguments to Slurm commands or modify the config file.


On 24/05/15 02:26, Dj Merrill via slurm-users wrote:
> I completely missed that, thank you!
> -Dj
> Laura Hild via slurm-users wrote:
> > PropagateResourceLimitsExcept won't do it?
> Sarlo, Jeffrey S wrote:
> > You might look at the PropagateResourceLimits and 
> > PropagateResourceLimitsExcept settings in slurm.conf


> --
> slurm-users mailing list -- slurm-users@lists.schedmd.com
> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com