It depends on a number of factors.

How do your workloads behave?  Do they do a lot of fork()?  I’ve had cases in 
the past where users submitted scripts which initially used quite a lot of 
memory and then used fork() or system() to execute subprocesses.  This of 
course means that temporarily (between the fork() and the exec() system calls) 
the job uses twice as much virtual memory, although this does not become real 
because the pages are copy-on-write.  Something similar happens if the code 
performs mmap() on large files.

Whether this affects how much swap space you need comes down to your sysctl 
settings for vm.overcommit_memory and vm.overcommit_ratio.

If you set vm.overcommit_memory to 2, then the OOM killer will never hit you 
(because malloc() will fail rather than commit virtual memory that isn't 
actually available), but cases like the above will tend to fail memory 
allocations unnecessarily, especially if you don't have any swap allocated.

If you set vm.overcommit_memory to 0 or 1, then you need less swap allocated 
(possibly even zero), but you run the risk of running out of memory and the 
OOM killer blowing things up left, right, and centre.

If you provide swap, it only causes a performance impact if the node actually 
runs out of physical memory and actively starts swapping.

So the bottom line is: I think it depends on what you want the failure mode to be.


  1.  If you want everything to always run in a very deterministic way at full 
speed, with failures at the precise moment the memory is exhausted, but with a 
risk that jobs fail if they're relying on overcommit (e.g. through 
fork()/exec()), then set vm.overcommit_memory=2 and provision no swap.
  2.  If you want high-throughput single-threaded stuff to run more smoothly 
(think: horrible genomics perl and python scripts, etc.), then set 
vm.overcommit_memory=0 and add some swap.  You'll probably get higher 
throughput, but things may blow up slightly unpredictably from time to time 
when nodes run out of memory.

I now call on someone who understands cgroups properly to explain how this 
changes when cgroups are in play, because I’m not sure I understand that!

Tim


--
Tim Cutts
Scientific Computing Platform Lead
AstraZeneca


From: John Joseph via slurm-users <slurm-users@lists.schedmd.com>
Date: Monday, 4 March 2024 at 07:06
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Is SWAP memory mandatory for SLURM
Dear All,
Good morning
I do have a 4 node SLURM instance up and running.
I would like to know: if I disable the SWAP memory, will it affect the SLURM 
performance?
Is SWAP a mandatory requirement? Each of my nodes has plenty of RAM; if my 
physical RAM is large enough, is there any need for SWAP?
thanks
Joseph John
