Sergey Koposov <skopo...@cmu.edu> writes:

> The trick is that my code uses memory mapping (i.e. mmap) of one
> single large file (~12 Gb) in each thread on each node.
> With this technique in the past, despite the fact the file is
> (read-only) mmaped in say 16 threads, the actual memory footprint was
> still ~12 Gb.
> However, when I now do this in slurm, it thinks that each thread (or
> process) takes 12 Gb and kills my processes.
We've seen this too (at least with older versions of Slurm; I haven't checked lately).

Our way around it was to set JobAcctGatherParams=NoOverMemoryKill and use the cgroup task plugin (TaskPlugin=task/cgroup). The cgroup plugin will kill jobs if they exceed their limits (provided you have set up cgroup.conf to do it), but it does not have the same problem of counting shared memory segments and mmap'ed files once for each thread or process. NoOverMemoryKill tells Slurm itself not to kill the job, but to leave that to the TaskPlugin.
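For reference, a minimal sketch of the relevant settings. Parameter names are as documented in the slurm.conf and cgroup.conf man pages; exact behaviour and defaults vary between Slurm versions, so check the documentation for your release:

  # slurm.conf (fragment)
  JobAcctGatherParams=NoOverMemoryKill   # accounting plugin samples usage but does not kill jobs
  TaskPlugin=task/cgroup                 # enforce memory limits through cgroups instead

  # cgroup.conf (fragment)
  ConstrainRAMSpace=yes                  # have the cgroup enforce the job's memory limit

With this setup, the kernel's cgroup accounting decides when a job is over its limit, and the pages of a shared read-only mapping should only be charged once to the job's cgroup, rather than once per process as in the polled per-process accounting numbers.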
-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo