We also noticed the same thing with 21.08.5. In the 21.08 series
SchedMD changed the way they handle cgroups to set the stage for cgroups
v2 (see: https://slurm.schedmd.com/SLUG21/Roadmap.pdf). 21.08.5
introduced a bug fix which in turn caused mpirun to stop pinning
properly (particularly with older versions of MPI):
https://github.com/SchedMD/slurm/blob/slurm-21-08-5-1/NEWS
What we've recommended to users who have hit this is to switch from
mpirun to srun, and the situation clears up.
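For example (the module name and the binary are just placeholders
here), a job script that used mpirun could be switched over like this:

    #!/bin/bash
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=16
    #SBATCH --mem-per-cpu=4G

    module load openmpi    # placeholder for whatever MPI module you use

    # Before: the MPI runtime's own launcher starts and pins the ranks
    #mpirun -np $SLURM_NTASKS ./my_app input.dat

    # After: each rank is started as a Slurm task, inside its own cgroup
    # with the job's per-task CPU and memory limits
    srun ./my_app input.dat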
-Paul Edmon-
On 2/10/2022 8:59 AM, Ward Poelmans wrote:
Hi Paul,
On 10/02/2022 14:33, Paul Brunk wrote:
Now we see a problem in which the OOM killer is in some cases
predictably killing job steps that don't seem to deserve it. In some
cases these are job scripts and input files which ran fine before our
Slurm upgrade. More details follow, but that's the issue in a
nutshell.
I'm not sure if this is the case but it might help to keep in mind the
difference between mpirun and srun.
With srun you let Slurm create the tasks, with the appropriate
memory/CPU limits, and the MPI ranks run directly inside those tasks.
With mpirun you usually let your MPI distribution start one task per
node, which spawns the MPI manager, which in turn starts the actual
MPI program.
You might very well end up with different memory limits per process,
which could be the cause of your OOM issue, especially if not all MPI
ranks use the same amount of memory.
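A rough way to see the difference (assuming cgroup v1, the task/cgroup
plugin and an Open MPI style mpirun; paths and variable names may
differ on your setup) is to print the memory cgroup each launched
process ends up in:

    # Each srun-launched rank sits in its own step/task cgroup:
    srun grep memory /proc/self/cgroup

    # Processes started via mpirun inherit the cgroup of the step the
    # MPI daemons were spawned in, so they can see a different limit:
    mpirun -np $SLURM_NNODES grep memory /proc/self/cgroup

    # The limit actually enforced on the current process (cgroup v1):
    cat /sys/fs/cgroup/memory$(awk -F: '$2=="memory"{print $3}' /proc/self/cgroup)/memory.limit_in_bytes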
Ward