Re: [slurm-users] Wedged nodes from cgroups, OOM killer, and D state process

Christopher Benjamin Coffey Fri, 07 Dec 2018 10:07:55 -0800

Is this parameter applied to each cgroup? Or just the system itself? Seems like 
just the system itself.


—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
 

On 12/4/18, 10:13 AM, "slurm-users on behalf of Christopher Benjamin Coffey" 
<slurm-users-boun...@lists.schedmd.com on behalf of chris.cof...@nau.edu> wrote:

    Interesting! I'll have a look - thanks!
    
    —
    Christopher Coffey
    High-Performance Computing
    Northern Arizona University
    928-523-1167
     
    
    On 11/30/18, 1:41 AM, "slurm-users on behalf of John Hearns" 
<slurm-users-boun...@lists.schedmd.com on behalf of hear...@googlemail.com> 
wrote:
    
        Chris, I have delved deep into the OOM killer code and interaction with 
cpusets in the past (*).
        That experience is not really relevant!
        However, I always recommend looking at this sysctl parameter: min_free_kbytes
        
        https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/s-memory-tunables
        
        I think of this as the 'wriggle room' the system has when it is playing 
Tetris with memory pages.
        In the past the min_free_kbytes value was by default ridiculously low.
        Look at what value you have and consider increasing it by a big factor.
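
A minimal sketch of how the current value could be checked against installed memory. It assumes a Linux node exposing /proc/sys/vm/min_free_kbytes and /proc/meminfo; the function names are made up for illustration, and no particular target ratio is implied by this thread.

```python
def parse_meminfo_kb(meminfo_text, field="MemTotal"):
    """Return the value in kB of one field from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            # Lines look like "MemTotal:       65842512 kB"
            return int(rest.split()[0])
    raise KeyError(field)


def min_free_fraction(min_free_kb, mem_total_kb):
    """Fraction of total memory reserved by vm.min_free_kbytes."""
    return min_free_kb / mem_total_kb


def read_node_values():
    """Read the live values from /proc (Linux only; hypothetical helper)."""
    with open("/proc/sys/vm/min_free_kbytes") as f:
        min_free_kb = int(f.read().strip())
    with open("/proc/meminfo") as f:
        mem_total_kb = parse_meminfo_kb(f.read())
    return min_free_kb, mem_total_kb
```

On a node, `min_free_fraction(*read_node_values())` gives the reserved share of RAM, which makes it easier to judge whether the default is as low as described above.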
        
        (*) A long time ago, when managing a large NUMA machine, I had an
        application which would have a memory leak and trigger the OOM killer.
        The application was being run in a cgroup with cpuset and memory
        location limits. Once the OOM killer kicked in I saw that the system
        went 'off the air' for two minutes, which was quite serious. I finally
        traced it to the OOM killer inspecting every page in memory on the
        system, independent of the cgroup, to see if it had been touched by
        this process.
        
        On Fri, 30 Nov 2018 at 09:31, Ole Holm Nielsen 
<ole.h.niel...@fysik.dtu.dk> wrote:
        
        
        On 29-11-2018 19:27, Christopher Benjamin Coffey wrote:
        > We've been noticing an issue with nodes from time to time that become
        > "wedged", or unusable. This is a state where ps and w hang. We've been
        > looking into this for a while when we get time and finally put some
        > more effort into it yesterday. We came across this blog which
        > describes almost the exact scenario:
        > 
        > 
        
        > https://rachelbythebay.com/w/2014/10/27/ps/
        > 
        > It has nothing to do with Slurm, but it does have to do with cgroups
        > which we have enabled. It appears that processes that have hit their
        > ceiling for memory and should be killed by oom-killer, and are in D
        > state at the same time, cause the system to become wedged. For each
        > node wedged, I've found a job out in:
        > 
        > /cgroup/memory/slurm/uid_3665/job_15363106/step_batch
        > - memory.max_usage_in_bytes
        > - memory.limit_in_bytes
        > 
        > The two files hold the same byte value, which I'd think would make
        > the job a candidate for oom-killer. But memory.oom_control says:
        > 
        > oom_kill_disable 0
        > under_oom 0
        > 
        > My feeling is that the process was in D state, the oom-killer tried
        > to be invoked, but then didn't and the system became wedged.
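
The check described above (peak usage equal to the limit, with oom_control showing no kill in progress) could be scripted roughly as follows. This is only a sketch assuming the cgroup v1 memory controller files named in the message; the function names and the returned dictionary layout are invented for illustration.

```python
import os


def parse_oom_control(text):
    """Parse memory.oom_control text into a dict, e.g. {'under_oom': 0}."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            result[parts[0]] = int(parts[1])
    return result


def at_memory_limit(max_usage_bytes, limit_bytes):
    """True when peak usage has reached the cgroup's memory limit."""
    return max_usage_bytes >= limit_bytes


def inspect_cgroup(path):
    """Summarise one cgroup v1 memory directory (hypothetical helper)."""
    def read(name):
        with open(os.path.join(path, name)) as f:
            return f.read()

    max_usage = int(read("memory.max_usage_in_bytes"))
    limit = int(read("memory.limit_in_bytes"))
    oom = parse_oom_control(read("memory.oom_control"))
    return {
        "at_limit": at_memory_limit(max_usage, limit),
        "oom_kill_disable": oom.get("oom_kill_disable"),
        "under_oom": oom.get("under_oom"),
    }
```

Calling `inspect_cgroup` on a step directory such as the one listed above would flag jobs that sit at their limit while `under_oom` is still 0, which is the combination seen on the wedged nodes.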
        > 
        > Has anyone run into this? If so, what's the fix? Apologies if this
        > has been discussed before, I haven't noticed it on the group.
        > 
        > I wonder if it’s a bug in the oom-killer? Maybe it's been patched in
        > a more recent kernel, but looking at the kernels in the 6.10 series
        > it doesn't look like a newer one would have a patch for an
        > oom-killer bug.
        > 
        > Our setup is:
        > 
        > Centos 6.10
        > 2.6.32-642.6.2.el6.x86_64
        > Slurm 17.11.12
        
        As far as I remember, cgroups underwent a major upgrade with
        RHEL/CentOS 7.  Maybe you should try to upgrade to CentOS 7.5 :-)
        
        /Ole
