Hello, I am using cgroups to track processes and limit memory. Occasionally a job will use too much memory, and instead of being killed it ends up in an unkillable state waiting on NFS I/O. There are no other signs of NFS trouble; in fact, other jobs (even on the same node) have no problem communicating with the same NFS server at the same time. I just get hung task errors for the one process that used too much memory.
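
For what it's worth, here is a minimal sketch of one way to spot the stuck task from the node and dump its kernel stack. The cgroup path is just a placeholder assuming a cgroup v1 memory controller layout (adjust for your setup), and reading /proc/<pid>/stack needs root:

#!/usr/bin/env python3
# Sketch: list tasks in a job's cgroup that are in uninterruptible
# sleep (D state) and print their kernel stacks.
import pathlib
import sys

# Placeholder path for a job's cgroup under the cgroup v1 memory
# controller; replace the UID and job ID with real values.
CGROUP = pathlib.Path("/sys/fs/cgroup/memory/slurm/uid_1000/job_12345")

def main() -> None:
    try:
        pids = (CGROUP / "cgroup.procs").read_text().split()
    except FileNotFoundError:
        sys.exit(f"cgroup not found: {CGROUP}")

    for pid in pids:
        proc = pathlib.Path("/proc") / pid
        try:
            # Third field of /proc/<pid>/stat is the process state;
            # 'D' means uninterruptible sleep (usually blocked on I/O).
            state = proc.joinpath("stat").read_text().rsplit(")", 1)[1].split()[0]
        except (FileNotFoundError, IndexError):
            continue  # process exited between listing and reading
        if state != "D":
            continue
        comm = proc.joinpath("comm").read_text().strip()
        print(f"PID {pid} ({comm}) is in D state; kernel stack:")
        try:
            print(proc.joinpath("stack").read_text())
        except PermissionError:
            print("  (need root to read /proc/<pid>/stack)")

if __name__ == "__main__":
    main()

In my case the stack for the stuck process shows it waiting on NFS I/O.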
Has anyone else run into this? Searching the mailing list archive I found some similar threads, but those seemed to concern installing Slurm itself on an NFSv4 mount rather than jobs simply using an NFSv4 mount. Any advice is greatly appreciated. Thanks, Brendan
