led puppet overnight, jobs running longer than 30 minutes are
completing, and cgroups are persisting, whereas before that, they were not.
--nate
On Mon, Apr 30, 2018 at 5:47 PM, Andy Georges wrote:
>
>
> > On 30 Apr 2018, at 22:37, Nate Coraor wrote:
> >
> > Hi Shawn,
>
On Mon, Apr 30, 2018 at 4:37 PM, Nate Coraor wrote:
Hi Shawn,
I'm wondering if you're still seeing this. I've recently enabled
task/cgroup on 17.11.5 running on CentOS 7 and just discovered that jobs
are escaping their cgroups. For me this is resulting in a lot of jobs
ending in OUT_OF_MEMORY that shouldn't, because it appears slurmd thinks
the oom