On 2018-09-07 18:53, Mike Cammilleri wrote:
Hi everyone,
I'm getting this error lately for everyone's jobs, which results in memory not
being constrained via the cgroups plugin.
slurmstepd: error: task/cgroup: unable to add task[pid=21681] to memory cg
'(null)'
slurmstepd: error: jobacct_gat
What do you have the OverSubscribe parameter set to on the partition you're using?
--
Brian D. Haymore
University of Utah
Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C
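
For reference, OverSubscribe is a per-partition setting in slurm.conf; a minimal sketch (the partition and node names here are made up, not taken from the poster's cluster):

    # Hypothetical partition definition in slurm.conf.
    # OverSubscribe controls whether jobs may share resources on a node:
    # valid values include NO, YES, EXCLUSIVE, and FORCE.
    PartitionName=batch Nodes=node[01-16] OverSubscribe=NO State=UP

The value currently in effect can be inspected with "scontrol show partition batch".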
I'm using the SLURM Elastic Compute feature, and in general it works great.
However, I've noticed some inefficiency in how SLURM decides the number of
nodes to create. Let's say I have the following configuration:
NodeName=compute-[1-100] CPUs=10 State=CLOUD
and there are non
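
For context, a State=CLOUD node line like the one above is normally paired with power-saving hooks in slurm.conf; a minimal sketch (the script paths and timing values below are assumptions, not the poster's actual settings):

    # Hypothetical slurm.conf fragment for elastic/cloud nodes.
    NodeName=compute-[1-100] CPUs=10 State=CLOUD
    # Script Slurm runs to provision nodes on demand (assumed path):
    ResumeProgram=/usr/local/sbin/slurm_resume
    # Script Slurm runs to tear idle nodes back down (assumed path):
    SuspendProgram=/usr/local/sbin/slurm_suspend
    # Seconds to wait for a node to boot, and idle time before suspend:
    ResumeTimeout=300
    SuspendTime=600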