[slurm-users] Elastic Compute
I'm using the SLURM Elastic Compute feature and it works great in general. However, I noticed some inefficiency in how SLURM decides on the number of nodes to create.

Let's say I have the following configuration and none of these nodes are up and running:

    NodeName=compute-[1-100] CPUs=10 State=CLOUD

Let's further say that I create 10 identical jobs and submit them at the same time using:

    sbatch --nodes=1 --ntasks-per-node=1

I expected SLURM to work out that 10 CPUs are required in total to serve all of the jobs and therefore create a single compute node. Instead, SLURM triggers the creation of one node per job, i.e., 10 nodes are created. When the first of these ten nodes is ready to accept jobs, SLURM assigns all 10 submitted jobs to that single node, while the other nine nodes that were created sit idle and are terminated again after a while.

I'm using "SelectType=select/cons_res" to schedule at the CPU level. Is there some knob which influences this behavior, or is it hard-coded?
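For reference, a minimal sketch of the kind of setup described above. The partition name, the SelectTypeParameters value, the resume/suspend script paths, and the test job are placeholders for illustration, not the exact configuration:

    # slurm.conf (excerpt) -- paths and partition name are examples only
    SelectType=select/cons_res
    SelectTypeParameters=CR_CPU
    NodeName=compute-[1-100] CPUs=10 State=CLOUD
    PartitionName=cloud Nodes=compute-[1-100] Default=YES State=UP
    ResumeProgram=/usr/local/sbin/node_startup.sh
    SuspendProgram=/usr/local/sbin/node_shutdown.sh

    # submit 10 identical single-task jobs at once
    for i in $(seq 1 10); do
        sbatch --nodes=1 --ntasks-per-node=1 --wrap="sleep 300"
    done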
Re: [slurm-users] Elastic Compute
What do you have the OverSubscribe parameter set to on the partition you're using?

--
Brian D. Haymore
University of Utah Center for High Performance Computing
155 South 1452 East RM 405
Salt Lake City, Ut 84112
Phone: 801-558-1150, Fax: 801-585-5366
http://bit.ly/1HO1N2C
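For reference, here is where that parameter lives and a quick way to check its current value (the partition name is just an example, adjust for your setup):

    # show the partition's current OverSubscribe setting
    scontrol show partition cloud | grep -o 'OverSubscribe=[^ ]*'

    # slurm.conf (excerpt): let several jobs share a node's resources
    PartitionName=cloud Nodes=compute-[1-100] OverSubscribe=YES State=UP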
Re: [slurm-users] can't create memory group (cgroup)
On 2018-09-07 18:53, Mike Cammilleri wrote:

> Hi everyone, I'm getting this error lately for everyone's jobs, which results in memory not being constrained via the cgroups plugin:
>
>     slurmstepd: error: task/cgroup: unable to add task[pid=21681] to memory cg '(null)'
>     slurmstepd: error: jobacct_gather/cgroup: unable to instanciate user 3691 memory cgroup
>
> The result is that no uid_ directories are created under /sys/fs/cgroup/memory.
>
> Here is our cgroup.conf file:
>
>     CgroupAutomount=yes
>     CgroupReleaseAgentDir="/etc/cgroup"
>     CgroupMountpoint=/sys/fs/cgroup
>     ConstrainCores=yes
>     ConstrainDevices=no
>     ConstrainRAMSpace=yes
>     ConstrainSwapSpace=yes
>     AllowedSwapSpace=0
>
> We are using jobacct_gather/cgroup:
>
>     # ACCOUNTING
>     JobAcctGatherType=jobacct_gather/cgroup
>
> The partition is configured like this:
>
>     PartitionName=long Nodes=marzano[05-13] PriorityTier=30 Default=NO MaxTime=5-0 State=UP OverSubscribe=FORCE:1
>
> We are using slurm 16.05.6 on Ubuntu 14.04 LTS. Any ideas how to get cgroups going again?

This is, apparently, a bug in the Linux kernel where it doesn't garbage-collect deleted memory cgroups. Eventually the kernel hits an internal limit on how many memory cgroups there can be and refuses to create more. This bug has apparently been fixed in the upstream kernel, but it is still present at least in the CentOS 7 kernel and, based on your report, in the Ubuntu 14.04 kernel.

One workaround is to reboot the node whenever this happens. Another is to set ConstrainKmemSpace=no in cgroup.conf (but AFAICS this option was added in slurm 17.02 and is not present in the 16.05 you're using). For more information, see the discussion and links in slurm bug #5082.

--
Janne Blomqvist, D.Sc. (Tech.), Scientific Computing Specialist
Aalto University School of Science, PHYS & NBE
+358503841576 || janne.blomqv...@aalto.fi
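For anyone hitting this on 17.02 or later, the workaround mentioned above would look roughly like this (a sketch only; the surrounding settings are the ones from the cgroup.conf quoted earlier):

    # cgroup.conf (excerpt) -- ConstrainKmemSpace requires slurm >= 17.02
    ConstrainRAMSpace=yes
    ConstrainSwapSpace=yes
    ConstrainKmemSpace=no

    # to see how many memory cgroups the kernel is currently tracking:
    grep memory /proc/cgroups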