Just an update: the cgroup.conf file could not be parsed when I added ConstrainKmemSpace=no. I guess this option is not supported by our kernel/Slurm versions on Ubuntu? Not sure. For now we took the lazy way out and rebooted the nodes. We'll try the kernel options or a full Slurm update as time allows.
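
The parse failure would be consistent with Janne's note, quoted below, that ConstrainKmemSpace was added in Slurm 17.02 and is not present in 16.05. As a rough sketch, a small Python check along these lines could be run before pushing the option out to nodes. The function names and the "slurm X.Y.Z" version-string format are assumptions made for illustration and are not part of Slurm itself; the 17.02 threshold is taken from the thread, not verified independently.

```python
import re

# Per the thread, ConstrainKmemSpace reportedly first appeared in Slurm 17.02.
KMEM_OPTION_MIN_VERSION = (17, 2)


def parse_slurm_version(text):
    """Extract (major, minor) from a string such as 'slurm 16.05.9'.

    The input format is an assumption; adjust the pattern if your
    tools report the version differently.
    """
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def supports_constrain_kmem_space(version_text):
    """Return True if the given version is at least 17.02."""
    return parse_slurm_version(version_text) >= KMEM_OPTION_MIN_VERSION


if __name__ == "__main__":
    for version in ("slurm 16.05.9", "slurm 17.02.0"):
        print(version, supports_constrain_kmem_space(version))
```

If the check comes back False, leaving the option out of cgroup.conf and falling back to the reboot or kernel-option route avoids the parse error.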

-----Original Message-----
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of Mike Cammilleri
Sent: Monday, September 10, 2018 9:49 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] can't create memory group (cgroup)

Thanks everyone for your responses. It looks like the two suggestions were:

1. Add "cgroup_enable=memory swapaccount=1" to the kernel command line by adding it to the GRUB_CMDLINE_LINUX variable in /etc/default/grub
2. Add ConstrainKmemSpace=no in cgroup.conf

From this information I think option 2 is the least troublesome, so we'll give that a shot first. Changing the kernel options would be the second try, I suppose. Eventually we'll upgrade SLURM and OS versions, but you know... when things are functional and work is getting done, it's hard to justify during an academic semester.

--mike

-----Original Message-----
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of Chris Samuel
Sent: Monday, September 10, 2018 6:49 AM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] can't create memory group (cgroup)

On Monday, 10 September 2018 4:42:00 PM AEST Janne Blomqvist wrote:
> One workaround is to reboot the node whenever this happens. Another
> is to set ConstrainKmemSpace=no in cgroup.conf (but AFAICS this option
> was added in slurm 17.02 and is not present in 16.05 that you're using).

Phew, we had to set ConstrainKmemSpace=no to avoid breaking Intel Omnipath so looks like we dodged a bullet there. Nice work tracking it down!

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC