Hello, Slurm is complicated software, and the docs can be dense, so I'm looking for some clarification, please.
We have a system set up with threads as CPUs: 1 socket, 4 cores per socket, 2 threads per core = 8 CPUs.

I would like to implement cgroups because some of our users are quite happy to utilise all threads regardless of other users.

We have TaskPlugin=task/cgroup, and when testing I noticed that the number of threads/CPUs being allocated was rounded up to the nearest even number. I presume this was due to cgroups treating a core as a CPU, rather than a thread as a CPU. So I set TaskPluginParam=Threads, but Slurm is still allowing the use of more threads than have been requested.

In particular, I'm running this test:

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --ntasks=3
    stress-ng --cpu 5 --cpu-method all --io 5 --vm 1 --vm-bytes 1G --timeout 600s --quiet

I was hoping that the cgroup would kill the job for using too many CPUs, but I've discovered that's not how stress-ng works. Regardless, when running this, I noted that squeue shows I've been allocated 3 CPUs, but on the server itself I'm seeing four CPUs being used.

What have I done wrong? Is it possible to have granular control at the thread level with cgroups?

cheers
L.

------
"The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic civics is the insistence that we cannot ignore the truth, nor should we panic about it. It is a shared consciousness that our institutions have failed and our ecosystem is collapsing, yet we are still here — and we are creative agents who can shape our destinies. Apocalyptic civics is the conviction that the only way out is through, and the only way through is together."
*Greg Bloom* @greggish https://twitter.com/greggish/status/873177525903609857
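
P.S. A minimal sketch of the arithmetic that would explain the four CPUs, assuming the allocation is rounded up to whole cores (that is my presumption from the behaviour above, not something confirmed). The variable names are made up for illustration; the last step reads Cpus_allowed_list from /proc/self/status, which is Linux-specific, to show which CPUs a process is actually allowed to run on.

```shell
#!/bin/bash
# Assumption: CPUs are handed out in whole cores, so a request is
# rounded up to a multiple of the threads-per-core count.
threads_per_core=2   # node layout: 2 threads per core
requested=3          # matches --ntasks=3 in the test job

# Ceiling division: number of whole cores needed to cover the request.
cores_needed=$(( (requested + threads_per_core - 1) / threads_per_core ))
expected_cpus=$(( cores_needed * threads_per_core ))
echo "requested=${requested} expected_cpus=${expected_cpus}"

# Show the CPU IDs this process is permitted to run on (Linux only).
if [ -r /proc/self/status ]; then
    grep Cpus_allowed_list /proc/self/status
fi
```

Running the last check from inside the batch script, rather than on the login node, should show whether the job is confined to three or four hardware threads.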
