Dear Slurm users,

I would like to use the cgroup task plugin on my system. I have configured cgroups as recommended by Slurm:

cgroup.conf:
    CgroupAutomount=yes
    CgroupMountpoint=/sys/fs/cgroup
    TaskAffinity=no
    ConstrainCores=yes

slurm.conf:
    ...
    TaskPlugin=task/affinity,task/cgroup
    #TaskPluginParam=Sched
    ...
    # COMPUTE NODES
    GresTypes=gpu
    NodeName=lsm[216-217] Gres=gpu:tesla:1 CPUs=64 RealMemory=192073 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 State=UNKNOWN
    PartitionName=admin Nodes=lsm[216-217] Default=YES MaxTime=INFINITE State=UP

It works, but the performance is much worse than before with TaskPlugin=task/none.

What I noticed: with TaskPlugin=task/none, the tasks often move between execution units (hardware threads). With task/affinity and task/cgroup, the tasks stay on their assigned thread from start to finish.

As an example:

With "srun -n9 -N2 ./prog" and TaskPlugin=task/affinity,task/cgroup, the tasks are distributed across the threads as follows:
    Node 1: 1,17,33,49,34
    Node 2: 1,17,33,49
    Execution time: ~210 sec

With "srun -n9 -N2 ./prog" and TaskPlugin=task/none, the tasks change threads during execution.
    Execution time: ~140 sec (about the same as a manual run with mpirun)

What are the proper parameters for task affinity and cgroups?

Thanks for any help :)

-max