Dear Slurm users, 

I would like to use the task/cgroup task plugin on my system. I have configured
cgroups as recommended by the Slurm documentation:

 

cgroup.conf:

CgroupAutomount=yes
CgroupMountpoint=/sys/fs/cgroup
TaskAffinity=no
ConstrainCores=yes

 

slurm.conf:

...
TaskPlugin=task/affinity,task/cgroup
#TaskPluginParam=Sched
...
# COMPUTE NODES
GresTypes=gpu
NodeName=lsm[216-217] Gres=gpu:tesla:1 CPUs=64 RealMemory=192073 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 State=UNKNOWN
PartitionName=admin Nodes=lsm[216-217] Default=YES MaxTime=INFINITE State=UP

 

It works, but performance is much worse than it was with
TaskPlugin=task/none.

 

What I noticed:

With TaskPlugin=task/none, the tasks often migrate between execution units
(hardware threads) while running. With task/affinity and task/cgroup, each
task stays on its assigned thread from start to finish.

 

As an example:

With srun -n9 -N2 ./prog and TaskPlugin=task/affinity,task/cgroup, the tasks
are distributed across the following threads:

Node 1: 1,17,33,49,34

Node 2: 1,17,33,49

Execution time: ~210 sec
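
Whether some of the threads listed above are hyperthread siblings of the same
physical core depends on how the kernel numbers the CPUs on these nodes, which
is not shown here. A minimal sketch for checking this on a node, assuming
Linux sysfs is available; the helper names parse_cpu_list, read_siblings and
shared_cores are made up for illustration and are not part of Slurm:

```python
import os
from collections import defaultdict


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '1,33' or '0-3,8' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_siblings(cpu):
    """Return the hardware threads that share a core with `cpu` (Linux sysfs)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as fh:
        return parse_cpu_list(fh.read())


def shared_cores(cpus, siblings=read_siblings):
    """Group `cpus` by physical core and keep cores used by more than one of them."""
    groups = defaultdict(list)
    for cpu in cpus:
        groups[tuple(siblings(cpu))].append(cpu)
    return {core: used for core, used in groups.items() if len(used) > 1}


if __name__ == "__main__":
    # CPUs the current process is allowed to run on (Linux only).
    print("allowed CPUs:", sorted(os.sched_getaffinity(0)))
    # Thread IDs observed on node 1 in the example above.
    print("shared cores:", shared_cores([1, 17, 33, 49, 34]))
```

Run on one of the compute nodes, this prints any physical core that hosts more
than one of the listed threads. Launching it through srun in place of ./prog
would additionally show the CPU set each task is confined to.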

 

With srun -n9 -N2 ./prog and TaskPlugin=task/none, the tasks change
threads during execution.

Execution time: ~140 sec (about the same as a manual run with mpirun)

 

What are the proper parameters for task affinity and cgroups?

 

Thanks for any help :)

-max
