Dear all,

we have the same problem on RHEL 7.7 with Slurm 19.05.5. Can anybody help us find a solution?
We are currently using the parameter "SelectType=select/cons_res"; might we need "SelectType=select/cons_tres" instead?
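For reference, the change we are asking about would look roughly like this in slurm.conf. This is only a sketch of the two alternatives. The select/cons_tres plugin is available as of Slurm 19.05, but we have not confirmed that switching plugins has any effect on the --accel-bind verbose output.

```
# slurm.conf (sketch; only one SelectType line may be active)

# Current setting on our cluster:
SelectType=select/cons_res

# Alternative under consideration (assumption: untested for this problem):
#SelectType=select/cons_tres
```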
Kind regards,
Danny Rotscher

On 27.11.19 at 07:47, Uemoto, Tomoki wrote:
Hi, all

OS Version: RHEL 7.6
SLURM Version: slurm 18.08.6

I defined the gpu resource as follows:

[test@ohpc137pbsop-c001 ~]$ scontrol show config |grep TaskPlugin
TaskPlugin = task/cgroup
TaskPluginParam = (null type)
[test@ohpc137pbsop-c001 ~]$

[test@ohpc137pbsop-c001 ~]$ grep Gres /etc/slurm/slurm.conf
GresTypes=gpu
NodeName=ohpc137pbsop-c001 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 Gres=gpu:2 State=IDLE
NodeName=ohpc137pbsop-c002 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 Gres=gpu:2 State=IDLE
[test@ohpc137pbsop-c001 ~]$

[test@ohpc137pbsop-c001 ~]$ cat /etc/slurm/gres.conf
Name=gpu File=/dev/tty0 Cores=0,1
Name=gpu File=/dev/tty1 Cores=0,1
[test@ohpc137pbsop-c001 ~]$

[root@ohpc137pbsop-sms ~]# cat /etc/slurm/cgroup.conf
###
#
# Slurm cgroup support configuration file
#
# See man slurm.conf and man cgroup.conf for further
# information on cgroup configuration parameters
#--
ConstrainCores=yes
TaskAffinity=yes
CgroupMountpoint=/cgroup
CgroupAutomount=yes
ConstrainRAMSpace=yes
[root@ohpc137pbsop-sms ~]#

[root@ohpc137pbsop-sms ~]# scontrol show node |grep Gres
Gres=gpu:2
Gres=gpu:2
[root@ohpc137pbsop-sms ~]#

And I executed the following script.

[test@ohpc137pbsop-sms ~]$ srun -l --gres=gpu:2 -n4 --accel-bind=v,g -l hostname
0: ohpc137pbsop-c001
2: ohpc137pbsop-c002
1: ohpc137pbsop-c001
3: ohpc137pbsop-c002
[test@ohpc137pbsop-sms ~]$ srun -l --gres=gpu:2 -n4 --accel-bind=v -l hostname
2: ohpc137pbsop-c002
0: ohpc137pbsop-c001
3: ohpc137pbsop-c002
1: ohpc137pbsop-c001
[test@ohpc137pbsop-sms ~]$

Task binding information is not output.
Is the verbose mode (of the accel-bind) not supported in this version (slurm 18.08.6)?

The verbose mode of cpu-bind was confirmed as follows.

[test@ohpc137pbsop-sms ~]$ srun -c1 --cpu-bind=v hostname
cpu-bind=NULL - ohpc137pbsop-c001, task 0 0 [22822]: mask 0x1000001
ohpc137pbsop-c001
cpu-bind=NULL - ohpc137pbsop-c001, task 1 1 [22823]: mask 0x1000001
ohpc137pbsop-c001
[test@ohpc137pbsop-sms ~]$
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Danny Rotscher
HPC-Support
Technische Universität Dresden
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
01062 Dresden
Tel.: +49 351 463-35853
Fax : +49 351 463-37773
E-Mail: danny.rotsc...@tu-dresden.de
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~