Hi,

Is it possible to bind MPI processes to a NUMA node on Opteron 6xxx series CPUs (e.g. with something like --bind-to-NUMAnode) *without* using a rankfile? Opteron 6xxx CPUs have two NUMA nodes per socket, so --bind-to-socket doesn't do what I want.

This is a 4-socket Opteron 6344 system (12 cores per socket):

root@node01:~> numactl --hardware | grep cpus
node 0 cpus: 0 1 2 3 4 5
node 1 cpus: 6 7 8 9 10 11
node 2 cpus: 12 13 14 15 16 17
node 3 cpus: 18 19 20 21 22 23
node 4 cpus: 24 25 26 27 28 29
node 5 cpus: 30 31 32 33 34 35
node 6 cpus: 36 37 38 39 40 41
node 7 cpus: 42 43 44 45 46 47

root@node01:~> /opt/openmpi/1.6.3/gcc/bin/mpirun --report-bindings -np 8 --bind-to-socket --bysocket sleep 1s
[node01.cluster:21446] MCW rank 1 bound to socket 1[core 0-11]: [. . . . . . . . . . . .][B B B B B B B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .]
[node01.cluster:21446] MCW rank 2 bound to socket 2[core 0-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B B B B B B B][. . . . . . . . . . . .]
[node01.cluster:21446] MCW rank 3 bound to socket 3[core 0-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B B B B B B B]
[node01.cluster:21446] MCW rank 4 bound to socket 0[core 0-11]: [B B B B B B B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .]
[node01.cluster:21446] MCW rank 5 bound to socket 1[core 0-11]: [. . . . . . . . . . . .][B B B B B B B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .]
[node01.cluster:21446] MCW rank 6 bound to socket 2[core 0-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B B B B B B B][. . . . . . . . . . . .]
[node01.cluster:21446] MCW rank 7 bound to socket 3[core 0-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B B B B B B B]
[node01.cluster:21446] MCW rank 0 bound to socket 0[core 0-11]: [B B B B B B B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .]

So each process is bound to *two* NUMA nodes, but I want to bind each process to *one* NUMA node.

What I want is more like this:
root@node01:~> cat rankfile
rank 0=localhost slot=0-5
rank 1=localhost slot=6-11
rank 2=localhost slot=12-17
rank 3=localhost slot=18-23
rank 4=localhost slot=24-29
rank 5=localhost slot=30-35
rank 6=localhost slot=36-41
rank 7=localhost slot=42-47
root@node01:~> /opt/openmpi/1.6.3/gcc/bin/mpirun --report-bindings -np 8 --rankfile rankfile sleep 1s
[node01.cluster:21505] MCW rank 1 bound to socket 0[core 6-11]: [. . . . . . B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .] (slot list 6-11)
[node01.cluster:21505] MCW rank 2 bound to socket 1[core 0-5]: [. . . . . . . . . . . .][B B B B B B . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .] (slot list 12-17)
[node01.cluster:21505] MCW rank 3 bound to socket 1[core 6-11]: [. . . . . . . . . . . .][. . . . . . B B B B B B][. . . . . . . . . . . .][. . . . . . . . . . . .] (slot list 18-23)
[node01.cluster:21505] MCW rank 4 bound to socket 2[core 0-5]: [. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B . . . . . .][. . . . . . . . . . . .] (slot list 24-29)
[node01.cluster:21505] MCW rank 5 bound to socket 2[core 6-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . B B B B B B][. . . . . . . . . . . .] (slot list 30-35)
[node01.cluster:21505] MCW rank 6 bound to socket 3[core 0-5]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][B B B B B B . . . . . .] (slot list 36-41)
[node01.cluster:21505] MCW rank 7 bound to socket 3[core 6-11]: [. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . B B B B B B] (slot list 42-47)
[node01.cluster:21505] MCW rank 0 bound to socket 0[core 0-5]: [B B B B B B . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .][. . . . . . . . . . . .] (slot list 0-5)
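
For now, a rankfile like the one above could at least be generated instead of written by hand. The following is only a rough sketch: the function name numa_rankfile is made up for illustration, and it assumes one rank per NUMA node, contiguous CPU numbering within each node (as in the numactl output above), and NUMA node numbers that start at 0 without gaps.

```shell
# Hypothetical helper (not part of Open MPI): reads `numactl --hardware`
# output on stdin and prints one rankfile line per NUMA node.
# Assumptions: one rank per NUMA node, and the CPUs of each node form a
# contiguous range, so "first-last" describes the whole node.
numa_rankfile() {
    awk -v host="${1:-localhost}" '
        # Match lines such as "node 3 cpus: 18 19 20 21 22 23".
        /^node [0-9]+ cpus:/ {
            # $2 = NUMA node number (used as rank), $4 = first CPU, $NF = last CPU
            printf "rank %d=%s slot=%s-%s\n", $2, host, $4, $NF
        }'
}
```

It would be used as "numactl --hardware | numa_rankfile > rankfile"; an optional first argument replaces the default host name localhost.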


Actually I'm dreaming of
mpirun --bind-to-NUMAnode --bycore ...
or
mpirun --bind-to-NUMAnode --byNUMAnode ...

Is there any workaround for this other than rankfiles?

Regards,
 Oliver Weihe
