Dear Jeff,

Thanks for the reply. I will forward my question there.
Best Regards,
Panos Labropoulos


On Thu, Oct 3, 2013 at 2:14 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> This seems to be a question about hwloc, not about Open MPI.
>
> To clarify, hwloc is a sub-project of Open MPI, but it has its own mailing
> list. Would you mind re-directing your question over there?
>
>     http://www.open-mpi.org/community/lists/hwloc.php
>
> Thanks!
>
>
> On Oct 2, 2013, at 7:32 PM, Panos Labropoulos
> <panos.labropou...@brightcomputing.com> wrote:
>
> > Hello,
> >
> > We seem to be unable to set the CPU binding on a cluster consisting of
> > Dell M420/M610 systems:
> >
> > [jallan@hpc21 ~]$ cat report-bindings.sh
> > #!/bin/sh
> >
> > bitmap=`hwloc-bind --get -p`
> > friendly=`hwloc-calc -p -H socket.core.pu $bitmap`
> >
> > echo "MCW rank $OMPI_COMM_WORLD_RANK (`hostname`): $friendly"
> > exit 0
> >
> > [jallan@hpc27 ~]$ hwloc-bind -v socket:0.core:0 -l ./report-bindings.sh
> > using object #0 depth 2 below cpuset 0x000000ff
> > using object #0 depth 6 below cpuset 0x00000080
> > adding 0x00000080 to 0x0
> > adding 0x00000080 to 0x0
> > assuming the command starts at ./report-bindings.sh
> > binding on cpu set 0x00000080
> > MCW rank (hpc27): Socket:0.Core:10.PU:7
> >
> > [jallan@hpc27 ~]$ hwloc-bind -v socket:1.core:0 -l ./report-bindings.sh
> > object #1 depth 2 (type socket) below cpuset 0x000000ff does not exist
> > adding 0x0 to 0x0
> > assuming the command starts at ./report-bindings.sh
> > MCW rank (hpc27): Socket:0.Core:10.PU:7
> >
> > The topology of this system looks a bit strange:
> >
> > [jallan@hpc21 ~]$ lstopo --no-io
> > Machine (24GB)
> >   NUMANode L#0 (P#0 24GB)
> >   NUMANode L#1 (P#1) + Socket L#0 + L3 L#0 (15MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#11)
> > [jallan@hpc21 ~]$
> >
> > Using Open MPI 1.4.4:
> >
> > http://pastebin.com/VsZS2q3R
> >
> > For some reason the binding cannot be set. We also tried Open MPI 1.6.5
> > and 1.7.3, with similar results.
> >
> > This is the output from a local SMP system:
> >
> > [panos@demo ~]$ hwloc-bind -v socket:1.core:0 -l ./report-bindings.sh
> > using object #1 depth 2 below cpuset 0x00000003
> > using object #0 depth 6 below cpuset 0x00000002
> > adding 0x00000002 to 0x0
> > adding 0x00000002 to 0x0
> > assuming the command starts at ./report-bindings.sh
> > binding on cpu set 0x00000002
> > MCW rank (demo): Socket:1.Core:0.PU:1
> > [panos@demo ~]$ hwloc-bind -v socket:0.core:0 -l ./report-bindings.sh
> > using object #0 depth 2 below cpuset 0x00000003
> > using object #0 depth 6 below cpuset 0x00000001
> > adding 0x00000001 to 0x0
> > adding 0x00000001 to 0x0
> > assuming the command starts at ./report-bindings.sh
> > binding on cpu set 0x00000001
> > MCW rank (demo): Socket:0.Core:0.PU:0
> >
> > The MPI binding output is formatted a bit differently, as this node runs
> > Open MPI 1.6.5:
> >
> > [panos@demo ~]$ `which mpiexec` --report-bindings --bind-to-core --bycore -mca btl ^openib -np 4 -hostfile ./hplnodes2 -x LD_LIBRARY_PATH -x PATH /cm/shared/apps/hpl/2.1/xhpl
> > [demo:25615] MCW rank 0 bound to socket 0[core 0]: [B][.]
> > [demo:25615] MCW rank 2 bound to socket 1[core 0]: [.][B]
> > [node003:08454] MCW rank 1 bound to socket 0[core 0]: [B .]
> > [node003:08454] MCW rank 3 bound to socket 0[core 1]: [. B]
> >
> > [panos@demo ~]$ module load hwloc
> >
> > [panos@demo ~]$ lstopo -l
> > Machine (4095MB)
> >   NUMANode L#0 (P#0 2048MB) + Socket L#0 + L2 L#0 (1024KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
> >   NUMANode L#1 (P#1 2048MB) + Socket L#1 + L2 L#1 (1024KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
> >
> > Any help will be appreciated.
> >
> > Kind Regards,
> > Panos Labropoulos
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
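
For completeness, here is a minimal sketch of what `hwloc-bind socket:N.core:0` does through hwloc's C API. It assumes the hwloc 1.x object names (e.g. HWLOC_OBJ_SOCKET; hwloc 2.x renames this to HWLOC_OBJ_PACKAGE) that the Open MPI versions mentioned above ship with; the program and its messages are illustrative and not taken from the report itself:

/* Illustrative sketch (not from the original report): bind the current
 * process to the first core of a requested socket via hwloc's C API,
 * roughly what `hwloc-bind socket:N.core:0 -- <cmd>` does. */
#include <hwloc.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int socket_index = (argc > 1) ? atoi(argv[1]) : 0;
    hwloc_topology_t topology;
    hwloc_obj_t socket, core;
    char *str;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    /* This is the lookup that fails on the cluster nodes above: lstopo
     * shows only one Socket object, so asking for socket #1 returns NULL. */
    socket = hwloc_get_obj_by_type(topology, HWLOC_OBJ_SOCKET, socket_index);
    if (!socket) {
        fprintf(stderr, "socket #%d does not exist in this topology\n",
                socket_index);
        hwloc_topology_destroy(topology);
        return 1;
    }

    /* First core contained in that socket's cpuset. */
    core = hwloc_get_obj_inside_cpuset_by_type(topology, socket->cpuset,
                                               HWLOC_OBJ_CORE, 0);
    if (!core) {
        fprintf(stderr, "socket #%d has no core objects\n", socket_index);
        hwloc_topology_destroy(topology);
        return 1;
    }

    /* Bind the whole process to the core's cpuset. */
    if (hwloc_set_cpubind(topology, core->cpuset, HWLOC_CPUBIND_PROCESS)) {
        perror("hwloc_set_cpubind");
        hwloc_topology_destroy(topology);
        return 1;
    }

    hwloc_bitmap_asprintf(&str, core->cpuset);
    printf("bound to cpuset %s\n", str);
    free(str);
    hwloc_topology_destroy(topology);
    return 0;
}

On the hpc27 node above, asking for socket index 1 would fail at the hwloc_get_obj_by_type() lookup because the detected topology exposes only a single Socket object, which is consistent with the "object #1 depth 2 (type socket) below cpuset 0x000000ff does not exist" message printed by hwloc-bind.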