Dear Jeff,

Thanks for the reply.

I will forward my question there.

Best Regards,
  Panos Labropoulos


On Thu, Oct 3, 2013 at 2:14 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> This seems to be a question about hwloc, not about Open MPI.
>
> To clarify, hwloc is a sub-project of Open MPI, but it has its own mailing
> list.  Would you mind re-directing your question over there?
>
>      http://www.open-mpi.org/community/lists/hwloc.php
>
> Thanks!
>
>
>
> On Oct 2, 2013, at 7:32 PM, Panos Labropoulos <panos.labropou...@brightcomputing.com> wrote:
>
> > Hello,
> >
> > We seem to be unable to set the CPU binding on a cluster consisting of Dell M420/M610 systems:
> >
> > [jallan@hpc21 ~]$ cat report-bindings.sh
> > #!/bin/sh
> >
> > bitmap=`hwloc-bind --get -p`
> > friendly=`hwloc-calc -p -H socket.core.pu $bitmap`
> >
> > echo "MCW rank $OMPI_COMM_WORLD_RANK (`hostname`): $friendly"
> > exit 0
> >
> >
> > [jallan@hpc27 ~]$ hwloc-bind -v  socket:0.core:0 -l ./report-bindings.sh
> > using object #0 depth 2 below cpuset 0x000000ff
> > using object #0 depth 6 below cpuset 0x00000080
> > adding 0x00000080 to 0x0
> > adding 0x00000080 to 0x0
> > assuming the command starts at ./report-bindings.sh
> > binding on cpu set 0x00000080
> > MCW rank  (hpc27): Socket:0.Core:10.PU:7
> > [jallan@hpc27 ~]$ hwloc-bind -v  socket:1.core:0 -l ./report-bindings.sh
> > object #1 depth 2 (type socket) below cpuset 0x000000ff does not exist
> > adding 0x0 to 0x0
> > assuming the command starts at ./report-bindings.sh
> > MCW rank  (hpc27): Socket:0.Core:10.PU:7
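> >
> > For reference, a minimal sketch of how the topology data for hpc27 could be collected for the hwloc list (this assumes hwloc's bundled hwloc-gather-topology script is installed on the node, which may not be the case here):
> >
> > # bundle the node's topology (a /sys snapshot plus the lstopo output) for bug reports
> > hwloc-gather-topology /tmp/hpc27-topology
> > # also export the topology as XML for closer inspection
> > lstopo /tmp/hpc27-topology.xml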
> >
> >
> > The topology of this system looks a bit strange:
> >
> > [jallan@hpc21 ~]$ lstopo --no-io
> > Machine (24GB)
> >  NUMANode L#0 (P#0 24GB)
> >  NUMANode L#1 (P#1) + Socket L#0 + L3 L#0 (15MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#11)
> > [jallan@hpc21 ~]$
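> >
> > A sketch of one way to check whether this single-core view comes from a cpuset/cgroup restriction rather than from hwloc itself; lstopo's --whole-system option ignores administrative restrictions and shows every CPU in the machine:
> >
> > # compare the restricted view above against the full machine topology
> > lstopo --no-io --whole-system
> > # CPUs this shell is actually allowed to run on
> > grep Cpus_allowed_list /proc/self/status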
> >
> >
> > Using Open MPI 1.4.4:
> >
> > http://pastebin.com/VsZS2q3R
> >
> > For some reason the binding cannot be set. We also tried Open MPI 1.6.5 and 1.7.3 with similar results.
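> >
> > For the 1.7.3 run, the binding options were of roughly this form; the flag names changed from the 1.6 series, so this is a sketch rather than a copy of the actual command:
> >
> > # Open MPI 1.7.x style: bind each rank to a core, map ranks by core
> > mpirun --report-bindings --bind-to core --map-by core -np 4 ./report-bindings.sh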
> >
> > This is the output from a local SMP system:
> >
> > [panos@demo ~]$ hwloc-bind -v  socket:1.core:0 -l ./report-bindings.sh
> > using object #1 depth 2 below cpuset 0x00000003
> > using object #0 depth 6 below cpuset 0x00000002
> > adding 0x00000002 to 0x0
> > adding 0x00000002 to 0x0
> > assuming the command starts at ./report-bindings.sh
> > binding on cpu set 0x00000002
> > MCW rank  (demo): Socket:1.Core:0.PU:1
> > [panos@demo ~]$ hwloc-bind -v  socket:0.core:0 -l ./report-bindings.sh
> > using object #0 depth 2 below cpuset 0x00000003
> > using object #0 depth 6 below cpuset 0x00000001
> > adding 0x00000001 to 0x0
> > adding 0x00000001 to 0x0
> > assuming the command starts at ./report-bindings.sh
> > binding on cpu set 0x00000001
> > MCW rank  (demo): Socket:0.Core:0.PU:0
> >
> >
> > The MPI binding output is formatted a bit differently, as this node runs Open MPI 1.6.5:
> >
> > [panos@demo ~]$ `which mpiexec` --report-bindings --bind-to-core --bycore -mca btl ^openib -np 4 -hostfile ./hplnodes2 -x LD_LIBRARY_PATH -x PATH /cm/shared/apps/hpl/2.1/xhpl
> > [demo:25615] MCW rank 0 bound to socket 0[core 0]: [B][.]
> > [demo:25615] MCW rank 2 bound to socket 1[core 0]: [.][B]
> > [node003:08454] MCW rank 1 bound to socket 0[core 0]: [B .]
> > [node003:08454] MCW rank 3 bound to socket 0[core 1]: [. B]
> > [panos@demo ~]$ module load hwloc
> >
> >
> >
> > [panos@demo ~]$ lstopo -l
> > Machine (4095MB)
> >  NUMANode L#0 (P#0 2048MB) + Socket L#0 + L2 L#0 (1024KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
> >  NUMANode L#1 (P#1 2048MB) + Socket L#1 + L2 L#1 (1024KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
> >
> > Any help will be appreciated.
> >
> > Kind Regards,
> >   Panos Labropoulos
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
