Hi Jim, list

1) libnuma (non-uniform memory access, IIRC) is important if
you have AMD Opteron (our case here),
for processor and memory affinity, etc.
I suppose you don't need it with Intel Xeon (pre-Nehalem, at least),
but I am not positive about this (we don't have a Xeon cluster here).

2) tm is the Torque (PBS) task manager interface.
If you don't use Torque/PBS, you don't have it and you don't need it.
Noam, for instance, uses SGE, and configured with --with-sge.
If you just launch mpiexec directly you don't need
to build OpenMPI with any resource manager library or support.
(But resource managers are great!)

3) It may be that --enable-static adds the wrapper flags that you are missing.
I don't know.  The OpenMPI developers may clarify this.

4) AFAIK, you need OFED on all nodes, at least on those
that have IB hardware, that are connected to your IB switch,
where you want to run MPI programs using IB.

I hope this helps.

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Jim Kress ORG wrote:
Well, the whole situation is really bizarre.

I just uninstalled openmpi 1.3.2 on my system.  Then I installed OFED
1.4.1 to see if that resolves this situation.

Here's what I get:

[root@master ~]# ompi_info --config
           Configured by: root
           Configured on: Wed Jun 24 11:10:00 EDT 2009
          Configure host: master.org
                Built by: root
                Built on: Wed Jun 24 11:13:22 EDT 2009
              Built host: master.org
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
             C char size: 1
             C bool size: 1
            C short size: 2
              C int size: 4
             C long size: 8
            C float size: 4
           C double size: 8
          C pointer size: 8
            C char align: 1
            C bool align: 1
             C int align: 4
           C float align: 4
          C double align: 8
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
       Fort integer size: 4
       Fort logical size: 4
 Fort logical value true: 1
      Fort have integer1: yes
      Fort have integer2: yes
      Fort have integer4: yes
      Fort have integer8: yes
     Fort have integer16: no
         Fort have real4: yes
         Fort have real8: yes
        Fort have real16: no
      Fort have complex8: yes
     Fort have complex16: yes
     Fort have complex32: no
      Fort integer1 size: 1
      Fort integer2 size: 2
      Fort integer4 size: 4
      Fort integer8 size: 8
     Fort integer16 size: -1
          Fort real size: 4
         Fort real4 size: 4
         Fort real8 size: 8
        Fort real16 size: -1
      Fort dbl prec size: 4
          Fort cplx size: 4
      Fort dbl cplx size: 4
         Fort cplx8 size: 8
        Fort cplx16 size: 16
        Fort cplx32 size: -1
      Fort integer align: 4
     Fort integer1 align: 1
     Fort integer2 align: 2
     Fort integer4 align: 4
     Fort integer8 align: 8
    Fort integer16 align: -1
         Fort real align: 4
        Fort real4 align: 4
        Fort real8 align: 8
       Fort real16 align: -1
     Fort dbl prec align: 4
         Fort cplx align: 4
     Fort dbl cplx align: 4
        Fort cplx8 align: 4
       Fort cplx16 align: 8
       Fort cplx32 align: -1
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
           Sparse Groups: no
            Build CFLAGS: -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
                          -fexceptions -fstack-protector
                          --param=ssp-buffer-size=4 -m64 -mtune=generic
                          -finline-functions -fno-strict-aliasing -pthread
                          -fvisibility=hidden
          Build CXXFLAGS: -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
                          -fexceptions -fstack-protector
                          --param=ssp-buffer-size=4 -m64 -mtune=generic
                          -finline-functions -pthread
            Build FFLAGS: -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
                          -fexceptions -fstack-protector
                          --param=ssp-buffer-size=4 -m64 -mtune=generic
           Build FCFLAGS: -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
                          -fexceptions -fstack-protector
                          --param=ssp-buffer-size=4 -m64 -mtune=generic
           Build LDFLAGS: -export-dynamic
              Build LIBS: -lnsl -lutil -lm
    Wrapper extra CFLAGS: -pthread
  Wrapper extra CXXFLAGS: -pthread
    Wrapper extra FFLAGS: -pthread
   Wrapper extra FCFLAGS: -pthread
   Wrapper extra LDFLAGS:
      Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: no
 mpirun default --prefix: yes
         MPI I/O support: yes
       MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: no  (checkpoint thread: no)
[root@master ~]#

So you see, even the OFED 1.4.1 installation fails to put -libverbs etc.
into openmpi.

Also, I think it's

--enable-static

that is putting the -libverbs into your openmpi.  I'll try it and see
what happens.

What are libnuma and tm?  Do I need to worry about them?

Finally, I have forgotten what I do with all the RPMs OFED generates.
Do I install them all on my compute nodes or just a subset?

Thanks for the help.

Jim


On Wed, 2009-06-24 at 17:22 -0400, Gus Correa wrote:
Hi Jim


Jim Kress wrote:
 > Noam, Gus and List,
 >
 > Did you statically link your openmpi when you built it?  If you did (the
 > default is NOT to do this) then that could explain the discrepancy.
 >
 > Jim

No, I didn't link statically.

Did you link statically?

Actually, I tried to do it, and it didn't work.
I wouldn't get OpenMPI with IB if I tried to
link statically (i.e. by passing -static or equivalent to CFLAGS, FFLAGS, etc).
When I removed the "-static" I got OpenMPI with IB.
I always dump the configure output (and the make output, etc) to
log files to check these things out after it is done.
I really suggest you do this, it pays off, saves time, costs nothing.
I don't remember exactly what symptoms I found on the log,
whether the log definitely said that there was no IB support,
or if it didn't have the right flags (-libverbs, etc) like yours.
However, when I removed the "-static" from the compiler flags,
I got all the IB goodies!  :)

Here is how I run configure (CFLAGS etc only have optimization flags,
no "-static"):

./configure \
--prefix=/my/directory \
--with-libnuma=/usr \
--with-tm=/usr \
--with-openib=/usr \
--enable-static \
2>&1 | tee configure.log

Note, "--enable-static" means OpenMPI will build static libraries (besides the shared ones).
OpenMPI is not being linked statically to system libraries,
or to IB libraries, etc.

Some switches may not be needed,
in particular the explicit use of the /usr directory.
However, at some point the OpenMPI configure
would not work without being
told this (at least for libnuma).
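
On the point above about keeping log files: once configure has written configure.log, searching it for a component name is a quick way to see what configure decided. A minimal sketch, assuming a POSIX shell; the helper name log_lines_for is made up for illustration, and the exact wording of the matching lines depends on the OpenMPI version, so read the lines it prints rather than relying on a fixed message:

```shell
#!/bin/sh
# Sketch only: log_lines_for is a hypothetical helper name.

# Print (with line numbers) every line of a build log that mentions the
# given component name, ignoring case. Exit status is grep's: 0 if at
# least one line matched, 1 if none did.
log_lines_for() {
    component=$1
    log_file=$2
    grep -in -e "$component" "$log_file"
}

# Example: look for openib and libnuma in configure.log, if it exists.
if [ -f configure.log ]; then
    log_lines_for openib configure.log || echo "no mention of openib"
    log_lines_for libnuma configure.log || echo "no mention of libnuma"
fi
```

The same helper can be pointed at the make log or the install log.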

BTW, I didn't claim your OpenMPI doesn't have IB support.
Not a categorical syllogism like
"you don't have the -libverbs flag, hence you don't have IB".
It is hard to make definitive statements like this
in a complex environment like this (OpenMPI build, parallel programs),
and with limited information via email.
After all, the list is peer reviewed! :)
Hence, I only guessed, as I usually do in these exchanges.
However, considering all the trouble you've been through, who knows,
maybe it was a guess in the right direction.

I wonder if there may still be a glitch in the OpenMPI configure
script, on how it searches for and uses libraries like IB, NUMA, etc,
which may be causing the problem.
Jeff:  Is this possible?

In any case, we have different "Wrapper extra LIBS".
I have -lrdmacm -libverbs, you and Noam don't have them.
(Noam: I am not saying you don't have IB support!  :))
My configure explicitly asks for IB support, Noam's (and maybe yours) doesn't.
Somehow, slight differences in how one invokes
the configure script seem to produce different results.
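
The "Wrapper extra LIBS" comparison can be sketched in shell roughly as follows. This is only a sketch: the function names wrapper_extra_libs and links_ibverbs are made up for illustration, and it assumes ompi_info --config prints the field on a line of its own. As discussed elsewhere in this thread, a missing -libverbs is only a hint, not proof that IB support is absent.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Open MPI.

# Read `ompi_info --config` output on stdin and print the value of
# the "Wrapper extra LIBS" field (empty output if the field is absent).
wrapper_extra_libs() {
    sed -n 's/^[[:space:]]*Wrapper extra LIBS:[[:space:]]*//p'
}

# Succeed (exit status 0) if -libverbs appears as a separate word
# in the "Wrapper extra LIBS" field of the output on stdin.
links_ibverbs() {
    wrapper_extra_libs | tr ' ' '\n' | grep -qx -e '-libverbs'
}

# Only query the local installation if ompi_info is on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info --config | links_ibverbs; then
        echo "wrapper flags include -libverbs"
    else
        echo "wrapper flags do not include -libverbs"
    fi
fi
```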

I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Noam Bernstein
Sent: Wednesday, June 24, 2009 9:38 AM
To: Open MPI Users
Subject: Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband


On Jun 23, 2009, at 6:19 PM, Gus Correa wrote:

 > Hi Jim, list
 >
 > On my OpenMPI 1.3.2 ompi_info -config gives:
 >
 > Wrapper extra LIBS: -lrdmacm -libverbs -ltorque -lnuma -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
 >
 > Yours doesn't seem to have the IB libraries: -lrdmacm -libverbs
 >
 > So, I would guess your OpenMPI 1.3.2 build doesn't have IB support.

The second of these statements doesn't follow from the first.

My "ompi_info -config" returns

ompi_info -config | grep LIBS
               Build LIBS: -lnsl -lutil  -lm
       Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

But it does have openib

ompi_info | grep openib
MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.2)

and osu_bibw returns

# OSU MPI Bi-Directional Bandwidth Test v3.0
# Size     Bi-Bandwidth (MB/s)
4194304                1717.43

which it sure isn't getting over Ethernet. I think Jeff Squyres' test (ompi_info | grep openib) must be more definitive.
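
The ompi_info | grep openib test can be wrapped in a small script so it is easy to run on every node. A minimal sketch, assuming a POSIX shell; has_openib_btl is a hypothetical helper name, and the pattern assumes the component line looks like the "MCA btl: openib" line shown above. It only says whether the component was built, not which transport a given job actually uses at run time.

```shell
#!/bin/sh
# Sketch only: has_openib_btl is a hypothetical helper name.

# Read plain `ompi_info` output on stdin; succeed (exit status 0) if it
# lists an "MCA btl: openib" component line.
has_openib_btl() {
    grep -Eq 'MCA[[:space:]]+btl:[[:space:]]+openib([[:space:]]|$)'
}

# Only query the local installation if ompi_info is on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_openib_btl; then
        echo "openib BTL component found"
    else
        echo "openib BTL component not found"
    fi
fi
```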

                                                                
                Noam
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users