Hello: I have been using LAM-MPI for many years on PC/Linux systems and have been quite pleased with its performance. However, at the urging of the LAM-MPI website, I have decided to switch to Open MPI. For much of my preliminary testing I work on a single-processor workstation (see the attached 'config.log' and 'ompi_info.log' files for some of the specifics of my system). I frequently run with more than one virtual MPI processor (i.e., I oversubscribe the real processor) to test my code. With LAM the runtime penalty for this is usually insignificant for 2-4 virtual processors, but with Open MPI it has been prohibitive. Below is a table of runtimes for a simple MPI matrix-transpose code using mpi_sendrecv (I tried other blocking/non-blocking and synchronous/non-synchronous send/recv variations with similar results); a simplified sketch of the communication pattern follows the table.
message size = 262144 bytes

            LAM            Open MPI
1 proc:     .02575 secs    .02513 secs
2 proc:     .04603 secs    10.069 secs
4 proc:     .04903 secs    35.422 secs
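For reference, the fragment below is only a simplified illustration of the kind of exchange I am timing: each rank swaps a 262144-byte buffer with a partner via MPI_Sendrecv. It is not my actual transpose code, and the pairing of ranks is just a stand-in for the real data movement:

/* Simplified sketch: each rank exchanges a 262144-byte buffer with a
 * partner rank via MPI_Sendrecv, and the exchange is timed with
 * MPI_Wtime.  Illustrative only, not the actual transpose code. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MSG_BYTES 262144

int main(int argc, char **argv)
{
    int rank, nprocs, partner;
    char *sendbuf, *recvbuf;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    sendbuf = malloc(MSG_BYTES);
    recvbuf = malloc(MSG_BYTES);
    memset(sendbuf, rank, MSG_BYTES);

    /* pair the ranks: 0<->1, 2<->3, ...; an unpaired last rank talks to itself */
    partner = (rank % 2 == 0) ? rank + 1 : rank - 1;
    if (partner >= nprocs)
        partner = rank;

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    MPI_Sendrecv(sendbuf, MSG_BYTES, MPI_BYTE, partner, 0,
                 recvbuf, MSG_BYTES, MPI_BYTE, partner, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("%d proc(s): %.5f secs for one exchange\n", nprocs, t1 - t0);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}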
I am pretty sure that LAM exploits the fact that the virtual processors all share the same memory, so communication goes through memory and/or the PCI bus of the system, while my Open MPI configuration doesn't exploit this. Is this a reasonable diagnosis of the dramatic difference in performance? More importantly, how do I reconfigure Open MPI to match the LAM performance?
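For example, is this just a matter of explicitly selecting the shared-memory transport, or of setting a run-time MCA parameter when I launch the job? The invocations below are only guesses on my part to illustrate what I mean; 'transpose' is a placeholder name for my test program:

  # check what the shared-memory BTL component reports
  ompi_info --param btl sm

  # force on-node traffic over the shared-memory BTL (plus self for loopback)
  mpirun -np 4 --mca btl sm,self ./transpose

  # have idle processes yield the CPU when the node is oversubscribed
  mpirun -np 4 --mca mpi_yield_when_idle 1 ./transpose

regards,
Tom Rosmond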
Open MPI: 1.0.1r8453
Open MPI SVN revision: r8453
Open RTE: 1.0.1r8453
Open RTE SVN revision: r8453
OPAL: 1.0.1r8453
OPAL SVN revision: r8453
Prefix: /usr/local/openmpi
Configured architecture: i686-pc-linux-gnu
Configured by: root
Configured on: Tue Jan 3 19:50:56 PST 2006
Configure host: cedar.reachone.com
Built by: root
Built on: Tue Jan 3 20:13:01 PST 2006
Built host: cedar.reachone.com
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: ifc
Fortran77 compiler abs: /opt/intel/compiler70/ia32/bin/ifc
Fortran90 compiler: ifc
Fortran90 compiler abs: /opt/intel/compiler70/ia32/bin/ifc
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: yes
C++ exceptions: no
Thread support: posix (mpi: no, progress: no)
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: 1
MCA memory: malloc_hooks (MCA v1.0, API v1.0, Component v1.0.1)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.0.1)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.0.1)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.0.1)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.0.1)
MCA coll: self (MCA v1.0, API v1.0, Component v1.0.1)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.0.1)
MCA io: romio (MCA v1.0, API v1.0, Component v1.0.1)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.0.1)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.0.1)
MCA pml: teg (MCA v1.0, API v1.0, Component v1.0.1)
MCA ptl: self (MCA v1.0, API v1.0, Component v1.0.1)
MCA ptl: sm (MCA v1.0, API v1.0, Component v1.0.1)
MCA ptl: tcp (MCA v1.0, API v1.0, Component v1.0.1)
MCA btl: self (MCA v1.0, API v1.0, Component v1.0.1)
MCA btl: sm (MCA v1.0, API v1.0, Component v1.0.1)
MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.0.1)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.0.1)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.0.1)
MCA gpr: replica (MCA v1.0, API v1.0, Component v1.0.1)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.0.1)
MCA iof: svc (MCA v1.0, API v1.0, Component v1.0.1)
MCA ns: proxy (MCA v1.0, API v1.0, Component v1.0.1)
MCA ns: replica (MCA v1.0, API v1.0, Component v1.0.1)
MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.0.1)
MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.0.1)
MCA ras: localhost (MCA v1.0, API v1.0, Component v1.0.1)
MCA ras: slurm (MCA v1.0, API v1.0, Component v1.0.1)
MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.0.1)
MCA rds: resfile (MCA v1.0, API v1.0, Component v1.0.1)
MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.0.1)
MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.0.1)
MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.0.1)
MCA rml: oob (MCA v1.0, API v1.0, Component v1.0.1)
MCA pls: daemon (MCA v1.0, API v1.0, Component v1.0.1)
MCA pls: fork (MCA v1.0, API v1.0, Component v1.0.1)
MCA pls: proxy (MCA v1.0, API v1.0, Component v1.0.1)
MCA pls: rsh (MCA v1.0, API v1.0, Component v1.0.1)
MCA pls: slurm (MCA v1.0, API v1.0, Component v1.0.1)
MCA sds: env (MCA v1.0, API v1.0, Component v1.0.1)
MCA sds: pipe (MCA v1.0, API v1.0, Component v1.0.1)
MCA sds: seed (MCA v1.0, API v1.0, Component v1.0.1)
MCA sds: singleton (MCA v1.0, API v1.0, Component v1.0.1)
MCA sds: slurm (MCA v1.0, API v1.0, Component v1.0.1)
(Attachment: config.log.bz2)