Hi all,
Sorry for the long layoff on this, but I have been dealing with other
issues. I'll report what I've learned, then answer Jeff's questions.
First of all, Doug Reeder is correct that the Apple-supplied Open MPI
is conflicting with mine. Furthermore, it conflicts with both shared
and static libraries.
From ld man page:
By default the -lx and -weak-lx options first search for a file
of the form `libx.dylib' in each directory in the library search
path, then a file of the form `libx.a' is searched for in the
library search paths.
This is the reason for my compile-time and/or runtime errors with
static libraries, and for the need to add my Open MPI lib directory to
DYLD_LIBRARY_PATH to avoid freezes with dynamic libraries. The shared
case is obvious: you want your executable to load the MPI libraries you
compiled, not the ones that ship with the Mac.
The static failures are more telling. Near the end of the Open MPI
configure script, the *-wrapper-data.txt files are created. Even if you
tell Open MPI you only want static libraries (--disable-shared
--enable-static), the libs variable in these files still uses the -lx form.
For shared libraries this isn't an issue at compile time because the
wrapper makes sure your shared libraries are found first via the -L
option. At runtime, however, you still need your compiled Open MPI lib
path set in DYLD_LIBRARY_PATH.
However, if you have compiled static-only libraries, it doesn't matter
where in the search path you put your compiled lib directory: at compile
time ld will pick up the system's libmpi.dylib before it finds your
libmpi.a. This can result in undefined references with some compilers.
Even without undefined reference errors, you have still linked in the
wrong symbols, leading to runtime freezes. Note that libmpi_f90.a and
libmpi_f77.a are unaffected because Mac OS X 10.6 does not ship its
Open MPI with Fortran libs.
I have found two solutions to this. One is to replace the -lx
occurrences in share/openmpi/*-wrapper-data.txt with the full paths to
the static libraries, i.e. changing:
libs=-lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -lutil
to (all on one line in the file):
libs=/path/to/openmpi/lib/libmpi_f90.a /path/to/openmpi/lib/libmpi_f77.a
/path/to/openmpi/lib/libmpi.a /path/to/openmpi/lib/libopen-rte.a
/path/to/openmpi/lib/libopen-pal.a -lutil
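
The edit above can be scripted. Here is a minimal Python sketch, not
something Open MPI ships. The function names static_libs_line and
rewrite_wrapper_data are made up for illustration, and it assumes that
each wrapper-data file keeps its libs= value on a single line and that
-lutil should stay as it is because it is a system library.

```python
from pathlib import Path


def static_libs_line(line, libdir, keep_dynamic=("util",)):
    """Rewrite a 'libs=' line so each -lNAME becomes LIBDIR/libNAME.a.

    Names listed in keep_dynamic (assumed to be system libraries, such
    as -lutil) are left in the -lNAME form. Other lines pass through.
    """
    if not line.startswith("libs="):
        return line
    rewritten = []
    for token in line[len("libs="):].split():
        if token.startswith("-l") and token[2:] not in keep_dynamic:
            rewritten.append(f"{libdir}/lib{token[2:]}.a")
        else:
            rewritten.append(token)
    return "libs=" + " ".join(rewritten)


def rewrite_wrapper_data(path, libdir):
    """Apply static_libs_line to every line of one wrapper-data file."""
    p = Path(path)
    lines = p.read_text().splitlines()
    p.write_text("\n".join(static_libs_line(l, libdir) for l in lines) + "\n")


# Example on a string only; no files are touched here.
print(static_libs_line("libs=-lmpi -lopen-rte -lopen-pal -lutil",
                       "/path/to/openmpi/lib"))
```

It is probably wise to back up the *-wrapper-data.txt files before
running rewrite_wrapper_data on them.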
The second solution is to set linker flags for my application builds to
'-search_paths_first'. This will force ld to search each path for .dylib
and .a files before moving on to the next path in the search order. I
don't know how many linkers will listen to this flag though.
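
To make the difference between the two search orders concrete, here is
a small Python model of the behavior described in the ld man page
excerpt. It is only a simulation: resolve is a made-up name, the set
of files stands in for the real filesystem, and /usr/lib as the home
of the system libmpi.dylib is an assumption for the example.

```python
def resolve(name, search_dirs, files, search_paths_first=False):
    """Return the path a -l<name> lookup would pick, or None.

    files is a set of paths assumed to exist (a stand-in for the disk).
    """
    dylibs = [f"{d}/lib{name}.dylib" for d in search_dirs]
    statics = [f"{d}/lib{name}.a" for d in search_dirs]
    if search_paths_first:
        # Per directory: try .dylib, then .a, then move to the next one.
        candidates = [p for pair in zip(dylibs, statics) for p in pair]
    else:
        # Default: every directory for .dylib, then every one for .a.
        candidates = dylibs + statics
    return next((p for p in candidates if p in files), None)


# Illustrative layout: a static-only install ahead of the system directory.
dirs = ["/opt/gfortransoft/openmpi/lib", "/usr/lib"]
files = {"/opt/gfortransoft/openmpi/lib/libmpi.a", "/usr/lib/libmpi.dylib"}

print(resolve("mpi", dirs, files))                           # system dylib
print(resolve("mpi", dirs, files, search_paths_first=True))  # my libmpi.a
```

With the default order the system dylib wins even though my directory
is listed first; with the per-directory order my static library wins.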
I would think having configure use full paths when static-only libraries
are built would be the best solution, but perhaps there is some other
intricacy that will not allow this.
Now to answer Jeff's questions:
---- Quote ----
Hello,
You may be bumping into conflicts between the apple supplied ompi and
your mpi. I use modules to force my mpi to the front of the PATH and
DYLD_LIBRARY_PATH variables.
Doug Reeder
On Dec 15, 2010, at 5:22 PM, Jeff Squyres wrote:
Sorry for the ginormous delay in replying here; I blame SC'10,
Thanksgiving, and the MPI Forum meeting last week...
On Nov 29, 2010, at 2:12 PM, David Robertson wrote:
I'm noticing a strange problem with Open MPI 1.4.2 on Mac OS X 10.6.
We use both Intel Ifort 11.1 and gfortran 4.3 on the same machine and
switch between them to test and debug code.
I had runtime problems when I compiled openmpi in my usual way of no
shared libraries so I switched to shared and it runs now.
What problems did you have? OMPI should work fine when compiled
statically.
As described above, the problem turned out to be that the
system-supplied shared libraries were getting mixed with the Fortran
libraries I compiled, which caused MPI to hang close to initialization.
However, in order for it to work with ifort I ended up needing to
add the location of my intel compiled Open MPI libraries
(/opt/intelsoft/openmpi/lib) to my DYLD_LIBRARY_PATH environment
variable to get codes to compile and/or run with ifort.
Is this what Intel recommends for anything compiled with ifort on OS
X, or is this unique to OMPI-compiled MPI applications?
Exclusive to OMPI-compiled apps as far as I know.
The problem is that adding /opt/intelsoft/openmpi/lib to
DYLD_LIBRARY_PATH broke my Open MPI for gfortran. Now when I try to
compile with mpif90 for gfortran it thinks it's actually trying to
compile with ifort still. As soon as I take the above path out of
DYLD_LIBRARY_PATH everything works fine.
Also, when I run ompi_info everything looks right except prefix. It
says /opt/intelsoft/openmpi rather than /opt/gfortransoft/openmpi like
it should. It should be noted that having /opt/intelsoft/openmpi in
LD_LIBRARY_PATH does not produce the same effect.
I'm not quite clear on your setup, but it *sounds* like you're
somehow mixing up 2 different installations of OMPI -- one in
/opt/intelsoft and the other in /opt/gfortransoft.
Can you verify that you're using the "right" mpif77 (and friends)
when you intend to, and so on?
Yes, I am positive that my path and everything else was right and that
I was using the right mpif90. I don't know how ompi_info works, but it
appears that it reads the wrapper executable and/or runs "which mpif90",
etc., to get the compiler names and full paths. However, it seems to use
the shared library search path to determine what to report for the
"prefix".
When I execute the actual wrappers
(/opt/gfortransoft/openmpi/bin/mpif90) it uses ifort as the compiler.
But, if I take /opt/intelsoft/openmpi/lib out of DYLD_LIBRARY_PATH,
/opt/gfortransoft/openmpi/bin/mpif90 uses gfortran as the compiler.
Also, with /opt/intelsoft/openmpi/lib removed from DYLD_LIBRARY_PATH
ompi_info reports the correct prefix.
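
One way to keep the two installs from stepping on each other, in the
spirit of Doug's modules suggestion, would be to set PATH and
DYLD_LIBRARY_PATH per command rather than globally. A minimal Python
sketch of that idea follows. The helper names env_for_prefix and
run_with_prefix are hypothetical, and it assumes each install keeps its
wrappers in <prefix>/bin and its libraries in <prefix>/lib.

```python
import os
import subprocess


def env_for_prefix(prefix, base_env=None):
    """Copy the environment with <prefix>/bin and <prefix>/lib put first."""
    env = dict(os.environ if base_env is None else base_env)
    additions = (("PATH", os.path.join(prefix, "bin")),
                 ("DYLD_LIBRARY_PATH", os.path.join(prefix, "lib")))
    for var, new in additions:
        # Drop empty entries and any earlier copy of the same directory.
        rest = [p for p in env.get(var, "").split(os.pathsep)
                if p and p != new]
        env[var] = os.pathsep.join([new] + rest)
    return env


def run_with_prefix(prefix, argv):
    """Run one command with the adjusted environment; return its exit code."""
    return subprocess.run(argv, env=env_for_prefix(prefix)).returncode


# Example on a hand-built environment; nothing is executed here.
base = {"PATH": "/usr/bin",
        "DYLD_LIBRARY_PATH": "/opt/intelsoft/openmpi/lib"}
print(env_for_prefix("/opt/gfortransoft/openmpi", base)["DYLD_LIBRARY_PATH"])
```

Something like run_with_prefix("/opt/gfortransoft/openmpi",
["mpif90", "-showme"]) would then show which compiler that wrapper
invokes without changing the login environment.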
Dave