Hi all,
I am having trouble with Open MPI 1.4.3 built with the PGI compilers on
Mac OS X 10.6.6. PGI's support staff has informed me that PGI does not
"support 64-bit shared library creation" on the Mac. Therefore, I have
built Open MPI in static-only mode (--disable-shared --enable-static).
I have to do some manipulation to get my application through the final
link stage (more on that at the bottom), but I get an immediate crash
at runtime:
<<<<<<<<<<<<<<<<<<<<<<<< start of output
bash-3.2$ mpirun -np 4 oceanG ocean_upwelling.in
[flask.marine.rutgers.edu:14186] opal_ifinit: unable to find network
interfaces.
[flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in
file ess_hnp_module.c at line 181
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_rml_base_select failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in
file runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_ess_set_name failed
--> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in
file orterun.c at line 543
>>>>>>>>>>>>>>>>>>>>>>>> end of output
When I google for this error, the only result I find is a patch to
version 1.1.2, which doesn't even resemble the current state of the
Open MPI code.
iMac info:
ProductName: Mac OS X
ProductVersion: 10.6.6
BuildVersion: 10J567
Has anyone seen this before, or does anyone have an idea what to try?
Thanks,
Dave
P.S. I get the same results with Open MPI configured with:
./configure --prefix=/opt/pgisoft/openmpi/openmpi-1.4.3 CC=pgcc
CXX=pgcpp F77=pgf77 FC=pgf90 --enable-mpirun-prefix-by-default
--disable-shared --enable-static --without-memory-manager
--without-libnuma --disable-ipv6 --disable-io-romio
--disable-heterogeneous --enable-mpi-f77 --enable-mpi-f90
--enable-mpi-profile
and
./configure --prefix=/opt/pgisoft/openmpi/openmpi-1.4.3 CC=pgcc
CXX=pgcpp F77=pgf77 FC=pgf90 --disable-shared --enable-static
P.P.S. Linking workarounds:
Snow Leopard ships with its own Open MPI libraries, and they interfere
when linking programs with the mpif90 I built. The problem is that 'ld'
searches every directory in the search path for shared libraries before
it looks for static archives. That means a line like:
pgf90 x.o -o a.out -L/opt/openmpi/lib -lmpi_f90 -lmpi_f77 -lmpi
will use the .a files in /opt/openmpi/lib for -lmpi_f90 and -lmpi_f77,
because Snow Leopard doesn't ship with Fortran bindings, but when it
gets to -lmpi it picks up libmpi.dylib from /usr/lib, which causes
undefined references. Note that the line above is inferred from the
output of the -showme:link option to mpif90.
I have found two workarounds for this. One is to edit the
share/openmpi/mpif90-wrapper-data.txt file so that it uses full paths
to the static libraries (this is what the PGI-shipped version of Open
MPI does). The other is to add the line:
switch -search_paths_first is replace(-search_paths_first)
positional(linker);
to the /path/to/pgi/bin/siterc file and set LDFLAGS to
-search_paths_first when building my application.
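For anyone who wants to see what the first workaround amounts to, here
is a rough sketch in Python (purely illustrative). It rewrites selected
-l flags into full paths to static archives, which is the effect of
editing the wrapper data file. The function name to_static_paths and
its arguments are my own invention, not part of Open MPI or PGI, and
the real wrapper data file has its own format that this does not parse.

```python
def to_static_paths(link_args, libdir, static_libs):
    """Replace -l<name> with <libdir>/lib<name>.a for names in static_libs.

    link_args   : list of linker arguments, e.g. from the mpif90 link line
    libdir      : directory holding the static archives (assumed to exist)
    static_libs : set of library names to pin to their .a files
    All other arguments are passed through unchanged.
    """
    rewritten = []
    for arg in link_args:
        name = arg[2:] if arg.startswith("-l") else None
        if name in static_libs:
            # A full path leaves ld nothing to search for, so a
            # libmpi.dylib elsewhere on the path cannot be picked up.
            rewritten.append(f"{libdir}/lib{name}.a")
        else:
            rewritten.append(arg)
    return rewritten


if __name__ == "__main__":
    args = ["-L/opt/openmpi/lib", "-lmpi_f90", "-lmpi_f77", "-lmpi"]
    print(" ".join(to_static_paths(args, "/opt/openmpi/lib",
                                   {"mpi_f90", "mpi_f77", "mpi"})))
```

With the link line from above, every MPI library ends up as an explicit
/opt/openmpi/lib/lib*.a path, so the search order no longer matters for
those libraries.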
From the ld man page:
-search_paths_first
By default the -lx and -weak-lx options first search for a file
of the form `libx.dylib' in each directory in the library search
path, then a file of the form `libx.a' is searched for in the
library search paths. This option changes it so that in each
path `libx.dylib' is searched for then `libx.a' before the next
path in the library search path is searched.
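
To make the two search orders concrete, here is a toy model in Python.
It is not ld itself, only a sketch of the behavior the man page
describes. The file layout is an assumption based on my setup: static
MPI archives in /opt/openmpi/lib and the system libmpi.dylib in
/usr/lib. The names resolve_library, EXISTING_FILES and SEARCH_DIRS are
made up for this example.

```python
# Assumed layout: my static build plus the libmpi.dylib Snow Leopard ships.
EXISTING_FILES = {
    "/opt/openmpi/lib/libmpi_f90.a",
    "/opt/openmpi/lib/libmpi_f77.a",
    "/opt/openmpi/lib/libmpi.a",
    "/usr/lib/libmpi.dylib",
}
SEARCH_DIRS = ["/opt/openmpi/lib", "/usr/lib"]


def resolve_library(name, search_dirs, files, search_paths_first=False):
    """Return the file this model picks for -l<name>, or None if not found."""
    candidates = (f"lib{name}.dylib", f"lib{name}.a")
    if search_paths_first:
        # Per directory: try the .dylib, then the .a, then move on.
        for directory in search_dirs:
            for filename in candidates:
                path = f"{directory}/{filename}"
                if path in files:
                    return path
        return None
    # Default: look for the .dylib in every directory first,
    # and only then look for the .a in every directory.
    for filename in candidates:
        for directory in search_dirs:
            path = f"{directory}/{filename}"
            if path in files:
                return path
    return None


if __name__ == "__main__":
    for lib in ("mpi_f90", "mpi"):
        print(lib, "default:",
              resolve_library(lib, SEARCH_DIRS, EXISTING_FILES))
        print(lib, "search_paths_first:",
              resolve_library(lib, SEARCH_DIRS, EXISTING_FILES, True))
```

In this model, -lmpi resolves to /usr/lib/libmpi.dylib under the default
order and to /opt/openmpi/lib/libmpi.a with search_paths_first, while
-lmpi_f90 resolves to the static archive either way because no dylib of
that name exists.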