Hi all,

I am having trouble with PGI on Mac OS X 10.6.6. PGI's support staff has informed me that PGI does not "support 64-bit shared library creation" on the Mac. Therefore, I have built Open MPI in static-only mode (--disable-shared --enable-static).

I have to do some manipulation to get my application to pass the final linking stage (more on that at the bottom), but I get an immediate crash at runtime:


<<<<<<<<<<<<<<<<<<<<<<<< start of output
bash-3.2$ mpirun -np 4 oceanG ocean_upwelling.in
[flask.marine.rutgers.edu:14186] opal_ifinit: unable to find network interfaces.
[flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in file ess_hnp_module.c at line 181
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rml_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in file orterun.c at line 543
>>>>>>>>>>>>>>>>>>>>>>>> end of output


When I search for this error, the only result I find is a patch to version 1.1.2, which doesn't even resemble the current state of the Open MPI code.

iMac info:

ProductName:    Mac OS X
ProductVersion: 10.6.6
BuildVersion:   10J567

Has anyone seen this before, or does anyone have an idea what to try?

Thanks,
Dave

P.S. I get the same results with Open MPI configured with:

./configure --prefix=/opt/pgisoft/openmpi/openmpi-1.4.3 CC=pgcc CXX=pgcpp F77=pgf77 FC=pgf90 --enable-mpirun-prefix-by-default --disable-shared --enable-static --without-memory-manager --without-libnuma --disable-ipv6 --disable-io-romio --disable-heterogeneous --enable-mpi-f77 --enable-mpi-f90 --enable-mpi-profile

and

./configure --prefix=/opt/pgisoft/openmpi/openmpi-1.4.3 CC=pgcc CXX=pgcpp F77=pgf77 FC=pgf90 --disable-shared --enable-static



P.P.S. Linking workarounds:

Snow Leopard ships with Open MPI libraries that interfere when linking programs built with my compiled mpif90. The problem is that 'ld' searches every directory in the search path for shared objects before it will look for static archives. That means a line like:

pgf90 x.o -o a.out -L/opt/openmpi/lib -lmpi_f90 -lmpi_f77 -lmpi

will use the .a files in /opt/openmpi/lib for the Fortran libraries, because Snow Leopard doesn't ship with Fortran bindings, but when it gets to -lmpi it picks up libmpi.dylib from /usr/lib, which causes undefined references. Note that the line above is inferred from the -show:link option to mpif90.

I have found two workarounds for this. The first is to edit the share/openmpi/mpif90-wrapper-data.txt file to use full paths to the static libraries (this is what the PGI-shipped version of Open MPI does). The other option is to add the line:

switch -search_paths_first is replace(-search_paths_first) positional(linker);

to the /path/to/pgi/bin/siterc file and set LDFLAGS to -search_paths_first in my application.

From the ld manpage:

-search_paths_first
      By default the -lx and -weak-lx options first search for a file
      of the form `libx.dylib' in each directory in the library search
      path, then a file of the form `libx.a' is searched for in the
      library search paths.  This option changes it so that in each
      path `libx.dylib' is searched for then `libx.a' before the next
      path in the library search path is searched.
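
To make the difference concrete, here is a small Python sketch of the two search orders as the manpage excerpt describes them. It is only a toy model, not how ld is actually implemented: the find_library and demo names and the directory names opt_openmpi_lib and usr_lib are made up for illustration, and I am assuming that /usr/lib is searched after the -L directory.

```python
import os
import tempfile


def find_library(name, search_dirs, search_paths_first=False):
    """Return the file a simplified 'ld' model would pick for -l<name>."""
    dylib = "lib%s.dylib" % name
    archive = "lib%s.a" % name
    if search_paths_first:
        # -search_paths_first: in each directory, try the .dylib and then
        # the .a before moving on to the next directory.
        for d in search_dirs:
            for fname in (dylib, archive):
                path = os.path.join(d, fname)
                if os.path.exists(path):
                    return path
    else:
        # Default: look for the .dylib in every directory first, and only
        # then look for the .a in every directory.
        for fname in (dylib, archive):
            for d in search_dirs:
                path = os.path.join(d, fname)
                if os.path.exists(path):
                    return path
    return None


def demo():
    """Build a fake layout like the one in this post and resolve each -l."""
    results = {}
    with tempfile.TemporaryDirectory() as root:
        mine = os.path.join(root, "opt_openmpi_lib")  # stands in for /opt/openmpi/lib
        system = os.path.join(root, "usr_lib")        # stands in for /usr/lib
        os.makedirs(mine)
        os.makedirs(system)
        # My static build has both archives; the system has only libmpi.dylib.
        for d, fname in ((mine, "libmpi_f90.a"),
                         (mine, "libmpi.a"),
                         (system, "libmpi.dylib")):
            open(os.path.join(d, fname), "w").close()

        search_dirs = [mine, system]  # assumed order: -L dir, then system dir
        for name in ("mpi_f90", "mpi"):
            for flag in (False, True):
                found = find_library(name, search_dirs, search_paths_first=flag)
                label = os.path.basename(os.path.dirname(found))
                results[(name, flag)] = label + "/" + os.path.basename(found)
    return results


if __name__ == "__main__":
    for (name, flag), picked in sorted(demo().items()):
        print("-l%s search_paths_first=%s -> %s" % (name, flag, picked))
```

In this model -lmpi_f90 resolves to my static archive either way, since no .dylib for it exists anywhere, while -lmpi only resolves to my libmpi.a when search_paths_first is set. That matches the mixed linking I described above.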
