On Mar 1, 2011, at 1:34 PM, David Robertson wrote:

> Hi,
> 
> > Error means OMPI didn't find a network interface - do you have your
> > networks turned off? Sometimes people travel with Airport turned off.
> > If you have no wire connected, then no interfaces exist.
> 
> I am logged in to the machine remotely through the wired interface. The 
> Airport is always off. I have Open MPI built and running fine with gcc/ifort 
> and gcc/gfortran using shared libraries. I have compiled and run successfully 
> with both shared and static libraries with gcc/ifort. I have not tried the 
> static libraries with gfortran/gcc.
> 
> ifconfig gives me:
> 
> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
>        inet6 ::1 prefixlen 128
>        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
>        inet 127.0.0.1 netmask 0xff000000
> gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
> stf0: flags=0<> mtu 1280
> en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>        ether 10:9a:dd:55:bb:52
>        inet6 fe80::129a:ddff:fe55:bb52%en0 prefixlen 64 scopeid 0x4
>        inet 192.168.30.13 netmask 0xffffc000 broadcast 192.168.63.255
>        media: autoselect (1000baseT <full-duplex>)
>        status: active
> fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078
>        lladdr 70:cd:60:ff:fe:2f:01:8e
>        media: autoselect <full-duplex>
>        status: inactive
> en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>        ether c8:bc:c8:c9:fc:a9
>        media: autoselect (<unknown type>)
>        status: inactive
> vnic0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>        ether 00:1c:42:00:00:08
>        inet 10.211.55.2 netmask 0xffffff00 broadcast 10.211.55.255
>        media: autoselect
>        status: active
> vnic1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>        ether 00:1c:42:00:00:09
>        inet 10.37.129.2 netmask 0xffffff00 broadcast 10.37.129.255
>        media: autoselect
>        status: active
> vboxnet0: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>        ether 0a:00:27:00:00:00
> 
> Are you saying that Open MPI is only looking for the Airport (en1) card and 
> not en0?

No, it isn't. The error message means what I indicated: Open MPI is failing 
because it gets an error when it tries to open a port on an available network 
interface. I can't debug your network to find out why. I know that Mac OS X 
doesn't really like (nor does Apple really support) static builds, and it has 
been a long time since I built Open MPI that way on my Mac. Looking at my old 
static config file, I don't see anything special in it.

That said, I know we had some early problems with static builds on the Mac 
(like I said, Apple doesn't really support it). Those were solved, though, and 
none of those problems had this symptom.

Could be something strange about PGI and socket libs when running static, but I 
wouldn't know - I don't use PGI.

Sorry I can't be of more help - I suggest asking PGI about socket support with 
their compiler on the Mac, or not using PGI at all if they only support static 
builds, given Apple's lack of support for that mode of operation on the Mac 
(it seems bizarre that PGI would require it).


> Why would it do that for PGI only?


It doesn't, nor does it care what compiler is used.

> 
> Thanks,
> Dave
> 
> 
> On Mar 1, 2011, at 11:50 AM, David Robertson <robertson_at_[hidden]> wrote:
> 
> > Hi all,
> >
> > I am having trouble with PGI on Mac OS X 10.6.6. PGI's support staff has 
> > informed me that PGI does not "support 64-bit shared library creation" on 
> > the Mac. Therefore, I have built Open MPI in static only mode 
> > (--disable-shared --enable-static).
> >
> > I have to do some manipulation to get my application to pass the final 
> > linking stage (more on that at the bottom) but I get an immediate crash at 
> > runtime:
> >
> >
> > <<<<<<<<<<<<<<<<<<<<<<<< start of output
> > bash-3.2$ mpirun -np 4 oceanG ocean_upwelling.in
> > [flask.marine.rutgers.edu:14186] opal_ifinit: unable to find network 
> > interfaces.
> > [flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in 
> > file ess_hnp_module.c at line 181
> > --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > orte_rml_base_select failed
> > --> Returned value Error (-1) instead of ORTE_SUCCESS
> > --------------------------------------------------------------------------
> > [flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in 
> > file runtime/orte_init.c at line 132
> > --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort. There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems. This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> >
> > orte_ess_set_name failed
> > --> Returned value Error (-1) instead of ORTE_SUCCESS
> > --------------------------------------------------------------------------
> > [flask.marine.rutgers.edu:14186] [[65522,0],0] ORTE_ERROR_LOG: Error in 
> > file orterun.c at line 543
> > >>>>>>>>>>>>>>>>>>>>>>>> end of output
> >
> >
> > When I google this error, the only result I find is a patch for 
> > version 1.1.2, which doesn't even resemble the current state of the Open MPI 
> > code.
> >
> > iMac info:
> >
> > ProductName: Mac OS X
> > ProductVersion: 10.6.6
> > BuildVersion: 10J567
> >
> > Has anyone seen this before or have an idea what to try?
> >
> > Thanks,
> > Dave
> >
> > P.S. I get the same results with Open MPI configured with:
> >
> > ./configure --prefix=/opt/pgisoft/openmpi/openmpi-1.4.3 CC=pgcc CXX=pgcpp 
> > F77=pgf77 FC=pgf90 --enable-mpirun-prefix-by-default --disable-shared 
> > --enable-static --without-memory-manager --without-libnuma --disable-ipv6 
> > --disable-io-romio --disable-heterogeneous --enable-mpi-f77 
> > --enable-mpi-f90 --enable-mpi-profile
> >
> > and
> >
> > ./configure --prefix=/opt/pgisoft/openmpi/openmpi-1.4.3 CC=pgcc CXX=pgcpp 
> > F77=pgf77 FC=pgf90 --disable-shared --enable-static
> >
> >
> >
> > P.P.S. Linking workarounds:
> >
> > Snow Leopard ships with Open MPI libraries that interfere when linking 
> > programs built with my compiled mpif90. The problem is that 'ld' searches 
> > every directory in the search path for shared objects before it looks for 
> > static archives at all. That means a line like:
> >
> > pgf90 x.o -o a.out -L/opt/openmpi/lib -lmpi_f90 -lmpi_f77 -lmpi
> >
> > will use the .a files in /opt/openmpi/lib for the Fortran bindings (Snow 
> > Leopard doesn't ship any), but when it gets to -lmpi it picks up 
> > libmpi.dylib from /usr/lib, causing undefined references. The line 
> > above was inferred using the -show:link option to mpif90.
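> >
> > Concretely, the default search for -lmpi proceeds roughly like this 
> > (a schematic, not actual linker output; /usr/lib is implicitly on the 
> > search path after any -L directories):
> >
> >     dylib pass:   /opt/openmpi/lib/libmpi.dylib  -> not found (static-only build)
> >                   /usr/lib/libmpi.dylib          -> found: Apple's copy, no Fortran symbols
> >     archive pass: /opt/openmpi/lib/libmpi.a      -> never reached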
> >
> > I have found two workarounds. One is to edit the 
> > share/openmpi/mpif90-wrapper-data.txt file to use full paths to the static 
> > libraries (this is what the PGI-shipped version of Open MPI does). The 
> > other is to add the line:
> >
> > switch -search_paths_first is replace(-search_paths_first) 
> > positional(linker);
> >
> > to the /path/to/pgi/bin/siterc file and set LDFLAGS to -search_paths_first 
> > in my application.
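> >
> > For reference, the first workaround amounts to changing the libs line in 
> > share/openmpi/mpif90-wrapper-data.txt roughly as follows (schematic only - 
> > the exact library list varies by Open MPI version, the paths reflect my 
> > install prefix, and it is all one line in the actual file):
> >
> >     libs=-lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -lutil
> >
> > becomes
> >
> >     libs=/opt/pgisoft/openmpi/openmpi-1.4.3/lib/libmpi_f90.a
> >          /opt/pgisoft/openmpi/openmpi-1.4.3/lib/libmpi_f77.a
> >          /opt/pgisoft/openmpi/openmpi-1.4.3/lib/libmpi.a
> >          /opt/pgisoft/openmpi/openmpi-1.4.3/lib/libopen-rte.a
> >          /opt/pgisoft/openmpi/openmpi-1.4.3/lib/libopen-pal.a -lutil
> >
> > With full archive paths, ld loads the files directly and its dylib-first 
> > search order never comes into play.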
> >
> > from the ld manpage:
> >
> > -search_paths_first
> >         By default the -lx and -weak-lx options first search for a file
> >         of the form `libx.dylib' in each directory in the library search
> >         path, then a file of the form `libx.a' is searched for in the
> >         library search paths. This option changes it so that in each
> >         path `libx.dylib' is searched for then `libx.a' before the next
> >         path in the library search path is searched.
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users

