On Apr 9, 2007, at 12:36 PM, Brian Barrett wrote:

On Apr 6, 2007, at 7:22 AM, Werner Van Geit wrote:

In our lab we are installing OpenMPI onto our Apple cluster. The cluster contains a couple of PowerPC G5 nodes and the new Intel Xeon Xserves, all with a clean install of Mac OS X Server 10.4.8, Xcode 2.4.1 and Sun Grid Engine 6 (so we're not using Xgrid). Since we want to make it one big cluster, we need Universal Binaries of OpenMPI.

We have been using the buildpackage.sh script from the contrib/dist/macosx-pkg directory of openmpi-1.2.tar.gz. If we run this script on an Intel machine, we get fat binaries (checked with the command-line "file" tool) that run on the Intel machine (we used "ompi_info" as the test), but not on the G5 machines. When running ompi_info on a G5, we get an error:

dyld: Symbol not found: _lt_libltdlc_LTX_preloaded_symbols
  Referenced from: /tmp/werner/mpi/lib/libopen-pal.0.dylib
  Expected in: flat namespace

Trace/BPT trap

I've managed to track this down a little, and there is definitely something wrong with the build of libltdl when we cross-compile on Darwin. I've looked a little bit, but haven't had any great success in determining what is going on. I'm going to start working with the Libtool developers to try to sort this one out, but it might take some time. In the meantime, you can build with either static or shared libraries, as long as you disable the dlopen() code with the --disable-dlopen argument (which you already discovered). Using shared libraries instead of static ones should still work for you, and it makes adding interconnects or rebuilding Open MPI a bit easier on your users.
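
For reference, the workaround could look roughly like the sketch below. It only prints the configure line rather than running it. The install prefix and the function name are placeholders I made up for illustration; --enable-shared, --disable-shared, --enable-static and --disable-static are the usual Libtool-provided configure switches.

```shell
# Sketch only: print a configure line that disables the dlopen() code.
# /opt/openmpi is a hypothetical install prefix; adjust to taste.
workaround_cmd() {
    linkage="$1"   # "shared" or "static"
    case "$linkage" in
        shared) flags="--enable-shared --disable-static" ;;
        static) flags="--enable-static --disable-shared" ;;
        *) echo "usage: workaround_cmd shared|static" >&2; return 1 ;;
    esac
    echo "./configure --prefix=/opt/openmpi --disable-dlopen $flags"
}

# Show the shared-library variant.
workaround_cmd shared
```

The static variant is the same call with "static" as the argument.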

I've filed a bug in Trac for the problem. You should get updates on the bug via e-mail and anyone else can follow along at:

    https://svn.open-mpi.org/trac/ompi/ticket/980

A little digging and I've found the problem. When building for architectures other than the build host (which is always either ppc or i386), Autoconf believes that we are cross-compiling. It really wants to prepend the target architecture to programs like gcc, g++, nm, ranlib, etc. It will generally figure out not to do that (since the system gcc is always a cross-compiler to any of the four platforms on OS X), but it appears that's not the case with one Libtool test for nm. This leads to a series of mistakes on the part of Libtool. The short solution is to specify NM="nm -p" when cross-compiling on OS X. I've updated the build script for v1.2 here:

https://svn.open-mpi.org/trac/ompi/browser/trunk/contrib/dist/macosx-pkg/buildpackage.sh?rev=14276

I'll make sure the updated script is part of the v1.2.1 release. The updates include some fixes to be a little smarter about what binaries to build and workarounds for various problems we've found in our x86_64 support and the NM issue.
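
If you would rather run configure by hand than use the script, the NM workaround might look something like the sketch below. It only prints one configure line per architecture. The -arch flags and the per-architecture loop are my assumptions for illustration and are not copied from buildpackage.sh; the function name is made up.

```shell
# Sketch only: print a configure line for one target architecture.
# NM="nm -p" keeps Libtool from going after an architecture-prefixed nm.
# The CFLAGS/CXXFLAGS values are assumptions, not taken from buildpackage.sh.
configure_cmd() {
    arch="$1"   # one of: ppc ppc64 i386 x86_64
    echo "./configure NM=\"nm -p\" CFLAGS=\"-arch $arch\" CXXFLAGS=\"-arch $arch\""
}

# One line per platform the system gcc can target on OS X.
for arch in ppc ppc64 i386 x86_64; do
    configure_cmd "$arch"
done
```

Each architecture would need its own build directory, and the results would still have to be merged into fat binaries afterwards (for example with lipo), which the sketch leaves out.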

Brian
