On Apr 9, 2007, at 12:36 PM, Brian Barrett wrote:
On Apr 6, 2007, at 7:22 AM, Werner Van Geit wrote:
In our lab we are installing OpenMPI onto our Apple cluster
computer. The cluster contains a couple of PowerPC G5 nodes and
the new Intel Xeon Xserves, all with a clean install of Mac OS X
Server 10.4.8, Xcode 2.4.1 and Sun Grid Engine 6 (so we're not
using XGrid). Since we want to make it 1 big cluster, we need
Universal Binaries of OpenMPI.
We have been using the buildpackage.sh script from the
contrib/dist/macosx-pkg directory of the openmpi-1.2.tar.gz file. If we
run this script on an Intel machine, we get fat binaries (checked
this with the command line "file" command) that run on the Intel
machine (we used "ompi_info" as the test), but not on the G5
machines. While running ompi_info there, we get an error:
dyld: Symbol not found: _lt_libltdlc_LTX_preloaded_symbols
Referenced from: /tmp/werner/mpi/lib/libopen-pal.0.dylib
Expected in: flat namespace
Trace/BPT trap
I've managed to track this down a little, and there is definitely
something wrong with the build of libltdl when we cross-compile on
Darwin. I've looked a little bit, but haven't had any great
success in determining what is going on. I'm going to start
working with the Libtool developers to try to sort this one out,
but it might take some time. In the meantime, you can build with
either static or shared libraries, as long as you disable the dlopen()
code with the argument --disable-dlopen (which you already discovered).
Using shared libraries instead of static should still work for you,
and makes adding interconnects or rebuilding Open MPI a bit easier
on your users.
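As a rough sketch of that interim workaround, the configure arguments
might look like the following. Only --disable-dlopen comes from the
discussion above; the install prefix is a hypothetical placeholder, and
--enable-shared / --disable-static are the standard Libtool switches
for selecting shared libraries, included here as an assumption about
how you would want to build. The snippet only prints the command so it
can be reviewed before being run from the top of the source tree.

```shell
#!/bin/sh
# PREFIX is a hypothetical install location; override it as needed.
PREFIX="${PREFIX:-/opt/openmpi}"

# Emit one configure argument per line.
configure_args() {
    printf '%s\n' \
        "--prefix=$PREFIX" \
        "--disable-dlopen" \
        "--enable-shared" \
        "--disable-static"
}

# Dry run: show the full command instead of executing it.
echo "./configure $(configure_args | tr '\n' ' ')"
```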
I've filed a bug in Trac for the problem. You should get updates
on the bug via e-mail and anyone else can follow along at:
https://svn.open-mpi.org/trac/ompi/ticket/980
A little digging and I've found the problem. When building for
architectures other than the build host (which is always either ppc
or i386), Autoconf believes that we are cross-compiling. It really
wants to prepend the target architecture to programs like gcc, g++,
nm, ranlib, etc. It will generally figure out not to do that (since
the system gcc is always a cross-compiler to any of the four
platforms on OS X), but it appears that's not the case with one
Libtool test for nm. This leads to a series of mistakes on the part
of Libtool. The short solution is to specify NM="nm -p" when
cross-compiling on OS X. I've updated the build script for v1.2 here:
https://svn.open-mpi.org/trac/ompi/browser/trunk/contrib/dist/macosx-pkg/buildpackage.sh?rev=14276
I'll make sure the updated script is part of the v1.2.1 release. The
updates make the script a little smarter about which binaries to
build, and add workarounds for the NM issue and for various problems
we've found in our x86_64 support.
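For anyone configuring by hand rather than through the script, the NM
workaround could be sketched roughly as below. This is not a copy of
what buildpackage.sh does: the configure_cmd helper is hypothetical, and
passing -arch through CFLAGS, CXXFLAGS and LDFLAGS is an assumption
about how each per-architecture build is selected. The only part taken
from the explanation above is NM="nm -p". The snippet prints one
configure command per architecture instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print a configure command for one target
# architecture, with NM pinned to the system nm so Libtool does not
# look for an architecture-prefixed one.
configure_cmd() {
    arch="$1"
    printf './configure NM="nm -p" CFLAGS="-arch %s" CXXFLAGS="-arch %s" LDFLAGS="-arch %s"\n' \
        "$arch" "$arch" "$arch"
}

# Dry run for the two 32-bit architectures discussed in this thread.
for arch in ppc i386; do
    configure_cmd "$arch"
done
```

The separately built trees would then still need to be merged into
fat binaries, which is the part the build script takes care of.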
Brian