Thanks Jeff.
It turns out that all our IB blades are EM64T - it's just that some have
i686 OSes and some have x86_64 OSes. So I think we'll move to all x86_64
installs on IB hosts. I guess if we make OpenMPI a 32-bit build, and
link against 32-bit IB drivers (my interpretation of the release notes
is that this is supported by the Topspin drivers for EM64T), then the
same application could run on any host, i686 or x86_64. Can this be done
with the OFED drivers? I assume that OpenMPI doesn't handle the same
MPI_COMM_WORLD with different interconnects (TCP vs. IB) - is that right?
Cheers,
Aaron
Jeff Squyres wrote:
On Dec 5, 2006, at 7:12 PM, Aaron McDonough wrote:
We have a mix of i686 and x86_64 SLES9 nodes, some with IB interfaces
and some without. Ideally, we want users to be able to run the same
binary on any node. Can I build a common OpenMPI for both platforms that
will work with either 32 or 64 bit IB drivers (Topspin)? Or do I have to
use all 32 bit IB drivers? Any advice is appreciated.
A few things:
1. OMPI's heterogeneity support in the 1.1 series has some known
issues (e.g., mixing 32 and 64 bit executables in a single
MPI_COMM_WORLD). The development head (and therefore the upcoming
1.2 series) has many bug fixes in this area (but is still not perfect
-- there are a few open tickets about heterogeneity that are actively
being worked on).
2. In theory, using OMPI with 32 bit IB support in the same
MPI_COMM_WORLD with OMPI with 64 bit IB support *should* be ok
(subject to the constraints of #1), but I don't know if anyone has
tested this configuration. Brad/IBM -- have you guys done so?
3. Depending on your needs, Cisco is recommending moving away from
the Topspin drivers and to the OFED IB driver stack for HPC
clusters. The VAPI (i.e., Topspin drivers) support in Open MPI is
pretty static; we'll do critical bug fixes for it, but no new
features and very little functionality testing is occurring there.
All new work is being done on the OpenIB (a.k.a. OpenFabrics a.k.a.
OFED) drivers for Open MPI.
--
Aaron McDonough
Application and User Support
CSIRO High Performance Scientific Computing
E-mail: aaron.mcdono...@csiro.au
Phone: +61 3 9669-8133, Fax +61 3 9669-8112
Web: http://www.hpsc.csiro.au