Hi: I have a "cluster" consisting of a dual Opteron system (called a.lan) and a dual AthlonMP system (b.lan). Both systems are running Red Hat Enterprise Linux 4. The opteron system runs in 64-bit mode; the AthlonMP in 32-bit. I can't seem to make OpenMPI work between these two machines. I've tried 1.1.2, 1.1.3b1, and 1.2b1 and they all exhibit the same behavior, namely that Bcasts won't complete. Here's my simple.cpp test program:
#include <iostream>
#include "mpi.h"

int main( int argc, char* argv[] )
{
  MPI_Init( &argc, &argv );

  char hostname[256];
  int hostname_size = sizeof(hostname);
  MPI_Get_processor_name( hostname, &hostname_size );
  std::cout << "Running on " << hostname << std::endl;

  std::cout << hostname << " in to Bcast" << std::endl;
  double a = 3.14159;
  MPI_Bcast( &a, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD );
  std::cout << hostname << " out of Bcast" << std::endl;

  MPI_Finalize();
  return 0;
}

I compile this and run it with:

  mpirun --host a.lan --host b.lan simple

Generally, if I launch from a.lan, I see:

  Running on a.lan
  a.lan in to Bcast
  Running on b.lan
  a.lan out of Bcast
  b.lan in to Bcast
  <then both processes hang, with the one on b.lan at 100% CPU>

If I launch from b.lan, the reverse happens: the process on b.lan exits the Bcast, but the one on a.lan never does, and a.lan spins at 100% CPU.

On the other hand, I have another 32-bit system (just a plain Athlon running RHEL 4, called c.lan), and my test program runs fine between b.lan and c.lan.

I feel like I must be making an incredibly obvious mistake.

Thanks,
Allen

--
Allen Barnett
Transpire, Inc.
E-Mail: al...@transpireinc.com
Ph: 518-887-2930
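P.S. In case it helps narrow things down, here's the point-to-point variant I'm planning to try next (just a sketch along the same lines as simple.cpp; I haven't actually run it across the a.lan/b.lan pair yet). If a plain MPI_Send/MPI_Recv of a double also hangs or garbles the value, then the problem isn't specific to the Bcast collective. I'd launch it with the same mpirun command so it runs with exactly two processes, one per host:

#include <iostream>
#include "mpi.h"

int main( int argc, char* argv[] )
{
  MPI_Init( &argc, &argv );

  int rank;
  MPI_Comm_rank( MPI_COMM_WORLD, &rank );

  double a = 3.14159;
  if ( rank == 0 ) {
    // Rank 0 sends one double to rank 1 with tag 0.
    MPI_Send( &a, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD );
  }
  else if ( rank == 1 ) {
    // Rank 1 receives and prints it. Both machines are little-endian
    // x86 and a double is 8 bytes on each, so the value should arrive
    // intact; only OpenMPI's internal 32/64-bit handling differs.
    double b = 0.;
    MPI_Recv( &b, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
              MPI_STATUS_IGNORE );
    std::cout << "Rank 1 received " << b << std::endl;
  }

  MPI_Finalize();
  return 0;
}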