I have just created a small cluster consisting of three nodes:
bellhuey AMD 64 with 4 cores
wolf1 AMD 64 with 2 cores
wolf2 AMD 64 with 2 cores
The host file is:
bellhuey slots=4
wolf1 slots=2
wolf2 slots=2
bellhuey is the master, and wolf1 and wolf2 share the /usr and /home file
systems via NFS.
I am running Open MPI 1.4.1.
I have the following simple program:
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main (int argc, char* argv[]) {
    int myid, numprocs;
    char me[255];
    int n;  /* set only on rank 0; other ranks hold garbage until MPI_Bcast */

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    gethostname(me, 254);
    printf("Hello from %s I am process %d of %d\n", me, myid, numprocs);

    /* Only the root (rank 0) has a meaningful value before the broadcast. */
    if (myid == 0) {
        n = 12345;
    }
    printf("Call to MPI_Bcast n==%d on %s myid=%d\n", n, me, myid);

    /* Every rank should leave this call with n == 12345. */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Return from MPI_Bcast n==%d on %s myid=%d\n", n, me, myid);

    MPI_Finalize();
    return 0;
}
If I run this with
mpirun -np 8 hello
it works fine, but all processes run on bellhuey.
If I run this with
mpirun -np 8 --hostfile host hello
I get the following:
Hello from bellhuey I am process 0 of 8
Call to MPI_Bcast n==12345 on bellhuey myid=0
Hello from bellhuey I am process 1 of 8
Call to MPI_Bcast n==32767 on bellhuey myid=1
Hello from bellhuey I am process 2 of 8
Call to MPI_Bcast n==32767 on bellhuey myid=2
Hello from wolf1 I am process 5 of 8
Call to MPI_Bcast n==32767 on wolf1 myid=5
Hello from bellhuey I am process 3 of 8
Call to MPI_Bcast n==32767 on bellhuey myid=3
Hello from wolf2 I am process 7 of 8
Call to MPI_Bcast n==32767 on wolf2 myid=7
Hello from wolf2 I am process 6 of 8
Call to MPI_Bcast n==32767 on wolf2 myid=6
Hello from wolf1 I am process 4 of 8
Call to MPI_Bcast n==32767 on wolf1 myid=4
As expected, four processes are started on bellhuey and two processes each
on wolf1 and wolf2.
However, none of the calls to MPI_Bcast returns! No "Return from MPI_Bcast"
line is ever printed.
Any help would be appreciated.
Paul Wolfgang