I have just created a small cluster consisting of three nodes:
   bellhuey   AMD 64 with 4 cores
   wolf1   AMD 64 with 2 cores
   wolf2   AMD 64 with 2 cores

The host file is:

bellhuey slots=4
wolf1 slots=2
wolf2 slots=2

bellhuey is the master; wolf1 and wolf2 share the /usr and /home file systems via NFS.
I am running Open MPI 1.4.1.
I have the following simple program:

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>
int main (int argc, char* argv[]) {
 int myid, numprocs;
 char me[255];
 int n;
 MPI_Init(&argc, &argv);
 MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
 MPI_Comm_rank(MPI_COMM_WORLD, &myid);
 gethostname(me, 254);
 printf("Hello from %s I am process %d of %d\n", me, myid, numprocs);
 if (myid == 0) {
   n = 12345;
 }
 printf("Call to MPI_Bcast n==%d on %s myid=%d\n", n, me, myid);
 MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
 printf("Return from MPI_Bcast n==%d on %s myid=%d\n", n, me, myid);
 MPI_Finalize();
 return 0;
}

If I run this with
   mpirun -np 8 hello
it works fine, but all processes run on bellhuey.

If I run this with
   mpirun -np 8 --hostfile host hello
I get the following:

Hello from bellhuey I am process 0 of 8
Call to MPI_Bcast n==12345 on bellhuey myid=0
Hello from bellhuey I am process 1 of 8
Call to MPI_Bcast n==32767 on bellhuey myid=1
Hello from bellhuey I am process 2 of 8
Call to MPI_Bcast n==32767 on bellhuey myid=2
Hello from wolf1 I am process 5 of 8
Call to MPI_Bcast n==32767 on wolf1 myid=5
Hello from bellhuey I am process 3 of 8
Call to MPI_Bcast n==32767 on bellhuey myid=3
Hello from wolf2 I am process 7 of 8
Call to MPI_Bcast n==32767 on wolf2 myid=7
Hello from wolf2 I am process 6 of 8
Call to MPI_Bcast n==32767 on wolf2 myid=6
Hello from wolf1 I am process 4 of 8
Call to MPI_Bcast n==32767 on wolf1 myid=4

As expected, four processes are started on bellhuey and two each on wolf1 and wolf2.
However, none of the calls to MPI_Bcast return!

Any help would be appreciated.

Paul Wolfgang



