----- Forwarded Message -----
From: Hamilton Fischer <fischerhamil...@yahoo.com>
To: "u...@open-mpi.org" <u...@open-mpi.org> 
Sent: Monday, January 16, 2012 9:09 PM
Subject: unknown af_family received errors...
 

Hi, I'm having odd issues with my "cluster", I guess. This very simple example 
works on one machine, but it produces a load of errors and then hangs when I 
try to parallelize it across the network.


#include <stdio.h>
#include "mpi.h"

int
main(int argc, char *argv[])
{
  int rank, size;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  if (rank == 0)
    {
      int i;
      for (i = 1; i < size; ++i)
        {
          int s = 1;
          MPI_Send(&s, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
        }
    }
  else
    {
      int r;
      /* MPI_STATUS_IGNORE, not NULL, is the valid way to skip the status. */
      MPI_Recv(&r, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      printf("%d got a %d\n", rank, r);
    }
  MPI_Finalize();
  return 0;
}


If I do `mpirun -np 3 a.out', where a.out is the executable, I get the expected 
output:

1 got a 1
2 got a 1


Now, let's say I go on the network. I use `mpirun --hostfile ../combin_host 
a.out', where my hostfile is simply:

# Hostfile
angryrock@192.168.0.1 slots=4
# Hostfile
user@192.168.0.102 slots=2
user@192.168.0.103 slots=2
user@192.168.0.104 slots=2
user@192.168.0.105 slots=2


I get this...


[localhost:04756] mca_btl_tcp_proc: unknown af_family received: 1
[localhost:04756] unknown address family for tcp: 0
[localhost:04756] mca_btl_tcp_proc: unknown af_family received: 1
[localhost:04756] unknown address family for tcp: 0
[localhost:04610] mca_btl_tcp_proc: unknown af_family received: 1
[localhost:04610] unknown address family for tcp: 0
[localhost:04048] mca_btl_tcp_proc: unknown af_family received: 1

...
[localhost:04123] unknown address family for tcp: 0
1 got a 1
2 got a 1
3 got a 1
^Cmpirun: killing job...

The ellipsis stands for a few more lines of the same thing, probably one set 
for each host. The last part is no doubt a.out executing on my own machine. As 
you can see, I have to kill it at the end because it hangs.

Any help as to what my issue might be? It obviously is an installation issue...

Thanks,
noobermin
