It sounds like the IOF (I/O forwarding) messages aren't getting relayed between the daemons. Note that the daemon on each node has to be able to send TCP messages to every other node, not just to the node running mpirun.

Couple of things you can do to check:

1. -mca routed direct - this will send all messages directly to their destination instead of relaying them across the daemons

2. --leave-session-attached - will allow you to see any errors reported by the daemons, including those from attempting to relay messages
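
As a rough sketch, the two checks above could be tried against the four-node host list that produces no output. The host addresses and the hello-mpi binary are taken from the original report below; the variable names and the guard that only prints the commands when mpirun is not on the PATH are illustrative assumptions, not part of Open MPI.

```shell
#!/bin/sh
# Host list from the failing four-node run in the original report.
HOSTS="10.1.2.126,10.1.2.122,10.1.2.123,10.1.2.125"

# Check 1: send messages directly instead of relaying across the daemons.
DIRECT_CMD="mpirun -mca routed direct -H $HOSTS hello-mpi"

# Check 2: keep the session attached so daemon errors are visible.
ATTACHED_CMD="mpirun --leave-session-attached -H $HOSTS hello-mpi"

for cmd in "$DIRECT_CMD" "$ATTACHED_CMD"; do
    # Show the command being tried.
    echo "+ $cmd"
    # Only run it when mpirun is actually installed (assumed guard).
    if command -v mpirun >/dev/null 2>&1; then
        $cmd
    fi
done
```

If the first command prints the expected greetings, the daemon-to-daemon relay is the likely culprit; any errors the daemons report should show up in the output of the second command.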

Ralph

On Jul 29, 2009, at 1:19 PM, David Doria wrote:

I wrote a simple program to display "hello world" from each process.

When I run this (126 - my machine, 122, and 123), everything works fine:
[doriad@daviddoria MPITest]$ mpirun -H 10.1.2.126,10.1.2.122,10.1.2.123 hello-mpi
From process 1 out of 3, Hello World!
From process 2 out of 3, Hello World!
From process 3 out of 3, Hello World!

When I run this (126 - my machine, 122, and 125), everything works fine:
[doriad@daviddoria MPITest]$ mpirun -H 10.1.2.126,10.1.2.122,10.1.2.125 hello-mpi
From process 2 out of 3, Hello World!
From process 1 out of 3, Hello World!
From process 3 out of 3, Hello World!

When I run this (126 - my machine, 123, and 125), everything works fine:
[doriad@daviddoria MPITest]$ mpirun -H 10.1.2.126,10.1.2.123,10.1.2.125 hello-mpi
From process 2 out of 3, Hello World!
From process 1 out of 3, Hello World!
From process 3 out of 3, Hello World!


However, when I run this (126 - my machine, 122, 123, AND 125), I get no output at all.

Is there any way to check what is going on? Does anyone know why that would happen? I'm using OpenMPI 1.3.3.

Thanks,

David
