I have three machines: mine (daviddoria) and two identical remote machines
(cloud3 and cloud6). I can password-less ssh between any pair. The machines
are all running 32-bit Fedora 11. OpenMPI was installed identically on each.
The .bashrc is identical on each. /etc/hosts is identical on each.


I wrote a test "hello world" program to ensure OpenMPI is behaving
correctly.


The output is exactly as expected; each node seems to be alive:


[doriad@daviddoria MPITest]$ mpirun -H cloud6,daviddoria,cloud3 -np 3 hello-mpi
Process 1 on daviddoria out of 3
Process 2 on cloud3 out of 3
Process 0 on cloud6 out of 3


I am trying to get a parallel application called Paraview working with these
three machines. Paraview is installed identically on each. As a test, I
wanted to get it working with two at a time first.


With cloud3, everything goes smoothly. I tell Paraview to start the
server with

ssh cloud3 mpirun -H cloud3 pvserver

and to connect to the server on cloud3, and I get the following (expected)
output:


Listen on port: 11111

 Waiting for client...

 Client connected.


When I try the same thing on cloud6, it again goes smoothly: I tell
Paraview to start the server with

ssh cloud6 mpirun -H cloud6 pvserver

and connect to the server on cloud6.


Now for the real test...

I tell Paraview to start the server with

ssh cloud6 mpirun -H cloud6,cloud3 -np 2 pvserver

and connect to the server on cloud6


This again connects successfully. However, if I do the reverse:


ssh cloud3 mpirun -H cloud3,cloud6 -np 2 pvserver

and connect to the server on cloud3


Paraview retries for 60 seconds but cannot connect. I just see a series
of "Failed to connect to server on cloud3" errors.
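One check that might help narrow this down is whether the pvserver port on
each machine is reachable from the client at all. The following is only a
minimal sketch: it assumes pvserver listens on port 11111 (the port shown in
the output above) and that it is run on the client machine while the server
is starting. The helper name can_connect is hypothetical and is not part of
Paraview or OpenMPI.

```python
import socket


def can_connect(host, port=11111, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default port 11111 is an assumption taken from the pvserver
    output ("Listen on port: 11111"); adjust it if the server uses another.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket;
        # the with-block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling can_connect("cloud3") and can_connect("cloud6") right after launching
the two-node mpirun command would show whether anything is listening on the
machine the client is pointed at. It cannot say which MPI rank opened the
port.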


Does anyone have any idea what could cause this asymmetric behavior?


Thanks,

David
