On Mar 13, 2009, at 6:17 AM, Raymond Wan wrote:
> What doesn't work is:
> [On Y] mpirun --host Y,Z --np 2 uname -a
> [On Y] mpirun --host X,Y,Z --np 3 uname -a
> ...and similarly for machine Z. I can confirm that from any of the
> 3 machines, I can ssh to the other without typing in a password. I
> set up the RSA keys correctly [I think]. When I run the above
> commands, it just hangs. Adding "--verbose" doesn't produce any
> information...I don't know what it's doing. I had a longer running
> program than "uname" and I didn't see it appear on any of the
> machines. In fact [since it hangs], I don't see uname on "top",
> either. I do, however, see "mpirun" and "orted" on top, though.
> I guess some setup is missing that X has that the other two do not
> have. Any suggestions on how to find out the cause of this
> problem? Thank you!
Do you see "rsh" or "ssh" in the output of "ps -eadf" when mpirun is
hanging, perchance? If you do, what happens if you copy-n-paste those
command lines and run them manually?
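
A minimal sketch of that check, in case it helps. The helper name
find_launchers is made up for illustration, and the grep pattern is
only an assumption about how the rsh/ssh commands show up in the
process listing on your machines (it is meant to skip sshd and the
grep process itself). It reads the listing from stdin:

```shell
# Hypothetical helper: keep only the lines of a "ps -eadf" listing
# whose command is rsh or ssh (bare name or full path).
# "sshd" does not match because the name must be followed by a
# space or the end of the line.
find_launchers() {
    grep -E '(^|[ /])(rsh|ssh)( |$)'
}

# While mpirun is hanging, run this in a second terminal on the
# machine where mpirun was started:
#   ps -eadf | find_launchers
```

If any lines come back, the rsh/ssh command and its arguments are at
the end of each line; paste that part into a shell on the same machine
and see whether it returns, hangs, or asks for something.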
--
Jeff Squyres
Cisco Systems