On Mar 29, 2007, at 1:08 PM, Jens Klostermann wrote:
In reply to
http://www.open-mpi.org/community/lists/users/2006/12/2286.php
I recently switched to Open MPI 1.2; unfortunately, the password problem
still persists! I generated new RSA keys and set up passwordless ssh.
I tested this by logging in to each node via passwordless ssh;
fortunately there are only 16 nodes :-).
The funny thing is that it seems to be a problem only with my user, and
it appears randomly, but more often if I use more nodes.
Is the problem still something like this:
----
[say@wolf45 tmp]$ mpirun -np 2 --host wolf45,wolf46 /tmp/test.x
orted: Command not found.
-----
Because if so, it's a larger / non-MPI issue. If the orted
executable cannot be found on the remote node, there's no way Open
MPI will succeed.
The question of *why* the orted can't be found may be a somewhat deeper
problem -- if you have your PATH set right, etc., perhaps it's an
NFS issue...?
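One way to narrow this down is to ask each node, over the same kind of non-interactive ssh session that the launcher uses, whether orted is on the PATH. The following is only a rough sketch: check_orted is a made-up helper name, the host names are the ones from the example above, and it assumes that "which" on the remote side returns a non-zero status when the command is not found.

```shell
# Sketch: report whether orted is visible on the PATH of a non-interactive
# ssh session. check_orted is a hypothetical helper name, not part of Open MPI.
check_orted() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password,
        # so a broken passwordless setup shows up as a failure here too.
        # Assumption: the remote "which" exits non-zero if orted is missing.
        if ssh -o BatchMode=yes "$host" 'which orted' >/dev/null 2>&1; then
            echo "$host: orted found"
        else
            echo "$host: orted NOT found (or ssh failed)"
        fi
    done
}

# Usage: check_orted wolf45 wolf46
```

Note that a non-interactive ssh session may read different shell startup files than an interactive login, so a PATH that looks right after logging in by hand is not necessarily the PATH that the remote command sees.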
One workaround I have found so far is the option --mca
pls_rsh_debug. What does this switch do, other than produce more
output, that makes it resolve my problem?
It also slows the code down a bit, so the timing is different.
Two other questions. What is the ras (Resource allocation subsystem),
and how can I set it up / what options does it have?
I doubt that the ras is involved in the issue -- the ras is
used to read hostfiles, analyze lists of hosts from resource
managers, etc. It doesn't do anything in the launch itself.
And what is the pls (Process launch subsystem), and how can I set it
up / what options does it have?
I assume you're using the RSH launcher; you can use the ompi_info
command to see what parameters are available for that component:
ompi_info --param pls rsh
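If the full listing is long, it can be filtered for the parameter of interest. A small sketch, assuming only that ompi_info prints one parameter per line; show_rsh_params is a hypothetical helper name:

```shell
# Sketch: list the rsh launcher's MCA parameters, optionally filtered by a
# substring. show_rsh_params is a hypothetical helper, not an Open MPI tool.
show_rsh_params() {
    if [ -n "${1:-}" ]; then
        # "--" keeps grep from treating a pattern that starts with "-" as an option.
        ompi_info --param pls rsh | grep -- "$1"
    else
        ompi_info --param pls rsh
    fi
}

# Usage: show_rsh_params debug
```

A parameter found this way is normally passed to mpirun as "--mca <name> <value>"; which values a given parameter accepts should be taken from the ompi_info output rather than from this sketch.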
--
Jeff Squyres
Cisco Systems