On Mar 29, 2007, at 1:08 PM, Jens Klostermann wrote:

In reply to
http://www.open-mpi.org/community/lists/users/2006/12/2286.php

I recently switched to Open MPI 1.2; unfortunately, the password problem
still persists! I generated new RSA keys and set up passwordless ssh.
I tested this by logging in to each node via passwordless ssh;
fortunately there are only 16 nodes :-).
The funny thing is that it seems to be a problem only with my user, and it
appears randomly, but is more likely when I use more nodes.

Is the problem still something like this:

----
[say@wolf45 tmp]$ mpirun -np 2 --host wolf45,wolf46 /tmp/test.x
orted: Command not found.
-----

Because if so, it's a larger / non-MPI issue. If the orted executable cannot be found on the remote node, there's no way Open MPI will succeed.

The question of *why* the orted can't be found may be a deeper problem -- if you have your PATH set right, etc., perhaps it's an NFS issue...?
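
A rough way to check this is to ask each node, over a non-interactive ssh session (which may read different startup files than an interactive login), whether it can find orted. The sketch below is only an illustration, not something Open MPI provides: the function name check_orted and the RSH override variable are made up here, and it assumes that "which" on the remote side returns non-zero when nothing is found, which may not hold for every shell. BatchMode=yes makes ssh fail instead of prompting for a password, so a node with a broken key setup also shows up as "NOT found".

```shell
# Hypothetical helper: report, for each host, whether orted is on the PATH
# seen by a non-interactive remote shell.
# RSH is an illustrative override for the remote-shell command; it defaults
# to ssh in batch mode so that ssh never stops to ask for a password.
check_orted() {
    rsh_cmd=${RSH:-ssh -o BatchMode=yes}
    for host in "$@"; do
        # $rsh_cmd is intentionally unquoted so its options split into words.
        if $rsh_cmd "$host" 'which orted' >/dev/null 2>&1; then
            echo "$host: orted found"
        else
            echo "$host: orted NOT found"
        fi
    done
}
```

Called as "check_orted wolf45 wolf46", it prints one line per host. Since your failures are intermittent, running it several times in a row may be more telling than a single pass.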

One cure for the problem so far has been the option --mca
pls_rsh_debug. What does this switch do, other than produce more output,
that would resolve my problem?

It also slows the code down a bit such that the timing is different.

Two other questions: what is the
ras (Resource allocation subsystem), and how can I set it up / what
options does it have?

I doubt that the ras is involved in the issue -- the ras is used to read hostfiles, analyze lists of hosts from resource managers, etc. It plays no part in the launch itself.

What is the pls (Process launch subsystem), and how can I set it up / what
options does it have?

I assume you're using the RSH launcher; you can use the ompi_info command to see what parameters are available for that component:

     ompi_info --param pls rsh
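
Once you know a parameter's name, there are two common ways to set it: on the mpirun command line with --mca, or through an environment variable named OMPI_MCA_ followed by the parameter name. A minimal sketch using the pls_rsh_debug parameter you mentioned; the value 1 and the reuse of the hosts and test program from the example above are assumptions for illustration:

```shell
# 1. On the mpirun command line (commented out here; hosts and program are
#    taken from the earlier example and are only placeholders):
#      mpirun --mca pls_rsh_debug 1 -np 2 --host wolf45,wolf46 /tmp/test.x

# 2. Through the environment, picked up by any later mpirun in this shell.
#    Bourne-style syntax; a csh/tcsh user would use setenv instead.
OMPI_MCA_pls_rsh_debug=1
export OMPI_MCA_pls_rsh_debug
```

The environment form is handy when you want the setting to apply to every run in a session without editing each command line.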

--
Jeff Squyres
Cisco Systems
