It might be best to:
1. Set up a non-root user to run MPI applications.
2. Set up SSH keys between the hosts for this non-root user so that you
can run "ssh <otherhost> uptime" without being prompted for a password
or passphrase.
This should help.
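
The two steps above can be sketched as a small shell script. This is only a sketch under assumptions: the account name mpiuser and the helper function names are hypothetical, 192.168.45.65 is the slave address from the original message, and useradd, ssh-keygen and ssh-copy-id are assumed to be available on your distribution.

```shell
#!/bin/sh
# Hypothetical values: adjust to your own cluster.
MPI_USER="mpiuser"           # assumed name for the non-root account
OTHER_HOST="192.168.45.65"   # slave node address from the original message

# Build the user@host string used by the ssh commands below.
ssh_target() {
    echo "$MPI_USER@$OTHER_HOST"
}

# Step 1: run as root on every node to create the non-root account.
create_mpi_user() {
    useradd -m "$MPI_USER" && passwd "$MPI_USER"
}

# Step 2: run as $MPI_USER on the master. Generates a key pair if none
# exists yet, then installs the public key on the other host.
setup_ssh_key() {
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -f "$HOME/.ssh/id_rsa"
    ssh-copy-id "$(ssh_target)"
}

# Verify: BatchMode=yes makes ssh fail instead of prompting for a password.
check_passwordless() {
    ssh -o BatchMode=yes "$(ssh_target)" uptime
}
```

Run create_mpi_user as root on each node, then run setup_ssh_key and check_passwordless as the new user on the master. A zero exit status from check_passwordless means key-based login works without a prompt.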
On Apr 4, 2009, at 5:51 AM, Ankush Kaul wrote:
I followed the steps given here to set up an Open MPI cluster:
http://www.ps3cluster.umassd.edu/step3mpi.html
My cluster consists of two nodes, master (192.168.67.18) and
slave (192.168.45.65), connected directly through a crossover cable.
After setting up the cluster and configuring the master node, I
mounted the /tmp folder of the master node on the slave node (I had
some problems with NFS at first, but I worked my way out of it).
Then I copied the 'pi.c' program into the /tmp folder and successfully
compiled it, giving me a binary file 'pi'.
Now when I try to run the binary file using the following command:
#mpirun -np 2 ./Pi
root@192.168.45.65's password:
<it asks for the password>
After entering the password, it gives the following error:
bash: orted: command not found
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1166
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[ccomp.cluster:18963] ERROR: A daemon on node 192.168.45.65 failed to start as expected.
[ccomp.cluster:18963] ERROR: There may be more information available from
[ccomp.cluster:18963] ERROR: the remote shell (see above).
[ccomp.cluster:18963] ERROR: The daemon exited unexpectedly with status 127.
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[ccomp.cluster:18963] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1198
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
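
The "orted: command not found" line and the exit status 127 in the output above suggest that the shell started by ssh on 192.168.45.65 could not find Open MPI's orted daemon on its PATH. A rough way to check this is sketched below. The account name mpiuser and the function names are assumptions for illustration, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical account on the slave node; adjust to your own setup.
REMOTE="mpiuser@192.168.45.65"

# Translate a shell exit status into a short description.
# 127 is the conventional status for "command not found".
explain_status() {
    case "$1" in
        0)   echo "ok" ;;
        127) echo "command not found" ;;
        *)   echo "failed with status $1" ;;
    esac
}

# Ask the remote non-interactive shell where orted lives.
# Prints nothing if orted is not on that shell's PATH.
find_remote_orted() {
    ssh -o BatchMode=yes "$REMOTE" 'command -v orted'
}

# Local demonstration: running a missing command yields status 127.
status=0
sh -c 'no_such_command_for_demo' 2>/dev/null || status=$?
explain_status "$status"
```

If find_remote_orted prints nothing, one option is to add the Open MPI bin directory to the PATH set in the remote user's shell startup file. Another is mpirun's --prefix option, which takes the Open MPI installation directory on the remote node.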
I am totally lost now, as this is the first time I am working on a
cluster project, and I need some help.
Thank you
Ankush
--
Jeff Squyres
Cisco Systems