When I run jobs with Torque, I get the following error message. Any ideas?

[sam@prodnode1 all]$ cat script.sh.err 
Host key verification failed.
[prodnode3.brooks.af.mil:03321] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[prodnode3.brooks.af.mil:03321] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1164
[prodnode3.brooks.af.mil:03321] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[prodnode3.brooks.af.mil:03321] ERROR: A daemon on node prodnode2.brooks.af.mil failed to start as expected.
[prodnode3.brooks.af.mil:03321] ERROR: There may be more information available from
[prodnode3.brooks.af.mil:03321] ERROR: the remote shell (see above).
[prodnode3.brooks.af.mil:03321] ERROR: The daemon exited unexpectedly with status 255.
[prodnode3.brooks.af.mil:03321] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[prodnode3.brooks.af.mil:03321] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1196
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.

--------------------------------------------------------------------------

Sam Adams
General Dynamics Information Technology
Phone: 210.536.5945

