On Thu, Sep 13, 2007 at 11:15:47AM -0500, Tim Campbell wrote:
> workstations. When mpirun tries to start the processes on certain
> nodes I get the following error output.
>
> [sr70][0,1,2][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
>
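For readers following along: the errno value in the quoted output can be decoded with a few lines of Python. This is only a sketch, and it assumes the nodes run Linux (the thread does not say so), where errno 111 is ECONNREFUSED, meaning nothing was accepting connections on the target address and port. The helper name describe_errno is made up for illustration.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for an errno value on the current platform."""
    # errno numbers are platform-specific; on Linux, 111 maps to ECONNREFUSED.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # 111 is the value reported by btl_tcp_endpoint.c in the output above.
    print(describe_errno(111))
```
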
Thanks.
I think I figured out the problem. I found that in my .ssh/known_hosts
there were several "bad" keys associated with some of the
machines in the gridengine pool. My hypothesis is that when mpirun
was establishing the connection topology of the processes there was
some process pa
Hi Tim,
You could try setting -mca pls_gridengine_verbose 1 to show whether SGE
is able to start the ORTE daemons on the remote nodes successfully.
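Separately from the SGE check, the connect() failure itself is an ordinary TCP refusal, so it can be reproduced outside of MPI. The sketch below is an assumption-laden illustration, not part of Open MPI: the function name probe is hypothetical, and the port has to be one you know a process is listening on, since the TCP BTL picks its ports dynamically.

```python
import errno
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Try a TCP connect and return 'OK' or the symbolic errno name."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success, otherwise an errno value
        # instead of raising for ordinary connection errors.
        rc = sock.connect_ex((host, port))
    if rc == 0:
        return "OK"
    return errno.errorcode.get(rc, str(rc))


if __name__ == "__main__":
    # Demo against a throwaway listener on the loopback interface.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print(probe("127.0.0.1", port))  # a listener is present
    listener.close()
    print(probe("127.0.0.1", port))  # listener gone, connect is rejected
```

Running probe from one compute node against another (host name and port supplied by you) shows whether the refusal is specific to certain node pairs or interfaces.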
It seems you are hitting the same problem that another user asked about
previously. You may want to follow that thread and check your ifconfig
setting
Greetings,
I am using OpenMPI v1.2.3 via SGE on a network of amd64
workstations. When mpirun tries to start the processes on certain
nodes I get the following error output.
[sr70][0,1,2][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
[sr71