Greetings,

I am using Open MPI v1.2.3 via SGE on a network of amd64 workstations. When mpirun tries to start the processes on certain nodes, I get the following error output:

[sr70][0,1,2][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
[sr71][0,1,3][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111

Using perl -e 'die$!=111', I see that the error message is "Connection refused". I am able to connect to both nodes in question via ssh and/or rsh. I set btl_base_debug to 2, but that did not provide any additional information.
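
For anyone who wants to reproduce the errno lookup without Perl, here is a minimal Python sketch. It also includes a plain TCP connect probe, since a working ssh or rsh login only shows that those particular ports are reachable. The helper names describe_errno and probe_tcp are my own, and this is only a rough check under those assumptions, not a statement about how Open MPI chooses its ports.

```python
import errno
import os
import socket


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def probe_tcp(host, port, timeout=3.0):
    """Attempt a TCP connect and return 0 on success, otherwise an errno value.

    connect_ex() reports connection errors as a return code instead of
    raising, but name-resolution failures can still raise socket.gaierror.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port))


# Example usage (the port number is a hypothetical placeholder):
#   print(describe_errno(111))
#   print(describe_errno(probe_tcp("sr70", 5000)))
```

A result of 0 from probe_tcp means something accepted the connection on that port. ECONNREFUSED means the host was reachable but either nothing was listening there or a firewall rejected the attempt.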

What are some possible issues that might be causing this? What can I do to get more information?

Thanks,
~Tim

