Sounds like either a routing problem or a firewall. Are there multiple NICs on
these nodes? Looking at the quoted NIC in your error message, is that the
correct subnet we should be using?
Have you checked to ensure no firewalls exist on that subnet between the nodes?
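
If the nodes do have several NICs, one quick sanity check is to see which subnets they actually have in common. Here is a minimal sketch using Python's standard `ipaddress` module. The addresses and the `shared_subnets` helper are made up for illustration, so substitute whatever `ip addr` (or `ifconfig`) reports on your nodes:

```python
import ipaddress

def shared_subnets(node_a, node_b):
    """Return the set of networks on which both nodes have an address."""
    nets_a = {ipaddress.ip_interface(a).network for a in node_a}
    nets_b = {ipaddress.ip_interface(b).network for b in node_b}
    return nets_a & nets_b

# Hypothetical interface addresses (address/prefix), NOT taken from your cluster.
node1 = ["10.1.255.10/16", "192.168.5.10/24"]
node2 = ["10.1.255.11/16", "172.16.0.11/24"]

# Only subnets present on both nodes can carry direct TCP traffic between them.
print(shared_subnets(node1, node2))
```

If only one subnet is shared, it may be worth restricting the TCP BTL to the interface on that subnet with the `btl_tcp_if_include` MCA parameter, rather than letting it try every NIC.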
On Apr 24, 2014, at 8:41 A
Dear all:
In the ongoing investigation into why a particular in-house program is
not working in parallel over multiple nodes using OpenMPI, I have been
running with "--mca btl self,sm,tcp" and keep hitting the following error:
[compute-6-15.local][[8185,1],0
[btl_tcp_endpoint.c:653:mca_btl_tcp_