Hi there,
I'm facing a strange issue with this HCA. A cluster I support has been
recently expanded with 4 new nodes, all using the mentioned HCA. 3 nodes
are working fine, but one will not use the IB network when running jobs.
Let's call 'node a' the working one, and 'node b' the non-working one.
Yes Jeff,
You were right. The default value for btl_tcp_port_min_v4 is 1024.
I was having a problem running my algorithm on multiple processors (using
ssh).
Answer:
The network administrator locked that port.
:(
I changed the communication port by forcing MPI to use another one:
mpiexec -n 2 --hos
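
For anyone hitting the same thing, here is a rough sketch of how the port range can be moved, assuming Open MPI's TCP BTL. It is not the exact command from my run. The values 10000 and 100 and the name ./my_program are placeholders I made up for illustration; btl_tcp_port_range_v4 is the companion parameter to btl_tcp_port_min_v4 and sets how many ports above the minimum may be used. Host options are left out, so add your own.

```shell
#!/bin/sh
# Placeholder values (assumptions): choose a range your administrator has left open.
PORT_MIN=10000    # first TCP port the TCP BTL may use
PORT_RANGE=100    # number of ports allowed starting at PORT_MIN

# Build the command line; ./my_program is a hypothetical program name.
CMD="mpiexec -n 2 --mca btl_tcp_port_min_v4 $PORT_MIN --mca btl_tcp_port_range_v4 $PORT_RANGE ./my_program"

# Print the command so it can be reviewed before running it by hand.
echo "$CMD"
```

The same two parameters can probably also be inspected with ompi_info to see what your installation uses by default.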