If OMPI finds InfiniBand support on the node, it will attempt to use it automatically, whether or not you pass any IB-related flags to mpirun. In this case, it appears the IB adapter on that node is incorrectly configured (more than one active port sharing the default subnet GID prefix), which is why you also get the additional warning about it.
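As an aside, if you just want the job to run cleanly while you sort out the adapter, you can disable the openib BTL entirely, or at least silence that warning, with MCA parameters on the mpirun command line. Something along these lines, based on your original command, should do it:

  # skip the openib BTL altogether (OMPI will use shared memory + TCP)
  mpirun --mca btl ^openib -np 72 -hostfile hostlist ../bin/regcmMPI regcm.in

  # or keep IB enabled but suppress the default-subnet-GID warning
  mpirun --mca btl_openib_warn_default_gid_prefix 0 -np 72 -hostfile hostlist ../bin/regcmMPI regcm.in

Neither of those fixes the underlying port configuration, of course; for that, see the FAQ entry referenced in the warning below.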
OMPI then falls back and looks for another transport, in this case TCP. However, the TCP transport is unable to open a connection to the remote host (the "No route to host" errors below). The most likely cause is a firewall on the compute nodes, so you might want to check for one and turn it off; I've sketched a couple of quick checks at the end of this message.

On Jan 19, 2014, at 4:19 AM, Syed Ahsan Ali <ahsansha...@gmail.com> wrote:

> Dear All
>
> I am getting infiniband errors while running mpirun applications on cluster.
> I get these errors even when I don't include infiniband usage flags in mpirun
> command. Please guide
>
> mpirun -np 72 -hostfile hostlist ../bin/regcmMPI regcm.in
>
> --------------------------------------------------------------------------
> [[59183,1],24]: A high-performance Open MPI point-to-point messaging module
> was unable to find any relevant network interfaces:
> Module: OpenFabrics (openib)
> Host: compute-01-10.private.dns.zone
>
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> WARNING: There are more than one active ports on host
> 'compute-01-15.private.dns.zone', but the
> default subnet GID prefix was detected on more than one of these
> ports. If these ports are connected to different physical IB
> networks, this configuration will fail in Open MPI. This version of
> Open MPI requires that every physically separate IB subnet that is
> used between connected MPI processes must have different subnet ID
> values.
>
> Please see this FAQ entry for more details:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_default_gid_prefix to 0.
> --------------------------------------------------------------------------
>
> This is RegCM trunk
> SVN Revision: tag 4.3.5.6 compiled at: data : Sep 3 2013 time: 05:10:53
>
> [pmd.pakmet.com:03309] 15 more processes have sent help message
> help-mpi-btl-base.txt / btl:no-nics
> [pmd.pakmet.com:03309] Set MCA parameter "orte_base_help_aggregate" to 0 to
> see all help / error messages
> [pmd.pakmet.com:03309] 47 more processes have sent help message
> help-mpi-btl-openib.txt / default subnet prefix
> [compute-01-03.private.dns.zone][[59183,1],1][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
> [compute-01-03.private.dns.zone][[59183,1],2][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
> [compute-01-03.private.dns.zone][[59183,1],5][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
> [compute-01-03.private.dns.zone][[59183,1],3][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
>
> [compute-01-03.private.dns.zone][[59183,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
> [compute-01-03.private.dns.zone][[59183,1],7][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
> connect() to 192.168.108.10 failed: No route to host (113)
> [compute-01-03.private.dns.zone][[59183,1],6][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
> [compute-01-03.private.dns.zone][[59183,1],4][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
>
> Ahsan
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
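To follow up on the firewall point: the default iptables policy on many distributions rejects connections with icmp-host-prohibited, which shows up on the connecting side as exactly this "No route to host" error. From the node that is reporting the failures (compute-01-03 here), a few quick checks along these lines should tell you whether a host firewall is in the way. I'm assuming a RHEL/CentOS-style iptables setup; adjust the service names to whatever your distribution uses:

  # can we reach the other node at all?
  ping -c 3 192.168.108.10

  # are any firewall rules loaded (check both nodes)?
  iptables -L -n

  # if so, try disabling the firewall on all compute nodes and re-run the job
  service iptables stop
  chkconfig iptables off

If the job runs with the firewall off, that confirms the diagnosis; you can then decide whether to leave it off on the private cluster network or open the ports OMPI needs instead.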