I've checked the links repeatedly with "ibstatus" and they look OK. Both nodes show a link layer of "InfiniBand".
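A quick way to compare the two hosts side by side (just a sketch; it assumes passwordless ssh from the launch node and uses the w4/w34 host names from the error below):

    for h in w4 w34; do
        echo "== $h =="
        ssh "$h" ibstat | grep -iE 'link layer|state|rate'
    done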
As I stated, everything works well with MVAPICH2, so I don't suspect a physical or link layer problem (but I could always be wrong on that).

Tim

On Fri, May 9, 2014 at 6:26 PM, Joshua Ladd <jladd.m...@gmail.com> wrote:
> Hi, Tim
>
> Run "ibstat" on each host:
>
> 1. Make sure the adapters are alive and active.
>
> 2. Look at the Link Layer settings for host w34. Does it match host w4's?
>
> Josh
>
> On Fri, May 9, 2014 at 1:18 PM, Tim Miller <btamil...@gmail.com> wrote:
>> Hi All,
>>
>> We're using OpenMPI 1.7.3 with Mellanox ConnectX InfiniBand adapters, and
>> periodically our jobs abort at start-up with the following error:
>>
>> ===
>> Open MPI detected two different OpenFabrics transport types in the same
>> Infiniband network.
>> Such mixed network trasport configuration is not supported by Open MPI.
>>
>> Local host: w4
>> Local adapter: mlx4_0 (vendor 0x2c9, part ID 26428)
>> Local transport type: MCA_BTL_OPENIB_TRANSPORT_IB
>>
>> Remote host: w34
>> Remote Adapter: (vendor 0x2c9, part ID 26428)
>> Remote transport type: MCA_BTL_OPENIB_TRANSPORT_UNKNOWN
>> ===
>>
>> I've done a bit of googling and not found very much. We do not see this
>> issue when we run with MVAPICH2 on the same sets of nodes.
>>
>> Any advice or thoughts would be very welcome, as I am stumped by what
>> causes this. The nodes are all running Scientific Linux 6 with Mellanox
>> drivers installed via the SL-provided RPMs.
>>
>> Tim
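If it's useful, a sketch of a run with the openib BTL forced and its verbosity raised (./a.out is just a placeholder for the real application; btl_base_verbose is a generic MCA verbosity knob), which should make the BTL print more detail while it probes the adapters at startup:

    mpirun --mca btl openib,self,sm \
           --mca btl_base_verbose 100 \
           -np 2 -host w4,w34 ./a.out

Comparing that output between w4 and w34 might show where the remote side ends up reported as MCA_BTL_OPENIB_TRANSPORT_UNKNOWN.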