"Kevin M. Hildebrand" <ke...@umd.edu> writes: > Hi, I'm trying to run an OpenMPI 1.6.5 job across a set of nodes, some > with Mellanox cards and some with Qlogic cards.
Maybe you shouldn't...  (I'm blessed in one cluster with three somewhat
incompatible types of QLogic card and a set of Mellanox ones, but they're
in separate islands, apart from the two different SDR ones.)

> I'm getting errors indicating "At least one pair of MPI processes are
> unable to reach each other for MPI communications".  As far as I can
> tell all of the nodes are properly configured and able to reach each
> other, via IP and non-IP connections.
>
> I've also discovered that even if I turn off the IB transport via
> "--mca btl tcp,self" I'm still getting the same issue.
>
> The test works fine if I run it confined to hosts with identical IB
> cards.
>
> I'd appreciate some assistance in figuring out what I'm doing wrong.

I assume the QLogic cards are using PSM.  You'd need to force them to use
openib with something like --mca mtl ^psm and make sure they have the
ipathverbs library available.  You probably won't like the resulting
performance -- users here noticed when one set fell back to openib from
psm recently.
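For illustration, something along these lines (the process count, hostfile
name, and binary are placeholders, not from your setup):

  mpirun --mca pml ob1 --mca mtl ^psm --mca btl openib,sm,self \
         -np 16 --hostfile myhosts ./mpi_test

Forcing pml to ob1 as well is belt and braces -- with the psm MTL excluded,
the cm PML shouldn't get selected anyway, so ob1 over the openib BTL is what
both sets of cards should end up using.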