"Kevin M. Hildebrand" <ke...@umd.edu> writes:

> Hi, I'm trying to run an OpenMPI 1.6.5 job across a set of nodes, some
> with Mellanox cards and some with Qlogic cards.

Maybe you shouldn't...  (I'm blessed in one cluster with three somewhat
incompatible types of QLogic card and a set of Mellanox ones, but
they're in separate islands, apart from the two different SDR ones.)

> I'm getting errors indicating "At least one pair of MPI processes are unable 
> to reach each other for MPI communications".  As far as I can tell all of the 
> nodes are properly configured and able to reach each other, via IP and non-IP 
> connections.
> I've also discovered that even if I turn off the IB transport via "--mca btl 
> tcp,self" I'm still getting the same issue.
> The test works fine if I run it confined to hosts with identical IB cards.
> I'd appreciate some assistance in figuring out what I'm doing wrong.

I assume the QLogic cards are using PSM.  You'd need to force them to
use openib with something like --mca mtl ^psm and make sure they have
the libipathverbs library available.  You probably won't like the
resulting performance -- users here noticed when one set of nodes fell
back to openib from psm recently.
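
For instance, something along these lines (untested on mixed hardware
here, and it assumes libipathverbs is installed on the QLogic nodes;
adjust the btl list and the executable name to suit):

    # disable the PSM MTL so both card types go through verbs (openib)
    mpirun --mca mtl ^psm --mca pml ob1 \
           --mca btl openib,sm,self ./your_mpi_program

Forcing --mca pml ob1 is probably redundant once PSM is excluded, but
it makes the intent explicit.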
