Jeff Squyres wrote:
Can Open MPI also deal with one of the subnets failing?
I.e., will Open MPI automatically fall back to using the last remaining
working IB port out of a node, or even fall back to GigE if all the IB
fails?

Not in the 1.2 series.

The 1.3 series *may* include "APM" support (automatic path migration -- a feature in IB). It looks likely that it will make the 1.3 cut, but I don't have definite information yet.
The current ompi-trunk has an APM implementation. If you enable APM, ompi will use only the first port on the HCA for data transmission, and the second one will be reserved as a backup. On a network failure on the first port, all connections will migrate to the second port. APM works only at the HCA level: you cannot migrate between different HCAs, only between the 2 ports of the same HCA.
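
To make the failover rule above concrete, here is a rough toy model in Python. It is only an illustration of the behaviour described, not Open MPI or verbs code; the class name Hca, the method fail_port, and the name "hca0" are all made up for this sketch.

```python
class Hca:
    """Toy model of one two-port HCA under APM (all names are hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.port_up = {1: True, 2: True}
        # Port 1 carries the data; port 2 is held in reserve as the backup.
        self.active_port = 1

    def fail_port(self, port):
        """Mark a port as down and migrate if it was the active one."""
        self.port_up[port] = False
        if port != self.active_port:
            return  # the failed port was not carrying data
        backup = 2 if port == 1 else 1
        # Migration is only possible to the other port of the *same* HCA;
        # if that port is down too, there is nowhere left to go.
        self.active_port = backup if self.port_up[backup] else None


hca = Hca("hca0")
hca.fail_port(1)
after_first_failure = hca.active_port   # connections moved to port 2
hca.fail_port(2)
after_second_failure = hca.active_port  # no port left on this HCA
```

Once both ports of the HCA are down, the model has no active port: nothing in it reaches over to a second HCA, which matches the limitation described above.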


--
Pavel Shamis (Pasha)
Mellanox Technologies
