Hello all

As I understand, the openib BTL supports NIC failover, but I am confused
about the scope of this support. Let me elaborate:

1. Is the failover support part of the MPI specification?

2. Is it an Open MPI-specific addition to the MPI implementation?

3. Is it part of the verbs API specification? Since the openib BTL uses only
verbs APIs under the hood, failover should then work on any NIC (e.g. iWARP
or RoCE) that supports verbs, shouldn't it?

4. Is it part of the InfiniBand specification? Going through my old mail
archive, it seems that the Open MPI 1.2 release supported this without
relying on the IB automatic path migration (APM) functionality. So it seems
to me that what Open MPI provides is independent of IB APM (besides, the
openib BTL runs on link types other than InfiniBand).

4.1 If it is based on InfiniBand APM, is it still available if I choose to
run an MTL (e.g. PSM) instead of the openib BTL?

5. If my understanding in point #4 above is correct (i.e. the Open MPI
failover is independent of any link-specific capability of InfiniBand),
then why is similar functionality not provided for other transport types?
The only non-verbs transport that I currently have access to is TCP, and I
don't think the TCP transport provides automatic failover.

Can someone with expertise on this please elaborate?
Thanks in advance
Durga


1% of the executables have 99% of CPU privilege!
Userspace code! Unite!! Occupy the kernel!!!
