Hello all,

As I understand it, the openib BTL supports NIC failover, but I am confused about the scope of this support. Let me elaborate:

1. Is the failover support part of the MPI specification?

2. Is it an Open MPI-specific addition to the MPI implementation?

3. Is it part of the verbs API specification? Since the openib BTL uses only verbs APIs under the hood, it should work on any NIC (e.g. iWARP or RoCE) that supports verbs, shouldn't it?

4. Is it part of the InfiniBand specification? Going through my old mail archive, it seems that the Open MPI 1.2 release supported this without relying on the IB automatic path migration (APM) functionality, so it seems to me that what Open MPI provides is independent of IB APM (besides, the openib BTL runs on link types other than InfiniBand).

4.1 If it is based on InfiniBand APM, is it available if I choose to run an MTL (e.g. PSM) instead of the openib BTL?

5. If my understanding in point #4 above is correct (i.e. the Open MPI failover is independent of any link-specific capability of InfiniBand), then why is similar functionality not provided for other transport types? The only non-verbs transport that I currently have access to is TCP, and I don't think the TCP transport provides automatic failover.

Can someone with expertise on this please elaborate?

Thanks in advance,
Durga

1% of the executables have 99% of CPU privilege! Userspace code! Unite!! Occupy the kernel!!!