Re: [OMPI users] mixing MX and TCP

Reese Faucette Fri, 1 Jun 2007 12:48:50 -0400

Just to brainstorm on this a little - the two different clusters will have different "mapper IDs", and these can be learned via the attached code snippet. As long as fma is the mapper (as opposed to the older, deprecated "gm_mapper" or "mx_mapper"), Myrinet topology rules ensure that NIC 0, port 0 is all you need to examine. All nodes with the same mapper can then be considered "on the same fabric".

Except, of course, when you have two fabrics A and B with many nodes each but only one node in common - then all nodes will have the same mapper ID, but are effectively on two disjoint fabrics. This is rare, but I have seen it once.
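As a rough sketch of the mapper-ID comparison described above: once each node's 6-byte mapper ID has been collected, nodes can be grouped by comparing those IDs. The struct node_info type and the same_fabric() and assign_fabrics() helpers are hypothetical names made up for illustration; they are not part of MX or Open MPI. Equal IDs only suggest a shared fabric, so the single-common-node case would still be grouped together.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAPPER_ID_LEN 6   /* the mapper ID is printed as six bytes */

/* Hypothetical record, one per node, filled in from that node's
 * mapper-state output.  Not an MX or Open MPI type. */
struct node_info {
    const char   *hostname;
    unsigned char mapper_id[MAPPER_ID_LEN];
};

/* Returns 1 if both nodes report the same mapper ID, 0 otherwise. */
static int same_fabric(const struct node_info *a, const struct node_info *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a->mapper_id, b->mapper_id, MAPPER_ID_LEN) == 0;
}

/* Gives every node a fabric index so that nodes with equal mapper IDs
 * share an index.  Returns the number of distinct mapper IDs seen.
 * Quadratic in n, which is fine for a sketch. */
static size_t assign_fabrics(const struct node_info *nodes, size_t n,
                             size_t *fabric)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        size_t j;
        for (j = 0; j < i; j++) {
            if (same_fabric(&nodes[i], &nodes[j])) {
                fabric[i] = fabric[j];   /* reuse the earlier node's index */
                break;
            }
        }
        if (j == i)                      /* no earlier match: new fabric */
            fabric[i] = count++;
    }
    return count;
}
```

A caller would fill an array of node_info from each node's output and pass it to assign_fabrics() together with an output array of the same length.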

Perhaps a more general solution is for the MX MTL to look in the MX peer table for a requested peer (or simply try mx_connect() and notice it fails?), report "cannot reach" back up the chain, and have higher-level code retry with a different medium on a per-peer basis? This would be independent of IB or MX or ...
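To make the per-peer retry idea a bit more concrete, here is a minimal sketch of the control flow only. Everything in it is hypothetical: struct transport, select_transport(), and the two stub probes are invented for illustration and do not correspond to any real Open MPI or MX interface. In a real MX probe, the stub body would presumably be a peer-table lookup or an mx_connect() attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transport descriptor; not an Open MPI or MX type. */
struct transport {
    const char *name;
    /* Returns 0 if the peer can be reached over this transport,
     * nonzero for "cannot reach". */
    int (*try_connect)(int peer);
};

/* Stand-in probes for illustration only: pretend peers 0..3 sit on the
 * local Myrinet fabric and every peer is reachable over TCP. */
static int mx_probe_stub(int peer)
{
    return (peer >= 0 && peer < 4) ? 0 : -1;
}

static int tcp_probe_stub(int peer)
{
    (void)peer;
    return 0;
}

/* Walks the transports in preference order and returns the first one
 * that reaches the peer, or NULL if none of them does. */
static const struct transport *
select_transport(const struct transport *list, size_t n, int peer)
{
    assert(list != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (list[i].try_connect(peer) == 0)
            return &list[i];
    }
    return NULL;
}
```

With a list ordered { mx, tcp }, a peer on the local fabric would get MX and any other peer would fall through to TCP, with the choice made separately for each peer.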


===================================
#include <stdio.h>
#include <stdlib.h>
#include "myriexpress.h"
#include "mx_io.h"

int main(void)
{
 mx_return_t ret;
 mx_endpt_handle_t h;
 mx_mapper_state_t ms;
 int board = 0;                /* whichever board you want */

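 ret = mx_init();
 if (ret != MX_SUCCESS) {
   fprintf(stderr, "mx_init failed: %s\n", mx_strerror(ret));
   exit(1);
 }
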
 ret = mx_open_board(board, &h);
 if (ret != MX_SUCCESS) {
   fprintf(stderr, "Unable to open board %d\n", board);
   exit(1);
 }

 /* ask for the mapper state of port 0 on this board */
 ms.board_number = board;
 ms.iport = 0;
 ret = mx__get_mapper_state(h, &ms);
 if (ret != MX_SUCCESS) {
   fprintf(stderr, "get_mapper_state failed for board %d: %s\n",
       board, mx_strerror(ret));
   exit(1);
 }

 printf("mapper = %2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x\n",
        ms.mapper_mac[0] & 0xff, ms.mapper_mac[1] & 0xff,
        ms.mapper_mac[2] & 0xff, ms.mapper_mac[3] & 0xff,
        ms.mapper_mac[4] & 0xff, ms.mapper_mac[5] & 0xff);
 exit(0);
}
