It's a little different in RoCE.  There's no subnet manager, so (as
near as I can tell) you don't really have a subnet ID.  Instead, the
GID = GUID + VLAN tag (more or less).  gid[0] has special bits in the
VLAN tag section, to indicate that packets relating to this GID don't
get a VLAN tag.  Unfortunately, without a VLAN tag, those packets lack
priority bits - meaning they can't be matched to a lossless class on
our Cisco switches.

RoCE HCAs keep a GID table, like normal HCAs.  Every time you bring up
a VLAN interface, another entry is automatically added to the table.
If I select one of these other GIDs, packets get a VLAN tag, and that
tag contains the necessary priority bits (assuming I selected the
right IB service level, which is mapped to the priority field in the
VLAN header) for the traffic to match a lossless class of service on
the switch.
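
To make the "priority bits" concrete: an 802.1Q VLAN tag carries a
16-bit Tag Control Information (TCI) field, with a 3-bit priority in
the top bits, one drop-eligible bit, and a 12-bit VLAN ID.  The sketch
below only packs that field.  The function name is made up for
illustration, and treating the IB service level as the priority value
directly is an assumption, not something confirmed for any particular
adapter or driver.

```python
def vlan_tci(priority, vlan_id, dei=0):
    """Pack an 802.1Q Tag Control Information (TCI) field.

    Layout (16 bits): PCP priority (3) | DEI (1) | VLAN ID (12).
    `priority` stands in for whatever the IB service level maps to;
    that mapping is an assumption and may differ per driver.
    """
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits (0-7)")
    if dei not in (0, 1):
        raise ValueError("dei must be 0 or 1")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("vlan_id must fit in 12 bits (0-4095)")
    return (priority << 13) | (dei << 12) | vlan_id


if __name__ == "__main__":
    # Priority 3 on VLAN 100; a switch could classify on the top 3 bits.
    print(hex(vlan_tci(3, 100)))
```

A frame with no VLAN tag has no TCI at all, which is why untagged
traffic gives the switch nothing to classify on.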

For this to work, I really need the IB client to select a non-default
GID.  A few test programs included in OFED will do this, but I'm not
sure Open MPI will.  Any thoughts?
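
For anyone wanting to see which GID indexes exist beyond gid[0], here
is a rough sketch that reads the port's GID table from Linux sysfs.
It assumes the usual /sys/class/infiniband/<device>/ports/<port>/gids/
layout; the function name and the device name in the usage line are
placeholders, and this only lists entries.  It does not make any
application use them.

```python
import os

# An all-zero GID marks an unused slot in the table.
ZERO_GID = "0000:0000:0000:0000:0000:0000:0000:0000"


def list_populated_gids(device, port, sysfs_root="/sys/class/infiniband"):
    """Return sorted (index, gid) pairs for non-empty GID table entries.

    The sysfs layout is an assumption about the running kernel; pass a
    different `sysfs_root` if it lives elsewhere.
    """
    gid_dir = os.path.join(sysfs_root, device, "ports", str(port), "gids")
    entries = []
    for name in os.listdir(gid_dir):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(gid_dir, name)) as f:
                gid = f.read().strip()
        except OSError:
            # Some kernels refuse reads on unused slots; skip those.
            continue
        if gid and gid != ZERO_GID:
            entries.append((int(name), gid))
    return sorted(entries)


if __name__ == "__main__":
    # "mlx4_0" is a placeholder device name, not taken from this thread.
    try:
        for index, gid in list_populated_gids("mlx4_0", 1):
            print(index, gid)
    except FileNotFoundError:
        print("no such device or port in sysfs")
```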

--
Mike Shuey



On Fri, Feb 18, 2011 at 9:30 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
> Greetings Mike.  I'll answer today because Fri-Sat is the weekend in Israel 
> (i.e., the MPI team at Mellanox won't see this until Sunday).
>
> I don't have a lot of experience with RoCE; do you need a different GUID or a 
> different subnet ID?  At least in IB, the GID = GUID + Subnet ID.  The GUID 
> should be your unique port ID and the subnet ID is, well, the subnet ID.  :-)
>
> Changing either of these in IB is an administrative function, not a 
> user-level function.  Meaning: I'm *guessing* that the same is true for RoCE 
> -- changing the subnet ID (which is what I'm further guessing you need to do) 
> should be somewhere in the root-level setup for RoCE.  Once you set a 
> different subnet ID, Open MPI should just use it.
>
>
> On Feb 18, 2011, at 8:17 AM, Michael Shuey wrote:
>
>> I've been looking into Open MPI's support for RoCE (Mellanox's recent
>> InfiniBand-over-Ethernet) lately.  While it's promising, I've hit a
>> snag: RoCE requires lossless Ethernet, and on my switches the only way
>> to guarantee this is with CoS.  RoCE adapters cannot emit CoS priority
>> tags unless the client program selects an IB service level and uses a
>> non-default GID.
>>
>> There's a command-line option in Open MPI to pick an IB SL, but I
>> can't find one for picking a different GID.  Does this exist for the
>> openib btl?  Or am I going about this the wrong way?
>>
>> --
>> Mike Shuey
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
