Could you re-enable the SL param (btl_openib_ib_service_level) for
RoCE?  Jeff was kind enough to provide a patch to let me specify the
gid_index, but that doesn't seem to be working.  To get RoCE to work
correctly (at least, on Nexus switches) I'll need to specify both a
gid_index and an IB service level.  I think. :-)

Also, while the rdmacm connection manager is required for RoCE, it's
not selected by default (as it is for iWARP).  You still need to add
it to a config file or the command line, or you get a rather cryptic
error (at least up through Open MPI 1.5.1).
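
For reference, a minimal sketch of what the site-wide settings discussed
in this thread might look like in an MCA parameter file.  Only
btl_openib_ib_service_level is named in this thread; btl_openib_cpc_include
is, as far as I know, the parameter that selects the connection manager,
and btl_openib_gid_index is an assumed name for the gid_index parameter
from Jeff's patch.  The numeric values are placeholders, not
recommendations.

```
# Site-wide MCA parameter file (e.g. openmpi-mca-params.conf).
# All values below are illustrative placeholders.

# Select the RDMA CM connection manager, which RoCE requires.
btl_openib_cpc_include = rdmacm

# Use the Nth GID table entry instead of entry 0.
# Parameter name is an assumption; N must be the same on every node.
btl_openib_gid_index = 1

# IB service level, mapped to the VLAN priority bits.
# Per this thread, this may currently be ignored or disallowed for RoCE.
btl_openib_ib_service_level = 3
```

The same name/value pairs can also be given on the mpirun command line
with --mca.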

--
Mike Shuey



On Mon, Feb 21, 2011 at 12:34 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> Random thought: is there a check to ensure that the SL MCA param is not set 
> in a RoCE environment?  If not, we should probably add a show_help warning if 
> the SL MCA param is set when using RoCE (i.e., that its value will be 
> ignored).
>
>
> On Feb 19, 2011, at 12:22 AM, Shamis, Pavel wrote:
>
>> As far as I remember, we don't allow the user to specify an SL for RoCE. RoCE 
>> is treated as a kind of Ethernet device, and the RDMACM connection manager is 
>> used to set up the connections. That means that to select network X or Y, you 
>> can use ip/netmask (btl_openib_ipaddr_include).
>>
>> Pavel (Pasha) Shamis
>> ---
>> Application Performance Tools Group
>> Computer Science and Math Division
>> Oak Ridge National Laboratory
>>
>>
>>
>>
>>
>>
>> On Feb 18, 2011, at 4:14 PM, Michael Shuey wrote:
>>
>>> Per-node GID & SL settings == bad.  Site-wide GID & SL settings == good.
>>>
>>> If this could be an MCA param (like btl_openib_ib_service_level)
>>> that'd be great - we already have a global config file of similar
>>> params.  We'd definitely want the same N everywhere.
>>>
>>> --
>>> Mike Shuey
>>>
>>>
>>>
>>> On Fri, Feb 18, 2011 at 3:44 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>>> On Feb 18, 2011, at 1:39 PM, Michael Shuey wrote:
>>>>
>>>>> RoCE HCAs keep a GID table, like normal HCAs.  Every time you bring up
>>>>> a vlan interface, another entry gets automatically added to the table.
>>>>> If I select one of these other GIDs, packets get a VLAN tag, and that
>>>>> contains the necessary priority bits (well, assuming I selected the
>>>>> right IB service level, which is mapped to the priority tag in the
>>>>> VLAN header) for the traffic to match a lossless class of service on
>>>>> the switch.
>>>>
>>>> Ah -- I see it now (it's been a looong time since I've looked in Open 
>>>> MPI's verbs code!).  We query and simply take the 0th GID from a given IBV 
>>>> device port's GID table.
>>>>
>>>>> For this to work, I really need for the IB client to select a
>>>>> non-default GID.  A few test programs included in OFED will do this,
>>>>> but I'm not sure OpenMPI will.  Any thoughts?
>>>>
>>>> Yes, we can do this.  It's pretty easy to add an MCA parameter to select 
>>>> the Nth GID rather than always taking the 0th.
>>>>
>>>> To make this simple, can you make it so that the value of N is the same 
>>>> across all nodes in your cluster?  Then you can set a site-wide MCA param 
>>>> for that value of N and be done with this issue.  If we have to have a 
>>>> per-node setting of N, it could get a little hairy (it's do-able, but... 
>>>> it's a heckuva lot easier if N is the same everywhere).
>>>>
>>>> --
>>>> Jeff Squyres
>>>> jsquy...@cisco.com
>>>> For corporate legal information go to:
>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>
>>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
