I've been looking into OpenMPI's support for RoCE (Mellanox's recent
InfiniBand-over-Ethernet) lately. While it's promising, I've hit a
snag: RoCE requires lossless Ethernet, and on my switches the only way
to guarantee this is with CoS. RoCE adapters cannot emit CoS priority
tags unless the clie
essing you need to do)
> should be somewhere in the root-level setup for RoCE. Once you set a
> different subnet ID, Open MPI should just use it.
>
>
> On Feb 18, 2011, at 8:17 AM, Michael Shuey wrote:
>
>> I've been looking into OpenMPI's support for RoCE (M
Fri, Feb 18, 2011 at 3:44 PM, Jeff Squyres wrote:
> On Feb 18, 2011, at 1:39 PM, Michael Shuey wrote:
>
>> RoCE HCAs keep a GID table, like normal HCAs. Every time you bring up
>> a VLAN interface, another entry gets automatically added to the table.
>> If I select one o
means that in order to select network X or Y, you
>> may use IP/netmask (btl_openib_ipaddr_include).
>>
>> Pavel (Pasha) Shamis
>> ---
>> Application Performance Tools Group
>> Computer Science and Math Division
>> Oak Ridge National Laboratory
>>
Late yesterday I did have a chance to test the patch Jeff provided
(against 1.4.3 - testing 1.5.x is on the docket for today). While it
works, in that I can specify a gid_index, it doesn't do everything
required - my traffic won't match a lossless CoS on the Ethernet
switch. Specifying a GID is o
squy...@cisco.com]
> Sent: Thursday, February 24, 2011 3:45 PM
> To: Michael Shuey
> Cc: Open MPI Users, Mike Dubman
> Subject: Re: [OMPI users] RoCE (IBoE) & OpenMPI
>
> On Feb 24, 2011, at 8:00 AM, Michael Shuey wrote:
>
>> Late yesterday I did have a chance to test t
?
>
>
> On Mar 1, 2011, at 7:35 AM, Michael Shuey wrote:
>
>> So, since RoCE has no SM, and setting an SL is required to get
>> lossless Ethernet on Cisco switches (and possibly others), does this
>> mean that RoCE will never work correctly with OpenMPI on Cisco
>> ha
Alternatively, if OpenMPI is really trying to use both ports, you
could force it to use just one port with --mca btl_openib_if_include
mlx4_0:1 (probably).
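
A minimal sketch of how that command line could be assembled from a
script, assuming the HCA is named mlx4_0 and port 1 is the one you
want. The helper name build_mpirun_command, the process count, and the
program path ./a.out are hypothetical placeholders; only the
btl_openib_if_include parameter and the device:port value come from
the suggestion above.

```python
import shlex


def build_mpirun_command(program, num_procs, device="mlx4_0", port=1):
    """Return an mpirun argv that restricts the openib BTL to one HCA port.

    device and port are assumptions; check your own device names
    (for example with ibv_devinfo) before relying on them.
    """
    return [
        "mpirun",
        "-np", str(num_procs),
        # Limit the openib BTL to a single device:port pair.
        "--mca", "btl_openib_if_include", f"{device}:{port}",
        program,
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(shlex.join(build_mpirun_command("./a.out", 2)))
```

Building the arguments as a list keeps the device:port value as a
single argument, and makes it easy to swap in a different port if
mlx4_0:1 turns out not to be the right one.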
--
Mike Shuey
On Tue, Mar 1, 2011 at 1:02 PM, Jeff Squyres wrote:
> On Feb 28, 2011, at 12:49 PM, Jagga Soorma wrote:
>
>> -bash-3.2$ mpiex
I'm using RoCE (or rather, attempting to) and need to select a
non-default GID to get my traffic properly classified. Both 1.4.4rc2
and 1.5.4 support the btl_openib_ipaddr_include option, but only 1.5.4
causes my traffic to use the proper GID and VLAN.
Is there something broken with ipaddr_includ