Thinking it might be a driver issue, I configured the same VMs on VirtualBox
using Intel PRO/1000 MT Desktop (82540EM) adapters.  Sadly, the results were
the same.

Is there anything I can do with my configuration that may change the results?

Jacob

-----Original Message-----
From: Nussbaum, Jacob 
Sent: Tuesday, November 25, 2014 8:50 PM
To: Jay Vosburgh
Cc: 'discuss@openvswitch.org'
Subject: RE: [ovs-discuss] LACP bonding issue

Jay,
First off thank you for your response.  

I made the changes, but I still have the same issue.  At first I changed the
setup slightly and set the interface type for the VXLAN tunnels to the devices
they would be using for tunneling.  When that didn't work, I went back to the
previous configuration.

It seems to work when I leave LACP off and use balance-slb as my bond_mode,
but again it only transmits traffic across one link at a time.
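
For reference, the knobs I'm toggling look roughly like this (bond0 is the port
name from the ovs-vsctl output quoted further down; the exact invocations here
are a reconstruction rather than my literal command history):

    # "works", but only one link ever carries traffic at a time
    ovs-vsctl set port bond0 lacp=off bond_mode=balance-slb

    # fails: LACP negotiation leaves every slave but one disabled
    ovs-vsctl set port bond0 lacp=active bond_mode=balance-tcp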

In case it helps, these VMs are Ubuntu 14.04 guests running on a Windows
Server 2012 R2 Hyper-V host.  Eth1, Eth2, Eth3, and Eth4 on both VMs are each
on their own separate network within Hyper-V.  I was thinking it might be a
driver issue with Hyper-V, but wasn't sure whether there was a mistake in my
configuration.

Jacob
________________________________________
From: Jay Vosburgh [jay.vosbu...@canonical.com]
Sent: Tuesday, November 25, 2014 12:15 PM
To: Nussbaum, Jacob
Cc: 'discuss@openvswitch.org'
Subject: Re: [ovs-discuss] LACP bonding issue

Nussbaum, Jacob <jdnu...@ilstu.edu> wrote:

>I'm sending this out again hoping someone has an idea, because this is
>baffling me.  I have attached my configuration for both VMs.
>
>I'm trying to configure a bonded tunnel between the two VMs (two bridges,
>S1 and br0).
>
>Each time I set lacp=active, all of my links besides one are disabled. The
>bridges are running on two separate VMs using VXLAN tunnels.
>
>Anyone seen anything similar or know of something that may correct this 
>issue?

        I haven't used the OVS LACP implementation much, but I have some 
experience with the bonding LACP implementation, and I see something in your 
status that looks familiar, below.

>user@docker:~$ sudo ovs-vsctl show
>a1a5cdb9-0815-4a70-93f6-6d0eb8d6d32c
>    Bridge "br0"
>        Port "bond0"
>            Interface "vxlan3"
>                type: vxlan
>                options: {remote_ip="10.0.0.12"}
>            Interface "vxlan1"
>                type: vxlan
>                options: {remote_ip="10.0.0.10"}
>            Interface "vxlan4"
>                type: vxlan
>                options: {remote_ip="10.0.0.13"}
>            Interface "vxlan2"
>                type: vxlan
>                options: {remote_ip="10.0.0.11"}
>        Port "vxlan5"
>            Interface "vxlan5"
>        Port "vxlan6"
>            Interface "vxlan6"
>        Port "br0"
>            Interface "br0"
>                type: internal
>        Port "vxlan8"
>            Interface "vxlan8"
>        Port "vxlan7"
>            Interface "vxlan7"
>    ovs_version: "2.0.2"
>user@docker:~$ sudo ovs-appctl bond/show bond0
>---- bond0 ----
>bond_mode: balance-tcp
>bond-hash-basis: 0
>updelay: 0 ms
>downdelay: 0 ms
>next rebalance: 2120 ms
>lacp_status: negotiated
>
>slave vxlan1: disabled
>        may_enable: false
>
>slave vxlan2: enabled
>        active slave
>        may_enable: true
>
>slave vxlan3: disabled
>        may_enable: false
>
>slave vxlan4: disabled
>        may_enable: false
>
>user@docker:~$ sudo ovs-appctl lacp/show
>---- bond0 ----
>        status: passive negotiated
>        sys_id: 9e:78:7e:1f:09:44
>        sys_priority: 65534
>        aggregation key: 1
>        lacp_time: slow
>
>slave: vxlan1: defaulted detached
>        port_id: 1
>        port_priority: 65535
>        may_enable: false
>
>        actor sys_id: 9e:78:7e:1f:09:44
>        actor sys_priority: 65534
>        actor port_id: 1
>        actor port_priority: 65535
>        actor key: 1
>        actor state: aggregation defaulted
>
>        partner sys_id: 00:00:00:00:00:00
>        partner sys_priority: 0
>        partner port_id: 0
>        partner port_priority: 0
>        partner key: 0
>        partner state:

        In my experience with the bonding LACP implementation, the above 
(partner MAC, et al, all zeroes) indicates that the port in question is not 
receiving LACPDUs from the link partner.
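
        One way to check (a sketch on my part, not taken from your output) is
to capture on the underlay NIC of the receiving VM and look for the
encapsulated LACPDUs; this assumes the tunnels use the default VXLAN UDP port
4789, and eth1 stands in for whichever NIC carries the 10.0.0.x addresses:

    # watch for VXLAN-encapsulated traffic arriving from the peer
    sudo tcpdump -ni eth1 udp port 4789

    # the inner frames of interest are LACPDUs: destination MAC
    # 01:80:c2:00:00:02, ethertype 0x8809 (Slow Protocols)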

>slave: vxlan2: current attached
>        port_id: 2
>        port_priority: 65535
>        may_enable: true
>
>        actor sys_id: 9e:78:7e:1f:09:44
>        actor sys_priority: 65534
>        actor port_id: 2
>        actor port_priority: 65535
>        actor key: 1
>        actor state: aggregation synchronized collecting distributing
>
>        partner sys_id: aa:88:94:85:19:43
>        partner sys_priority: 65534
>        partner port_id: 6
>        partner port_priority: 65535
>        partner key: 5
>        partner state: activity aggregation synchronized collecting 
> distributing

        This port is presumably exchanging LACPDUs correctly with its link
partner, because its partner values are filled in with specific information
that matches the vm2.txt you supplied (e.g., in vm2.txt, port_id 6 shows this
port as its partner).

        LACPDUs are sent as Ethernet multicasts to a specific destination
address, so I would expect that, for a given configuration with similar ports,
either all of them would be delivered or none would be.  It is very curious
that only one port functions; perhaps something in the OVS or VXLAN forwarding
is confused by the single destination MAC address used for all LACPDUs (I
don't know; I'm just speculating here).
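
        If the forwarding is dropping them, the datapath flows may show it;
something along these lines (again just a guess on my part; the grep target is
the reserved Slow Protocols multicast address that LACPDUs are sent to):

    sudo ovs-dpctl dump-flows | grep -i 01:80:c2:00:00:02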

        There is a fallback mechanism in the 802.1AX standard that would permit 
one port of an aggregator to function as an individual port when no LACPDUs are 
exchanged at all, but that doesn't appear to be the case here (as in that case 
there would be no partner sys_id, et al).

        I also notice that one bond (vm1.txt) is in LACP passive mode, and the
other (vm2.txt) is active.  In principle this ought to be fine (the passive
bond responds to LACPDUs received from the active bond), but I would suggest
setting both ends to LACP active and seeing if that helps.  It might, and it
shouldn't break anything.

        Also, setting the LACP rate (lacp_time) to fast instead of slow should 
make things converge more quickly for testing purposes.
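
        In OVS terms, on both VMs, that would be something along the lines of
(bond0 being the port name from your bond/show output):

    ovs-vsctl set port bond0 lacp=active
    ovs-vsctl set port bond0 other_config:lacp-time=fast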

        -J

---
        -Jay Vosburgh, jay.vosbu...@canonical.com
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
