Toni Peltonen <pel...@peltzi.fi> wrote:

>Previously when unbinding a slave the 802.3ad implementation only told
>partner that the port is not suitable for aggregation by setting the port
>aggregation state from aggregatable to individual. This is not enough. If the
>physical layer still stays up and we only unbinded this port from the bond 
>there
>is nothing in the aggregation status alone to prevent the partner from sending
>traffic towards us. To ensure that the partner doesn't consider this
>port at all anymore we should also disable collecting and distributing to
>signal that this actor is going away. Also clear AD_STATE_SYNCHRONIZATION to
>ensure partner exits collecting + distributing state.
>
>I have tested this behaviour againts Arista EOS switches with mlx5 cards
>(physical link stays up even when interface is down) and simulated
>the same situation virtually Linux <-> Linux with two network namespaces
>running two veth device pairs. In both cases setting aggregation to
>individual doesn't alone prevent traffic from being to sent towards this
>port given that the link stays up in partners end. Partner still keeps
>it's end in collecting + distributing state and continues until timeout is
>reached. In most cases this means we are losing the traffic partner sends
>towards our port while we wait for timeout. This is most visible with slow
>periodic time (LACP rate slow).
>
>Other open source implementations like Open VSwitch and libreswitch, and
>vendor implementations like Arista EOS, seem to disable collecting +
>distributing to when doing similar port disabling/detaching/removing change.
>With this patch kernel implementation would behave the same way and ensure
>partner doesn't consider our actor viable anymore.
>
>Signed-off-by: Toni Peltonen <pel...@peltzi.fi>

Signed-off-by: Jay Vosburgh <jay.vosbu...@canonical.com>

>---
>v2 changes:
>* Fix typo in commit message
>* Also clear AD_STATE_SYNCHRONIZATION
>---
> drivers/net/bonding/bond_3ad.c | 3 +++
> 1 file changed, 3 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index f43fb2f958a5..93dfcef8afc4 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -2086,6 +2086,9 @@ void bond_3ad_unbind_slave(struct slave *slave)
>                  aggregator->aggregator_identifier);
> 
>       /* Tell the partner that this port is not suitable for aggregation */
>+      port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION;
>+      port->actor_oper_port_state &= ~AD_STATE_COLLECTING;
>+      port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING;
>       port->actor_oper_port_state &= ~AD_STATE_AGGREGATION;
>       __update_lacpdu_from_port(port);
>       ad_lacpdu_send(port);
>-- 
>2.19.0
>

Reply via email to