Toni Peltonen <pel...@peltzi.fi> wrote: >Previously when unbinding a slave the 802.3ad implementation only told >partner that the port is not suitable for aggregation by setting the port >aggregation state from aggregatable to individual. This is not enough. If the >physical layer still stays up and we only unbinded this port from the bond >there >is nothing in the aggregation status alone to prevent the partner from sending >traffic towards us. To ensure that the partner doesn't consider this >port at all anymore we should also disable collecting and distributing to >signal that this actor is going away. Also clear AD_STATE_SYNCHRONIZATION to >ensure partner exits collecting + distributing state. > >I have tested this behaviour againts Arista EOS switches with mlx5 cards >(physical link stays up even when interface is down) and simulated >the same situation virtually Linux <-> Linux with two network namespaces >running two veth device pairs. In both cases setting aggregation to >individual doesn't alone prevent traffic from being to sent towards this >port given that the link stays up in partners end. Partner still keeps >it's end in collecting + distributing state and continues until timeout is >reached. In most cases this means we are losing the traffic partner sends >towards our port while we wait for timeout. This is most visible with slow >periodic time (LACP rate slow). > >Other open source implementations like Open VSwitch and libreswitch, and >vendor implementations like Arista EOS, seem to disable collecting + >distributing to when doing similar port disabling/detaching/removing change. >With this patch kernel implementation would behave the same way and ensure >partner doesn't consider our actor viable anymore. > >Signed-off-by: Toni Peltonen <pel...@peltzi.fi>
Signed-off-by: Jay Vosburgh <jay.vosbu...@canonical.com> >--- >v2 changes: >* Fix typo in commit message >* Also clear AD_STATE_SYNCHRONIZATION >--- > drivers/net/bonding/bond_3ad.c | 3 +++ > 1 file changed, 3 insertions(+) > >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c >index f43fb2f958a5..93dfcef8afc4 100644 >--- a/drivers/net/bonding/bond_3ad.c >+++ b/drivers/net/bonding/bond_3ad.c >@@ -2086,6 +2086,9 @@ void bond_3ad_unbind_slave(struct slave *slave) > aggregator->aggregator_identifier); > > /* Tell the partner that this port is not suitable for aggregation */ >+ port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION; >+ port->actor_oper_port_state &= ~AD_STATE_COLLECTING; >+ port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING; > port->actor_oper_port_state &= ~AD_STATE_AGGREGATION; > __update_lacpdu_from_port(port); > ad_lacpdu_send(port); >-- >2.19.0 >