From: Toni Peltonen <pel...@peltzi.fi>
Date: Tue, 27 Nov 2018 16:56:57 +0200

> Previously, when unbinding a slave, the 802.3ad implementation only told
> the partner that the port is not suitable for aggregation by setting the
> port aggregation state from aggregatable to individual. This is not
> enough. If the physical layer stays up and we have only unbound this port
> from the bond, there is nothing in the aggregation status alone to
> prevent the partner from sending traffic towards us. To ensure that the
> partner doesn't consider this port at all anymore, we should also disable
> collecting and distributing to signal that this actor is going away.
> Also clear AD_STATE_SYNCHRONIZATION to ensure the partner exits the
> collecting + distributing state.
>
> I have tested this behaviour against Arista EOS switches with mlx5 cards
> (physical link stays up even when interface is down) and simulated
> the same situation virtually, Linux <-> Linux, with two network
> namespaces running two veth device pairs. In both cases, setting
> aggregation to individual alone doesn't prevent traffic from being sent
> towards this port, given that the link stays up at the partner's end.
> The partner still keeps its end in the collecting + distributing state
> and continues until the timeout is reached. In most cases this means we
> lose the traffic the partner sends towards our port while we wait for
> the timeout. This is most visible with the slow periodic time (LACP rate
> slow).
>
> Other open source implementations like Open vSwitch and libreswitch, and
> vendor implementations like Arista EOS, seem to disable collecting +
> distributing when doing a similar port disabling/detaching/removing
> change. With this patch the kernel implementation would behave the same
> way and ensure the partner doesn't consider our actor viable anymore.
>
> Signed-off-by: Toni Peltonen <pel...@peltzi.fi>
> ---
> v2 changes:
> * Fix typo in commit message
> * Also clear AD_STATE_SYNCHRONIZATION

Applied, thank you.
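
For readers following along, the state change described in the commit message can be sketched in user space roughly as below. This is an illustration only, not the applied patch: the helper name actor_state_on_unbind() is hypothetical, and the bit values are assumed to follow the actor state octet layout of IEEE 802.3ad. Only AD_STATE_SYNCHRONIZATION is named in the message above; the other AD_STATE_* names are assumptions chosen to match it.

```c
#include <stdint.h>

/* Actor state bits, assumed to follow the 802.3ad actor state octet. */
#define AD_STATE_LACP_ACTIVITY   0x01
#define AD_STATE_LACP_TIMEOUT    0x02
#define AD_STATE_AGGREGATION     0x04
#define AD_STATE_SYNCHRONIZATION 0x08
#define AD_STATE_COLLECTING      0x10
#define AD_STATE_DISTRIBUTING    0x20

/*
 * Hypothetical helper: compute the actor state to advertise when a port
 * is unbound from the bond. Clearing only AD_STATE_AGGREGATION (the old
 * behaviour) leaves collecting/distributing set, so the partner may keep
 * sending traffic until its timeout expires. Clearing all four bits
 * signals that this actor is going away.
 */
static uint8_t actor_state_on_unbind(uint8_t state)
{
    state &= (uint8_t)~(AD_STATE_AGGREGATION |
                        AD_STATE_SYNCHRONIZATION |
                        AD_STATE_COLLECTING |
                        AD_STATE_DISTRIBUTING);
    return state;
}
```

With an active, in-sync port advertising 0x3d (activity, aggregation, synchronization, collecting, distributing), the old behaviour would advertise 0x39 on unbind, still collecting and distributing, whereas the sketch above yields 0x01.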