From: Toni Peltonen <pel...@peltzi.fi>
Date: Tue, 27 Nov 2018 16:56:57 +0200

> Previously, when unbinding a slave, the 802.3ad implementation only told the
> partner that the port is not suitable for aggregation by setting the port
> aggregation state from aggregatable to individual. This is not enough. If the
> physical layer still stays up and we only unbound this port from the bond, there
> is nothing in the aggregation status alone to prevent the partner from sending
> traffic towards us. To ensure that the partner doesn't consider this
> port at all anymore we should also disable collecting and distributing to
> signal that this actor is going away. Also clear AD_STATE_SYNCHRONIZATION to
> ensure partner exits collecting + distributing state.
> 
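A minimal userspace sketch of the state change described above, for
illustration only. The bit values follow the Actor_State layout in IEEE
802.3ad and the AD_STATE_* names follow the commit message; the helper names
unbind_state_old() and unbind_state_new() are hypothetical, and this is not
the patch itself.

```c
#include <stdint.h>

/* Actor state bits at the positions IEEE 802.3ad assigns them. */
#define AD_STATE_LACP_ACTIVITY   0x01
#define AD_STATE_LACP_TIMEOUT    0x02
#define AD_STATE_AGGREGATION     0x04
#define AD_STATE_SYNCHRONIZATION 0x08
#define AD_STATE_COLLECTING      0x10
#define AD_STATE_DISTRIBUTING    0x20

/* Hypothetical helper: the old behaviour, which only marks the port as
 * individual by clearing the aggregation bit. */
static inline uint8_t unbind_state_old(uint8_t state)
{
	return (uint8_t)(state & ~AD_STATE_AGGREGATION);
}

/* Hypothetical helper: the behaviour the commit message describes, which
 * also clears synchronization, collecting and distributing so the partner
 * can see that this actor is going away. */
static inline uint8_t unbind_state_new(uint8_t state)
{
	return (uint8_t)(state & ~(AD_STATE_AGGREGATION |
				   AD_STATE_SYNCHRONIZATION |
				   AD_STATE_COLLECTING |
				   AD_STATE_DISTRIBUTING));
}
```

For an active port that is aggregatable, in sync, collecting and
distributing (0x3d), the old helper leaves 0x39, so the advertised state
still says collecting + distributing. The new helper leaves only the
activity bit (0x01).
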
> I have tested this behaviour against Arista EOS switches with mlx5 cards
> (physical link stays up even when interface is down) and simulated
> the same situation virtually Linux <-> Linux with two network namespaces
> running two veth device pairs. In both cases setting aggregation to
> individual alone doesn't prevent traffic from being sent towards this
> port, given that the link stays up at the partner's end. The partner still keeps
> its end in collecting + distributing state and continues until the timeout is
> reached. In most cases this means we are losing the traffic partner sends
> towards our port while we wait for timeout. This is most visible with slow
> periodic time (LACP rate slow).
> 
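A rough sketch of why the slow rate makes this most visible, assuming the
partner uses the standard IEEE 802.3ad timers (periodic time of 1 s fast or
30 s slow, timeout after three missed periods). The function name and the
multiplier macro are hypothetical.

```c
/* Periodic transmission intervals from IEEE 802.3ad, in seconds. */
#define AD_FAST_PERIODIC_TIME 1
#define AD_SLOW_PERIODIC_TIME 30

/* Hypothetical name: the standard times a partner out after three
 * periodic intervals without a LACPDU. */
#define AD_TIMEOUT_MULTIPLIER 3

/* Worst-case number of seconds the partner may keep sending traffic to a
 * port that silently went away, if it only reacts to the timeout. */
static inline int partner_timeout_secs(int lacp_rate_fast)
{
	int period = lacp_rate_fast ? AD_FAST_PERIODIC_TIME
				    : AD_SLOW_PERIODIC_TIME;

	return AD_TIMEOUT_MULTIPLIER * period;
}
```

Under these assumptions the window is up to 3 seconds with LACP rate fast
and up to 90 seconds with LACP rate slow.
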
> Other open source implementations like Open vSwitch and libreswitch, and
> vendor implementations like Arista EOS, seem to disable collecting +
> distributing when doing a similar port disabling/detaching/removing change.
> With this patch the kernel implementation would behave the same way and ensure
> the partner doesn't consider our actor viable anymore.
> 
> Signed-off-by: Toni Peltonen <pel...@peltzi.fi>
> ---
> v2 changes:
> * Fix typo in commit message
> * Also clear AD_STATE_SYNCHRONIZATION

Applied, thank you.
