On Wed, Dec 02, 2020 at 16:39, Jay Vosburgh <jay.vosbu...@canonical.com> wrote:
> Tobias Waldekranz <tob...@waldekranz.com> wrote:
>>I could look at hoisting up the linking op before the first
>>notification. My main concern is that this is a new subsystem to me, so
>>I am not sure how to determine the adequate test coverage for a change
>>like this.
>>
>>Another option would be to drop this change from this series and do it
>>separately. It would be nice to have both team and bond working though.
>>
>>Not sure why I am the first to run into this. Presumably the mlxsw LAG
>>offloading would be affected in the same way. Maybe their main use-case
>>is LACP.
>
>       I'm not sure about mlxsw specifically, but in the configurations
> I see, LACP is by far the most commonly used mode, with active-backup a
> distant second.  I can't recall the last time I saw a production
> environment using balance-xor.

Makes sense. We (Westermo) have a few customers using static LAGs, so it
does happen. That said, LACP is way more common for us as well.

>       I think that in the perfect world there should be exactly one
> such notification, and occurring in the proper sequence.  A quick look
> at the kernel consumers of the NETDEV_CHANGELOWERSTATE event (mlx5,
> mlxsw, and nfp, looks like) suggests that those shouldn't have an issue.
>
>       In user space, however, there are daemons that watch the events,
> and may rely on the current ordering.  Some poking around reveals odd
> bugs in user space when events are rearranged, so I think the prudent
> thing is to not mess with what's there now, and just add the one event
> here (i.e., apply your patch as-is).

This is exactly the sort of thing I was worried about. Thank you so much
for testing it!

Reply via email to