On Tue, 2025-05-06 at 10:12 -0700, Stanislav Fomichev wrote:
> On 05/06, Cosmin Ratiu wrote:
> > __netdev_update_features() expects the netdevice to be ops-locked,
> > but
> > it gets called recursively on the lower level netdevices to sync
> > their
> > features, and nothing locks those.
> > 
> > This commit fixes that, with the assumption that it shouldn't be
> > possible
> > for both higher-level and lover-level netdevices to require the
> > instance
> > lock, because that would lead to lock dependency warnings.
> > 
> > Without this, playing with higher level (e.g. vxlan) netdevices on
> > top
> > of netdevices with instance locking enabled can run into issues:
> 
> Mentioning vxlan is a bit confusing here; it shouldn't let you flip
> lro (I
> think). Which upper are you testing against?

It is vxlan, but LRO is just a red herring in this case, 
mlx5e_set_features calls every feature handler in turn, and this is
just the example I picked from the sea of stack traces.

> 
> Trying to understand if we can cover this case in the selftests.
> netdevsim also doesn't expose F_LRO feature... (yet?)

I see you found a way with teaming, but I think in essence a sequence
of commands that makes __netdev_update_features call itself recursively
once on the lower dev will trigger the netdev_ops_assert_locked on the
lower dev.

Thanks for the review!

Cosmin.

Reply via email to