> -----Original Message----- > From: Intel-wired-lan <[email protected]> On Behalf Of > Przemek Kitszel > Sent: Friday, April 4, 2025 12:23 PM > To: [email protected]; Nguyen, Anthony L > <[email protected]> > Cc: [email protected]; Zaki, Ahmed <[email protected]>; > Loktionov, Aleksandr <[email protected]>; Stanislav Fomichev > <[email protected]>; Kitszel, Przemyslaw <[email protected]>; > Keller, Jacob E <[email protected]>; Jakub Kicinski <[email protected]> > Subject: [Intel-wired-lan] [PATCH iwl-net 6/6] iavf: get rid of the crit lock > > Get rid of the crit lock. > That frees us from the error prone logic of try_locks. > > Thanks to netdev_lock() by Jakub it is now easy, and in most cases we were > protected by it already - replace crit lock by netdev lock when it was not the > case. > > Lockdep reports that we should cancel the work under crit_lock [splat1], and > that was the scheme we have mostly followed since [1] by Slawomir. > But when that is done we still got into deadlocks [splat2]. So instead we > should look at the bigger problem, namely "weird locking/scheduling" > of the iavf. The first step to fix that is to remove the crit lock. > I will followup with a -next series that simplifies scheduling/tasks. > > Cancel the work without netdev lock (weird unlock+lock scheme), to fix the > [splat2] (which would be totally ugly if we would kept the crit lock). > > Extend protected part of iavf_watchdog_task() to include scheduling more > work. > > Note that the removed comment in iavf_reset_task() was misplaced, it > belonged to inside of the removed if condition, so it's gone now. > > [splat1] - w/o this patch - The deadlock during VF removal: > WARNING: possible circular locking dependency detected > sh/3825 is trying to acquire lock: > ((work_completion)(&(&adapter->watchdog_task)->work)){+.+.}-{0:0}, at: > start_flush_work+0x1a1/0x470 > but task is already holding lock: > (&adapter->crit_lock){+.+.}-{4:4}, at: iavf_remove+0xd1/0x690 [iavf] > which lock already depends on the new lock. > > [splat2] - when cancelling work under crit lock, w/o this series, > see [2] for the band aid attempt > WARNING: possible circular locking dependency detected > sh/3550 is trying to acquire lock: > ((wq_completion)iavf){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90 > but task is already holding lock: > (&dev->lock){+.+.}-{4:4}, at: iavf_remove+0xa6/0x6e0 [iavf] > which lock already depends on the new lock. > > [1] fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation") [2] > https://github.com/pkitszel/linux/commit/52dddbfc2bb60294083f5711a15 > 8a > Fixes: d1639a17319b ("iavf: fix a deadlock caused by rtnl and driver's lock > circular dependencies") > Signed-off-by: Przemek Kitszel <[email protected]> > --- > CC: Jacob Keller <[email protected]> > CC: Jakub Kicinski <[email protected]> > CC: Ahmed Zaki <[email protected]> > CC: Michal Schmidt <[email protected]> > CC: Aleksandr Loktionov <[email protected]> > --- > drivers/net/ethernet/intel/iavf/iavf.h | 1 - > .../net/ethernet/intel/iavf/iavf_ethtool.c | 23 +-- > drivers/net/ethernet/intel/iavf/iavf_main.c | 165 ++++-------------- > 3 files changed, 38 insertions(+), 151 deletions(-) > > diff --git a/drivers/net/ethernet/intel/iavf/iavf.h > b/drivers/net/ethernet/intel/iavf/iavf.h > index 9de3e0ba3731..f7a98ff43a57 100644 > --- a/drivers/net/ethernet/intel/iavf/iavf.h > +++ b/drivers/net/ethernet/intel/iavf/iavf.h > @@ -268,7 +268,6 @@ struct iavf_adapter {
Tested-by: Rafal Romanowski <[email protected]>
