> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of
> Przemek Kitszel
> Sent: Friday, April 4, 2025 12:23 PM
> To: [email protected]; Nguyen, Anthony L
> <[email protected]>
> Cc: [email protected]; Zaki, Ahmed <[email protected]>;
> Loktionov, Aleksandr <[email protected]>; Stanislav Fomichev
> <[email protected]>; Kitszel, Przemyslaw <[email protected]>;
> Keller, Jacob E <[email protected]>; Jakub Kicinski <[email protected]>
> Subject: [Intel-wired-lan] [PATCH iwl-net 6/6] iavf: get rid of the crit lock
> 
> Get rid of the crit lock.
> That frees us from the error prone logic of try_locks.
> 
> Thanks to netdev_lock() by Jakub it is now easy, and in most cases we were
> protected by it already - replace crit lock by netdev lock when it was not the
> case.
> 
> Lockdep reports that we should cancel the work under crit_lock [splat1], and
> that was the scheme we have mostly followed since [1] by Slawomir.
> But when that is done we still got into deadlocks [splat2]. So instead we
> should look at the bigger problem, namely "weird locking/scheduling"
> of the iavf. The first step to fix that is to remove the crit lock.
> I will followup with a -next series that simplifies scheduling/tasks.
> 
> Cancel the work without netdev lock (weird unlock+lock scheme), to fix the
> [splat2] (which would be totally ugly if we would kept the crit lock).
> 
> Extend protected part of iavf_watchdog_task() to include scheduling more
> work.
> 
> Note that the removed comment in iavf_reset_task() was misplaced, it
> belonged to inside of the removed if condition, so it's gone now.
> 
> [splat1] - w/o this patch - The deadlock during VF removal:
>      WARNING: possible circular locking dependency detected
>      sh/3825 is trying to acquire lock:
>       ((work_completion)(&(&adapter->watchdog_task)->work)){+.+.}-{0:0}, at:
> start_flush_work+0x1a1/0x470
>           but task is already holding lock:
>       (&adapter->crit_lock){+.+.}-{4:4}, at: iavf_remove+0xd1/0x690 [iavf]
>           which lock already depends on the new lock.
> 
> [splat2] - when cancelling work under crit lock, w/o this series,
>          see [2] for the band aid attempt
>     WARNING: possible circular locking dependency detected
>     sh/3550 is trying to acquire lock:
>     ((wq_completion)iavf){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90
>         but task is already holding lock:
>     (&dev->lock){+.+.}-{4:4}, at: iavf_remove+0xa6/0x6e0 [iavf]
>         which lock already depends on the new lock.
> 
> [1] fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation") [2]
> https://github.com/pkitszel/linux/commit/52dddbfc2bb60294083f5711a15
> 8a
> Fixes: d1639a17319b ("iavf: fix a deadlock caused by rtnl and driver's lock
> circular dependencies")
> Signed-off-by: Przemek Kitszel <[email protected]>
> ---
> CC: Jacob Keller <[email protected]>
> CC: Jakub Kicinski <[email protected]>
> CC: Ahmed Zaki <[email protected]>
> CC: Michal Schmidt <[email protected]>
> CC: Aleksandr Loktionov <[email protected]>
> ---
>  drivers/net/ethernet/intel/iavf/iavf.h        |   1 -
>  .../net/ethernet/intel/iavf/iavf_ethtool.c    |  23 +--
>  drivers/net/ethernet/intel/iavf/iavf_main.c   | 165 ++++--------------
>  3 files changed, 38 insertions(+), 151 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/iavf/iavf.h
> b/drivers/net/ethernet/intel/iavf/iavf.h
> index 9de3e0ba3731..f7a98ff43a57 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf.h
> +++ b/drivers/net/ethernet/intel/iavf/iavf.h
> @@ -268,7 +268,6 @@ struct iavf_adapter {


Tested-by: Rafal Romanowski <[email protected]>


Reply via email to