Hi Chris,

Could you try this to see if it fixes the problem? Thanks!

Regards,
Boqun

On Mon, Nov 02, 2020 at 01:37:41PM +0800, Boqun Feng wrote:
> Chris Wilson reported a problem spotted by check_chain_key(): a chain
> key got changed in validate_chain() because we modify ->read in
> validate_chain() to skip the dependency-adding checks, and ->read has
> been part of the chain key calculation since commit f611e8cf98ec
> ("lockdep: Take read/write status in consideration when generate
> chainkey").
> 
> Fix this by no longer modifying ->read in validate_chain(), based on two
> facts: a) since we now support recursive read lock detection, there is
> no need to skip the dependency-adding checks for recursive readers; b)
> given a), only one case (nest_lock) remains where we want to skip checks
> in validate_chain(). So simply remove the modification of ->read and
> rely on the return value of check_deadlock() to skip the dependency
> adding.
> 
> Reported-by: Chris Wilson <ch...@chris-wilson.co.uk>
> Signed-off-by: Boqun Feng <boqun.f...@gmail.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> ---
> Peter,
> 
> I managed to get a reproducer for the problem Chris reported, please see
> patch #2. With this patch, that problem gets fixed.
> 
> This small patchset is based on your locking/core, patch #2 actually
> relies on your "s/raw_spin/spin" changes, thanks for taking care of that
> ;-)
> 
> Regards,
> Boqun
> 
>  kernel/locking/lockdep.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 3e99dfef8408..a294326fd998 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -2765,7 +2765,9 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
>   * (Note that this has to be done separately, because the graph cannot
>   * detect such classes of deadlocks.)
>   *
> - * Returns: 0 on deadlock detected, 1 on OK, 2 on recursive read
> + * Returns: 0 on deadlock detected, 1 on OK, 2 if another lock with the same
> + * lock class is held but nest_lock is also held, i.e. we rely on the
> + * nest_lock to avoid the deadlock.
>   */
>  static int
>  check_deadlock(struct task_struct *curr, struct held_lock *next)
> @@ -2788,7 +2790,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next)
>                * lock class (i.e. read_lock(lock)+read_lock(lock)):
>                */
>               if ((next->read == 2) && prev->read)
> -                     return 2;
> +                     continue;
>  
>               /*
>                * We're holding the nest_lock, which serializes this lock's
> @@ -3592,16 +3594,13 @@ static int validate_chain(struct task_struct *curr,
>  
>               if (!ret)
>                       return 0;
> -             /*
> -              * Mark recursive read, as we jump over it when
> -              * building dependencies (just like we jump over
> -              * trylock entries):
> -              */
> -             if (ret == 2)
> -                     hlock->read = 2;
>               /*
>                * Add dependency only if this lock is not the head
> -              * of the chain, and if it's not a secondary read-lock:
> +              * of the chain, and if the new lock introduces no more
> +              * lock dependency (because we already hold a lock with the
> +              * same lock class) nor deadlock (because the nest_lock
> +              * serializes nesting locks), see the comments for
> +              * check_deadlock().
>                */
>               if (!chain_head && ret != 2) {
>                       if (!check_prevs_add(curr, hlock))
> -- 
> 2.28.0
> 

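For anyone following along, the failure mode described in the commit message can be illustrated with a small user-space sketch. This is not the kernel code: the toy_* names, the struct layout, the 16-bit shift and the FNV-style mixer are all assumptions made up for illustration (lockdep uses its own iterate_chain_key() and held_lock layout). The only point it tries to show is that once ->read is folded into the chain key, writing ->read after the key was computed makes a later recomputation disagree with the stored key, which is the kind of mismatch check_chain_key() complains about.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct held_lock: only the fields this sketch needs. */
struct toy_held_lock {
	unsigned int class_idx;	/* lock class index */
	unsigned int read;	/* 0: write, 1: read, 2: recursive read */
};

/*
 * Hypothetical per-lock id that folds the read status into the class
 * index. The shift of 16 is an assumption chosen for this sketch.
 */
static uint32_t toy_hlock_id(const struct toy_held_lock *h)
{
	return h->class_idx | (h->read << 16);
}

/*
 * Hypothetical mixer (FNV-1a style). The kernel uses a different hash;
 * any mixer that depends on the id shows the same effect.
 */
static uint64_t toy_iterate_chain_key(uint64_t key, uint32_t id)
{
	key ^= id;
	key *= 0x100000001b3ULL;
	return key;
}

/* Recompute the chain key from scratch over the held-lock stack. */
static uint64_t toy_chain_key(const struct toy_held_lock *held, size_t depth)
{
	uint64_t key = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < depth; i++)
		key = toy_iterate_chain_key(key, toy_hlock_id(&held[i]));
	return key;
}

/*
 * Compute the key at "acquire" time, then overwrite ->read on the top
 * entry (what the removed "hlock->read = 2" did) and recompute.
 * Returns 1 if the two keys still match, 0 if they diverged.
 */
static int toy_key_survives_read_update(unsigned int old_read,
					unsigned int new_read)
{
	struct toy_held_lock held[2] = { { 1, 0 }, { 2, old_read } };
	uint64_t at_acquire = toy_chain_key(held, 2);

	held[1].read = new_read;
	return toy_chain_key(held, 2) == at_acquire;
}
```

In this sketch, toy_key_survives_read_update(1, 2) returns 0 (the keys diverge), while toy_key_survives_read_update(1, 1) returns 1. Leaving ->read untouched and skipping the dependency adding based on the check_deadlock() return value, as the patch does, avoids the divergence.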