On Wed, May 27, 2020 at 11:55:58AM +0530, Sumit Garg wrote:
> While rounding up CPUs via NMIs, its possible that a rounded up CPU

This problem does not just impact NMI roundup (breakpoints, including
the implicit breakpoint-on-oops, can have the same effect).


> maybe holding a console port lock leading to kgdb master CPU stuck in
> a deadlock during invocation of console write operations. So in order
> to avoid such a deadlock, enable oops_in_progress prior to invocation
> of console handlers.
> 
> Suggested-by: Petr Mladek <pmla...@suse.com>
> Signed-off-by: Sumit Garg <sumit.g...@linaro.org>
> ---
>  kernel/debug/kdb/kdb_io.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 349dfcc..f848482 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -566,7 +566,17 @@ static void kdb_msg_write(char *msg, int msg_len)
>       for_each_console(c) {
>               if (!(c->flags & CON_ENABLED))
>                       continue;
> +             /*
> +              * While rounding up CPUs via NMIs, its possible that

Ditto.

> +              * a rounded up CPU maybe holding a console port lock
> +              * leading to kgdb master CPU stuck in a deadlock during
> +              * invocation of console write operations. So in order
> +              * to avoid such a deadlock, enable oops_in_progress
> +              * prior to invocation of console handlers.

Actually, looking at this comment as a whole, I think it spends too many
words on what and not enough on why (e.g. what the tradeoffs are and
why we are not using bust_spinlocks(), which would be a more obvious
approach).

  Set oops_in_progress to encourage the console drivers to disregard
  their internal spin locks: in the current calling context
  the risk of deadlock is a bigger problem than risks due to
  re-entering the console driver. We operate directly on
  oops_in_progress rather than using bust_spinlocks() because
  the calls bust_spinlocks() makes on exit are not appropriate
  for this calling context.
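
To make the tradeoff concrete, here is a rough userspace sketch (not
kernel code). It assumes a console driver that falls back to a trylock
when oops_in_progress is set, which is the pattern many serial console
drivers follow. Everything except the name oops_in_progress is a
hypothetical stand-in: port_lock, port_out, mock_console_write(),
kdb_style_write() and demo() exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins; none of this is real kernel API. */
static int oops_in_progress;
static atomic_flag port_lock = ATOMIC_FLAG_INIT;
static char port_out[64];	/* what the pretend UART has emitted */

static bool port_trylock(void) { return !atomic_flag_test_and_set(&port_lock); }
static void port_unlock(void) { atomic_flag_clear(&port_lock); }

/*
 * Hypothetical console write. Returns false where a real driver would
 * have spun forever on a lock owned by a stopped CPU.
 */
static bool mock_console_write(const char *msg, size_t len)
{
	bool locked = port_trylock();
	size_t used;

	if (!locked && !oops_in_progress)
		return false;	/* real driver: spin_lock() never returns */

	/* We may be re-entering the driver unlocked: the accepted risk. */
	used = strlen(port_out);
	if (len > sizeof(port_out) - 1 - used)
		len = sizeof(port_out) - 1 - used;
	memcpy(port_out + used, msg, len);
	port_out[used + len] = '\0';

	if (locked)
		port_unlock();
	return true;
}

/* Mirrors the loop body this patch adds to kdb_msg_write(). */
static bool kdb_style_write(const char *msg, size_t len)
{
	bool ok;

	++oops_in_progress;
	ok = mock_console_write(msg, len);
	--oops_in_progress;
	return ok;
}

/* A "rounded up" CPU holds the port lock while the master CPU writes. */
static void demo(void)
{
	bool held = port_trylock();
	bool plain = mock_console_write("x", 1);
	bool flagged = kdb_style_write("kdb> ", 5);

	assert(held);
	assert(!plain);		/* without the flag we would deadlock */
	assert(flagged);	/* with the flag the message gets out */
	assert(strcmp(port_out, "kdb> ") == 0);
	port_unlock();
}
```

Note that the flag is only raised around the write itself and is
restored by a plain decrement, so nothing beyond the counter is touched
on the way out.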


Daniel.


> +              */
> +             ++oops_in_progress;
>               c->write(c, msg, msg_len);
> +             --oops_in_progress;
>               touch_nmi_watchdog();
>       }
>  }
> -- 
> 2.7.4
> 
