Hi,

On Fri, May 15, 2020 at 9:36 AM Sergey Senozhatsky
<sergey.senozhat...@gmail.com> wrote:
>
> On (20/05/15 17:32), Sumit Garg wrote:
> > > Can I please have some context what problem does this solve?
> >
> > You can find the problem description here [1] which leads to this fix.
>
> [..]
>
> > [1] https://lkml.org/lkml/2020/5/12/213
>
> Thanks for the link. I'm slightly surprised it took so many years
> to notice the addition of printk_nmi/printk_safe :)

I haven't looked at all the details, but IIUC we don't normally enter
kgdb on the primary CPU through an NMI context, whereas the secondary
CPUs (on x86) always do. Most things are run on the primary CPU, and I
think it's relatively unlikely for people to change the primary CPU
(though it is possible).

Probably things got worse when I changed the way "btc" worked to make
it common between all architectures. See commit 9ef50a686b53
("UPSTREAM: kdb: Fix stack crawling on 'running' CPUs that aren't the
master"). Though theoretically someone could have changed masters and
reproduced the problem with a simple "bt" before my patch, now a
relatively normal command, "btc", would tickle the problem. I didn't
notice it because I work almost totally on arm/arm64 machines and they
don't have NMI (yet).

In general I've always wondered why (historically) kgdb bugs have
sometimes gone unnoticed for a period of time. That does seem to be
changing, though, and I've seen a few longstanding bugs getting fixed
recently. :-)

-Doug