On Fri, Nov 06, 2020 at 06:04:15AM +0000, Shinichiro Kawasaki wrote:
> Greetings,
> 
> I observe "WARNING: can't access registers at asm_common_interrupt+0x1e/0x40"
> in my kernel test system repeatedly, which is printed by unwind_next_frame()
> in "arch/x86/kernel/unwind_orc.c". Syzbot already reported that [1]. Similar
> warning was reported and discussed [2], but I suppose the cause is not yet
> clarified.
> 
> The warning was observed with v5.10-rc2 and older tags. I bisected and found
> that the commit 044d0d6de9f5 ("lockdep: Only trace IRQ edges") in v5.9-rc3
> triggered the warning. Reverting that from 5.10-rc2, the warning disappeared.
> May I ask comment by expertise on CC how this commit can relate to the 
> warning?
> 
> The test condition to reproduce the warning is rather unique (blktests,
> dm-linear and ZNS device emulation by QEMU). If any action is suggested for
> further analysis, I'm willing to take it with my test system.
> 
> Wish this report helps.
> 
> [1] https://lkml.org/lkml/2020/9/6/231
> [2] https://lkml.org/lkml/2020/9/8/1538

Shin'ichiro,

Thanks for all the data.  It looks like the ORC unwinder is getting
confused by paravirt patching (with runtime-patched pushf/pop changing
the stack layout).

<user interrupt>
        exit_to_user_mode_prepare()
                exit_to_user_mode_loop()
                        local_irq_disable_exit_to_user()
                                local_irq_disable()
                                        raw_irqs_disabled()
                                                arch_irqs_disabled()
                                                        arch_local_save_flags()
                                                                pushfq
                                                                <another interrupt>

Objtool doesn't know about the pushf/pop paravirt patch, so ORC gets
confused by the changed stack layout.
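
To make the failure mode concrete, here is a toy userspace model (not
kernel code, and not how ORC is actually implemented).  All toy_* names
and the addresses are made up for illustration.  It assumes the unwind
table was generated for the unpatched code, so it expects the return
address on top of the stack, while an interrupt arrives between the
patched-in pushf and its matching pop:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: a stack of 64-bit slots that grows downward. */
#define STACK_SLOTS 16

struct toy_stack {
	uint64_t slot[STACK_SLOTS];
	size_t sp;		/* index of the current top-of-stack slot */
};

/* Hypothetical ORC-like entry: slots between SP and the return address. */
struct toy_orc_entry {
	size_t sp_to_ret;
};

static void toy_push(struct toy_stack *s, uint64_t v)
{
	assert(s->sp > 0);	/* guard against overflowing the toy stack */
	s->slot[--s->sp] = v;
}

/* The unwinder trusts the table: return address = *(SP + sp_to_ret). */
static uint64_t toy_unwind_ret(const struct toy_stack *s,
			       const struct toy_orc_entry *orc)
{
	return s->slot[s->sp + orc->sp_to_ret];
}

/*
 * Returns 1 if the unwinder finds the right return address, 0 otherwise.
 * 'patched_pushf' models an interrupt landing between a runtime-patched
 * pushf and its pop, which the table knows nothing about.
 */
static int toy_unwind_ok(int patched_pushf)
{
	struct toy_stack s = { .sp = STACK_SLOTS };
	/* Entry as generated for the unpatched instruction sequence. */
	struct toy_orc_entry orc = { .sp_to_ret = 0 };
	const uint64_t ret_addr = 0xffffffff81000123ULL;  /* made-up address */

	toy_push(&s, ret_addr);		/* the call pushed the return address */
	if (patched_pushf)
		toy_push(&s, 0x246);	/* pushf: saved flags now sit on top */

	return toy_unwind_ret(&s, &orc) == ret_addr;
}
```

In this model toy_unwind_ok(0) finds the return address, while
toy_unwind_ok(1) reads the saved flags word instead, which is the kind
of mismatch that would leave the unwinder unable to make sense of the
frame.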

I'm thinking we either need to teach objtool how to deal with
save_fl/restore_fl patches, or we need to just get rid of those nasty
patches somehow.  Peter, any thoughts?

It looks like 044d0d6de9f5 ("lockdep: Only trace IRQ edges") is making
the problem more likely, by adding the irqs_disabled() check for every
local_irq_disable().
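
As a rough userspace sketch of why that widens the window (the toy_*
helpers are hypothetical stand-ins, and the shape of the check is an
assumption based on the call chain above, not a copy of the kernel
macro): every local_irq_disable() now reads the flags first, and only
reports an edge to the tracer when irqs were actually enabled.

```c
#include <stdbool.h>

/* Toy state standing in for the CPU interrupt flag and event counters. */
static bool toy_irqs_enabled = true;
static unsigned int toy_save_flags_calls;	/* times the flags were read */
static unsigned int toy_trace_off_calls;	/* times an edge was traced */

/* Stand-in for raw_irqs_disabled(): this is where pushf/pop would run. */
static bool toy_raw_irqs_disabled(void)
{
	toy_save_flags_calls++;
	return !toy_irqs_enabled;
}

static void toy_raw_local_irq_disable(void)
{
	toy_irqs_enabled = false;
}

static void toy_trace_hardirqs_off(void)
{
	toy_trace_off_calls++;
}

/* Assumed shape of the check: read flags, disable, trace only on an edge. */
static void toy_local_irq_disable(void)
{
	bool was_disabled = toy_raw_irqs_disabled();

	toy_raw_local_irq_disable();
	if (!was_disabled)
		toy_trace_hardirqs_off();
}
```

Calling toy_local_irq_disable() twice in a row reads the flags twice
but traces only one edge.  The flags read happens on every call, so the
pushf/pop sequence runs far more often than it would without the check.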

Also - Peter, Nicholas - is that irqs_disabled() check really necessary
in local_irq_disable()?  Presumably irqs would typically be enabled
before calling it?

-- 
Josh
