Hi, Sven Schnelle <sv...@linux.ibm.com> writes:
> Hi Peter, > > Peter Zijlstra <pet...@infradead.org> writes: > >> On Sun, Nov 29, 2020 at 11:31:41AM -0800, Linus Torvalds wrote: >>> On Sun, Nov 29, 2020 at 5:38 AM Thomas Gleixner <t...@linutronix.de> wrote: >>> > >>> > Yet two more places which invoke tracing from RCU disabled regions in the >>> > idle path. Similar to the entry path the low level idle functions have to >>> > be non-instrumentable. >>> >>> This really seems less than optimal. >>> >>> In particular, lookie here: >>> >>> > @@ -94,9 +94,35 @@ void __cpuidle default_idle_call(void) >>> > >>> > trace_cpu_idle(1, smp_processor_id()); >>> > stop_critical_timings(); >>> > + >>> > + /* >>> > + * arch_cpu_idle() is supposed to enable IRQs, however >>> > + * we can't do that because of RCU and tracing. >>> > + * >>> > + * Trace IRQs enable here, then switch off RCU, and have >>> > + * arch_cpu_idle() use raw_local_irq_enable(). Note that >>> > + * rcu_idle_enter() relies on lockdep IRQ state, so >>> > switch that >>> > + * last -- this is very similar to the entry code. >>> > + */ >>> > + trace_hardirqs_on_prepare(); >>> > + lockdep_hardirqs_on_prepare(_THIS_IP_); >>> > rcu_idle_enter(); >>> > + lockdep_hardirqs_on(_THIS_IP_); >>> > + >>> > arch_cpu_idle(); >>> > + >>> > + /* >>> > + * OK, so IRQs are enabled here, but RCU needs them >>> > disabled to >>> > + * turn itself back on.. funny thing is that disabling >>> > IRQs >>> > + * will cause tracing, which needs RCU. Jump through >>> > hoops to >>> > + * make it 'work'. >>> > + */ >>> > + raw_local_irq_disable(); >>> > + lockdep_hardirqs_off(_THIS_IP_); >>> > rcu_idle_exit(); >>> > + lockdep_hardirqs_on(_THIS_IP_); >>> > + raw_local_irq_enable(); >>> > + >>> > start_critical_timings(); >>> > trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id()); >>> > } >>> >>> And look at what the code generation for the idle exit path is when >>> lockdep isn't even on. >> >> Agreed. >> >> The idea was to flip all of arch_cpu_idle() to not enable interrupts. >> >> This is suboptimal for things like x86 where arch_cpu_idle() is >> basically STI;HLT, but x86 isn't likely to actually use this code path >> anyway, given all the various cpuidle drivers it has. >> >> Many of the other archs are now doing things like arm's: >> wfi();raw_local_irq_enable(). >> >> Doing that tree-wide interrupt-state flip was something I didn't want to >> do at this late a stage, the chanse of messing that up is just too high. >> >> After that I need to go look at flipping cpuidle, which is even more >> 'interesting'. cpuidle_enter() has the exact same semantics, and this is >> the code path that x86 actually uses, and here it's inconsitent at best. > > On s390 this introduces the following splat: > [..] I sent you the wrong backtrace. This is the correct one: [ 0.667491] smp: Bringing up secondary CPUs ... [ 0.670262] random: get_random_bytes called from __warn+0x12a/0x160 with crng_init=1 [ 0.670280] ------------[ cut here ]------------ [ 0.670288] WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:1054 rcu_irq_enter+0x7e/0xa8 [ 0.670293] Modules linked in: [ 0.670299] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 5.10.0-rc6 #2263 [ 0.670304] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) [ 0.670309] Krnl PSW : 0404d00180000000 0000000000d8a8da (rcu_irq_enter+0x82/0xa8) [ 0.670318] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 [ 0.670325] Krnl GPRS: 0000000000000000 0000000080000002 0000000000000001 000000000101fcee [ 0.670331] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.670337] 000003e00029ff48 0000000000000000 00000000017212d8 0000000000000001 [ 0.670343] 0000000005ba0100 00000000000324bb 000003e00029fe40 000003e00029fe10 [ 0.670358] Krnl Code: 0000000000d8a8ca: ec180013017e cij %r1,1,8,0000000000d8a8f0 [ 0.670358] 0000000000d8a8d0: ecb80005007e cij %r11,0,8,0000000000d8a8da [ 0.670358] #0000000000d8a8d6: af000000 mc 0,0 [ 0.670358] >0000000000d8a8da: ebbff0a00004 lmg %r11,%r15,160(%r15) [ 0.670358] 0000000000d8a8e0: c0f4ffffff68 brcl 15,0000000000d8a7b0 [ 0.670358] 0000000000d8a8e6: c0e5000038c1 brasl %r14,0000000000d91a68 [ 0.670358] 0000000000d8a8ec: a7f4ffdc brc 15,0000000000d8a8a4 [ 0.670358] 0000000000d8a8f0: c0e5000038bc brasl %r14,0000000000d91a68 [ 0.670392] Call Trace: [ 0.670396] [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8 [ 0.670401] [<0000000000157f9a>] irq_enter+0x22/0x30 [ 0.670404] [<000000000010e51c>] do_IRQ+0x64/0xd0 [ 0.670408] [<0000000000d9a65a>] ext_int_handler+0x18e/0x194 [ 0.670412] [<0000000000d9a6a0>] psw_idle+0x40/0x48 [ 0.670416] ([<0000000000104202>] enabled_wait+0x22/0xf0) [ 0.670419] [<00000000001046e2>] arch_cpu_idle+0x22/0x38 [ 0.670423] [<0000000000d986cc>] default_idle_call+0x74/0xd8 [ 0.670427] [<000000000019a94a>] do_idle+0xf2/0x1b0 [ 0.670431] [<000000000019ac7e>] cpu_startup_entry+0x36/0x40 [ 0.670435] [<0000000000118b9a>] smp_start_secondary+0x82/0x88 [ 0.670438] INFO: lockdep is turned off. [ 0.670441] Last Breaking-Event-Address: [ 0.670444] [<0000000000157f94>] irq_enter+0x1c/0x30 [ 0.670447] irq event stamp: 19 [ 0.670451] hardirqs last enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8 [ 0.670455] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0 [ 0.670459] softirqs last enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0 [ 0.670463] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 0.670466] ---[ end trace cba3783aedff6f79 ]--- [ 0.670685] smp: Brought up 1 node, 2 CPUs [ 0.676300] [ 0.676301] ============================= [ 0.676302] WARNING: suspicious RCU usage [ 0.676303] 5.10.0-rc6 #2263 Not tainted [ 0.676303] ----------------------------- [ 0.676304] include/trace/events/lock.h:13 suspicious rcu_dereference_check() usage! [ 0.676305] [ 0.676306] other info that might help us debug this: [ 0.676306] [ 0.676307] [ 0.676307] rcu_scheduler_active = 1, debug_locks = 1 [ 0.676308] RCU used illegally from extended quiescent state! [ 0.676309] no locks held by swapper/1/0. [ 0.676309] [ 0.676310] stack backtrace: [ 0.676310] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.10.0-rc6 #2263 [ 0.676311] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) [ 0.676312] Call Trace: [ 0.676312] [<0000000000d726f8>] show_stack+0x90/0xf8 [ 0.676313] [<0000000000d855f2>] dump_stack+0xa2/0xd8 [ 0.676314] [<00000000001d216e>] trace_lock_acquire+0x1ce/0x1d8 [ 0.676314] [<00000000001d7c44>] lock_acquire+0x44/0x88 [ 0.676315] [<0000000000d98788>] _raw_spin_lock+0x58/0xa8 [ 0.676316] [<00000000001e4e94>] vprintk_emit+0x74/0x1a8 [ 0.676316] [<00000000001e4ffe>] vprintk_default+0x36/0x48 [ 0.676317] [<0000000000d76526>] printk+0x46/0x58 [ 0.676318] [<00000000009ae188>] report_bug+0x110/0x130 [ 0.676318] [<000000000010215c>] monitor_event_exception+0x44/0xc0 [ 0.676319] [<0000000000d9a10e>] pgm_check_handler+0x1da/0x238 [ 0.676320] [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8 [ 0.676321] ------------[ cut here ]------------ [ 0.676321] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled()) [ 0.676322] WARNING: CPU: 1 PID: 0 at kernel/locking/lockdep.c:5281 check_flags.part.0+0x1cc/0x208 [ 0.676323] Modules linked in: [ 0.676324] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.10.0-rc6 #2263 [ 0.676325] Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0) [ 0.676325] Krnl PSW : 0404c00180000000 00000000001d79d0 (check_flags.part.0+0x1d0/0x208) [ 0.676327] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 0.676329] Krnl GPRS: 00000000ffffffe5 000000000116dbd4 000000000000002f 0000000000fe8c00 [ 0.676330] 0000000000000000 00000000009ad0e8 0000000000000000 0000000000000000 [ 0.676330] 00000000012d7f20 0000000000000000 040003e000000001 000000000257f310 [ 0.676331] 0000000005ba0100 00000000011725b0 00000000001d79cc 000003e00029f638 [ 0.676332] Krnl Code: 00000000001d79c0: c02000708989 larl %r2,0000000000fe8cd2 [ 0.676333] 00000000001d79c6: c0e5005cde09 brasl %r14,0000000000d735d8 [ 0.676334] #00000000001d79cc: af000000 mc 0,0 [ 0.676335] >00000000001d79d0: c0200070d844 larl %r2,0000000000ff2a58 [ 0.676337] 00000000001d79d6: c0e5005cf585 brasl %r14,0000000000d764e0 [ 0.676338] 00000000001d79dc: a7f4ff3f brc 15,00000000001d785a [ 0.676339] 00000000001d79e0: c0e5005dd044 brasl %r14,0000000000d91a68 [ 0.676340] 00000000001d79e6: a7f4ff6f brc 15,00000000001d78c4 [ 0.676341] Call Trace: [ 0.676342] [<00000000001d79d0>] check_flags.part.0+0x1d0/0x208 [ 0.676343] ([<00000000001d79cc>] check_flags.part.0+0x1cc/0x208) [ 0.676344] [<00000000001d7a9e>] lock_acquire.part.0+0x96/0x1f8 [ 0.676344] [<00000000001d7c7c>] lock_acquire+0x7c/0x88 [ 0.676345] [<000000000027bce4>] __is_insn_slot_addr+0x5c/0x150 [ 0.676346] [<000000000017dbf2>] kernel_text_address+0x11a/0x130 [ 0.676346] [<000000000017dc2e>] __kernel_text_address+0x26/0x70 [ 0.676347] [<000000000011843c>] unwind_next_frame+0x104/0x1c0 [ 0.676348] [<0000000000d72738>] show_stack+0xd0/0xf8 [ 0.676348] [<0000000000d855f2>] dump_stack+0xa2/0xd8 [ 0.676349] [<00000000001d216e>] trace_lock_acquire+0x1ce/0x1d8 [ 0.676350] [<00000000001d7c44>] lock_acquire+0x44/0x88 [ 0.676350] [<0000000000d98788>] _raw_spin_lock+0x58/0xa8 [ 0.676351] [<00000000001e4e94>] vprintk_emit+0x74/0x1a8 [ 0.676352] [<00000000001e4ffe>] vprintk_default+0x36/0x48 [ 0.676352] [<0000000000d76526>] printk+0x46/0x58 [ 0.676353] [<00000000009ae188>] report_bug+0x110/0x130 [ 0.676354] [<000000000010215c>] monitor_event_exception+0x44/0xc0 [ 0.676354] [<0000000000d9a10e>] pgm_check_handler+0x1da/0x238 [ 0.676355] [<0000000000d8a8da>] rcu_irq_enter+0x82/0xa8 [ 0.676356] [<0000000000157f9a>] irq_enter+0x22/0x30 [ 0.676356] [<000000000010e51c>] do_IRQ+0x64/0xd0 [ 0.676357] [<0000000000d9a65a>] ext_int_handler+0x18e/0x194 [ 0.676358] [<0000000000d9a6a0>] psw_idle+0x40/0x48 [ 0.676358] ([<0000000000104202>] enabled_wait+0x22/0xf0) [ 0.676359] [<00000000001046e2>] arch_cpu_idle+0x22/0x38 [ 0.676360] [<0000000000d986cc>] default_idle_call+0x74/0xd8 [ 0.676360] [<000000000019a94a>] do_idle+0xf2/0x1b0 [ 0.676361] [<000000000019ac7e>] cpu_startup_entry+0x36/0x40 [ 0.676362] [<0000000000118b9a>] smp_start_secondary+0x82/0x88 [ 0.676362] INFO: lockdep is turned off. [ 0.676363] Last Breaking-Event-Address: [ 0.676364] [<0000000000d9c4d0>] __s390_indirect_jump_r14+0x0/0xc [ 0.676364] irq event stamp: 19 [ 0.676365] hardirqs last enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8 [ 0.676366] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0 [ 0.676367] softirqs last enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0 [ 0.676367] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 0.676368] ---[ end trace cba3783aedff6f78 ]--- [ 0.676369] possible reason: unannotated irqs-off. [ 0.676369] irq event stamp: 19 [ 0.676370] hardirqs last enabled at (19): [<0000000000d98688>] default_idle_call+0x30/0xd8 [ 0.676371] hardirqs last disabled at (18): [<000000000019a9c2>] do_idle+0x16a/0x1b0 [ 0.676372] softirqs last enabled at (0): [<000000000014c3be>] copy_process+0x48e/0x14c0 [ 0.676372] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 0.676373] [<0000000000157f9a>] irq_enter+0x22/0x30 [ 0.676374] [<000000000010e51c>] do_IRQ+0x64/0xd0 [ 0.676374] [<0000000000d9a65a>] ext_int_handler+0x18e/0x194 [ 0.676375] [<0000000000d9a6a0>] psw_idle+0x40/0x48 [ 0.676376] ([<0000000000104202>] enabled_wait+0x22/0xf0) [ 0.676376] [<00000000001046e2>] arch_cpu_idle+0x22/0x38 [ 0.676377] [<0000000000d986cc>] default_idle_call+0x74/0xd8 [ 0.676378] [<000000000019a94a>] do_idle+0xf2/0x1b0 [ 0.676378] [<000000000019ac7e>] cpu_startup_entry+0x36/0x40 [ 0.676379] [<0000000000118b9a>] smp_start_secondary+0x82/0x88 [ 0.676379] INFO: lockdep is turned off.