Le 31/05/2022 à 08:21, Michael Ellerman a écrit : > Christophe Leroy <christophe.le...@csgroup.eu> writes: >> When KASAN is enabled, as shown by the Oops below, the 2k limit is not >> enough to allow stack dump after a stack overflow detection when >> CONFIG_DEBUG_STACKOVERFLOW is selected: >> >> do_IRQ: stack overflow: 1984 >> CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1 >> Call Trace: >> Oops: Kernel stack overflow, sig: 11 [#1] >> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac >> Modules linked in: sr_mod cdrom radeon(+) ohci_pci(+) hwmon >> i2c_algo_bit drm_ttm_helper ttm drm_dp_helper snd_aoa_i2sbus >> snd_aoa_soundbus snd_pcm ehci_pci snd_timer ohci_hcd snd ssb ehci_hcd >> 8250_pci soundcore drm_kms_helper pcmcia 8250 pcmcia_core syscopyarea >> usbcore sysfillrect 8250_base sysimgblt serial_mctrl_gpio fb_sys_fops >> usb_common pkcs8_key_parser fuse drm drm_panel_orientation_quirks configfs >> CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1 >> NIP: c02e5558 LR: c07eb3bc CTR: c07f46a8 >> REGS: e7fe9f50 TRAP: 0000 Not tainted (5.18.0-gentoo-PMacG4) >> MSR: 00001032 <ME,IR,DR,RI> CR: 44a14824 XER: 20000000 >> >> GPR00: c07eb3bc eaa1c000 c26baea0 eaa1c0a0 00000008 00000000 c07eb3bc >> eaa1c010 >> GPR08: eaa1c0a8 04f3f3f3 f1f1f1f1 c07f4c84 44a14824 0080f7e4 00000005 >> 00000010 >> GPR16: 00000025 eaa1c154 eaa1c158 c0dbad64 00000020 fd543810 eaa1c0a0 >> eaa1c29e >> GPR24: c0dbad44 c0db8740 05ffffff fd543802 eaa1c150 c0c9a3c0 eaa1c0a0 >> c0c9a3c0 >> NIP [c02e5558] kasan_check_range+0xc/0x2b4 >> LR [c07eb3bc] format_decode+0x80/0x604 >> Call Trace: >> [eaa1c000] [c07eb3bc] format_decode+0x80/0x604 (unreliable) >> [eaa1c070] [c07f4dac] vsnprintf+0x128/0x938 >> [eaa1c110] [c07f5788] sprintf+0xa0/0xc0 >> [eaa1c180] [c0154c1c] __sprint_symbol.constprop.0+0x170/0x198 >> [eaa1c230] [c07ee71c] symbol_string+0xf8/0x260 >> [eaa1c430] [c07f46d0] pointer+0x15c/0x710 >> [eaa1c4b0] [c07f4fbc] vsnprintf+0x338/0x938 >> [eaa1c550] [c00e8fa0] vprintk_store+0x2a8/0x678 >> [eaa1c690] [c00e94e4] vprintk_emit+0x174/0x378 >> [eaa1c6d0] [c00ea008] _printk+0x9c/0xc0 >> [eaa1c750] [c000ca94] show_stack+0x21c/0x260 >> [eaa1c7a0] [c07d0bd4] dump_stack_lvl+0x60/0x90 >> [eaa1c7c0] [c0009234] __do_IRQ+0x170/0x174 >> [eaa1c800] [c0009258] do_IRQ+0x20/0x34 >> [eaa1c820] [c00045b4] HardwareInterrupt_virt+0x108/0x10c > > Is this actually caused by KASAN? There's no stack frames in there that > are KASAN related AFAICS. > > Seems like the 2K limit is never going to be enough even if KASAN is not > enabled. Presumably we just haven't noticed because we don't trigger the > check unless KASAN is enabled.
I made some test on PPC32. Without KASAN, I can call dump_stack() until the stack has at least 1120 bytes available on stack. With KASAN I can call dump_stack() until the stack has at least 2096 bytes available on stack. > >> ... >> >> Increase the limit to 3k when KASAN is enabled. >> >> While at it remove the 'inline' keywork for check_stack_overflow(). >> This function is called only once so it will be inlined regardless. > > I'd rather that was a separate change, in case it has some unintended > affect. > >> Reported-by: Erhard Furtner <erhar...@mailbox.org> >> Cc: Arnd Bergmann <a...@arndb.de> >> Signed-off-by: Christophe Leroy <christophe.le...@csgroup.eu> >> --- >> arch/powerpc/kernel/irq.c | 16 ++++++++++------ >> 1 file changed, 10 insertions(+), 6 deletions(-) >> >> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c >> index 873e6dffb868..5ff4cf69fc2f 100644 >> --- a/arch/powerpc/kernel/irq.c >> +++ b/arch/powerpc/kernel/irq.c >> @@ -53,6 +53,7 @@ >> #include <linux/vmalloc.h> >> #include <linux/pgtable.h> >> #include <linux/static_call.h> >> +#include <linux/sizes.h> >> >> #include <linux/uaccess.h> >> #include <asm/interrupt.h> >> @@ -184,7 +185,7 @@ u64 arch_irq_stat_cpu(unsigned int cpu) >> return sum; >> } >> >> -static inline void check_stack_overflow(void) >> +static void check_stack_overflow(void) >> { >> long sp; >> >> @@ -193,11 +194,14 @@ static inline void check_stack_overflow(void) >> > > Wouldn't it be cleaner to just do: > > #ifdef CONFIG_KASAN > #define STACK_CHECK_LIMIT (3 * 1024) > #else > #define STACK_CHECK_LIMIT (2 * 1024) > #endif > >> sp = current_stack_pointer & (THREAD_SIZE - 1); >> >> - /* check for stack overflow: is there less than 2KB free? */ >> - if (unlikely(sp < 2048)) { > > + if (unlikely(sp < STACK_CHECK_LIMIT)) { > > And then the code could stay as it is? > > cheers > >> - pr_err("do_IRQ: stack overflow: %ld\n", sp); >> - dump_stack(); >> - } >> + /* check for stack overflow: is there less than 2/3KB free? */ >> + if (!IS_ENABLED(KASAN) && likely(sp >= SZ_2K)) >> + return; >> + if (IS_ENABLED(KASAN) && likely(sp >= SZ_2K + SZ_1K)) >> + return; >> + >> + pr_err("do_IRQ: stack overflow: %ld\n", sp); >> + dump_stack(); >> }