On Wed, Aug 05, 2020 at 07:31PM +0200, Marco Elver wrote: ... > Oh well, it seems that KCSAN on syzbot still crashes even with this > "fix". It's harder to reproduce though, and I don't have a clear > reproducer other than "fuzz the kernel" right now. I think the new IRQ > state tracking code is still not compatible with KCSAN, even though we > thought it would be. Most likely there are still ways to get recursion > lockdep->KCSAN. An alternative would be to deal with the recursion > like we did before, instead of trying to squash all of it. I'll try to > investigate -- Peter, if you have ideas, help is appreciated.
Testing my hypothesis that raw then nested non-raw local_irq_save/restore() breaks IRQ state tracking -- see the reproducer below. This is at least 1 case I can think of that we're bound to hit. Thanks, -- Marco ------ >8 ------ diff --git a/init/main.c b/init/main.c index 15bd0efff3df..0873319dcff4 100644 --- a/init/main.c +++ b/init/main.c @@ -1041,6 +1041,22 @@ asmlinkage __visible void __init start_kernel(void) sfi_init_late(); kcsan_init(); + /* DEBUG CODE */ + lockdep_assert_irqs_enabled(); /* Pass. */ + { + unsigned long flags1; + raw_local_irq_save(flags1); + { + unsigned long flags2; + lockdep_assert_irqs_enabled(); /* Pass - expectedly blind. */ + local_irq_save(flags2); + lockdep_assert_irqs_disabled(); /* Pass. */ + local_irq_restore(flags2); + } + raw_local_irq_restore(flags1); + } + lockdep_assert_irqs_enabled(); /* FAIL! */ + /* Do the rest non-__init'ed, we're now alive */ arch_call_rest_init();