On Tue, Nov 27, 2012 at 03:20:11PM -0800, Prasad Koya wrote:
> I'm definitely seeing above lockup with 2.6.38.8. In 3.2 and up kernel
> nmi_shootdown_cpus() replaced register_die_notifier() with
> register_nmi_handler() which doesn't call vmalloc_sync_all. If I patch
> my 2.6.38.8 so it behaves as 3.2 in this regard ie., skip
> vmalloc_sync_all, I don't see any issue.

I don't know what vmalloc_sync_all really does (or sync_global_pgds for
that matter), but it seems to be trying to sync up all the processors by
flushing TLBs. If that is the case, then it is going to cause issues
with NMI, or at least with your use of lkdtm.

The reason is that in both cases IPIs are blocked (NMI has higher
priority, and lkdtm disabled interrupts). This usually causes hangs with
any mm flushing (TLBs, pgds, etc). I think that is why Andrea noted in
one of his changelogs that it is forbidden to grab the page_lock with
irqs disabled (namely, IPIs blocked).

>
> So my question is, is it safe to bypass calling vmalloc_sync_all() as
> part of setting up NMI handler? Maybe with a patch like below:
>
> --- linux-2.6.38.orig/kernel/notifier.c
> +++ linux-2.6.38/kernel/notifier.c
> @@ -574,7 +574,8 @@ int notrace __kprobes notify_die(enum di
>
>  int register_die_notifier(struct notifier_block *nb)
>  {
> -	vmalloc_sync_all();
> +	if (!oops_in_progress)
> +		vmalloc_sync_all();
>  	return atomic_notifier_chain_register(&die_chain, nb);
>  }
>  EXPORT_SYMBOL_GPL(register_die_notifier);

You are dying anyway, so this patch is probably ok and will get you by
for now. I am not sure how important syncing the pgds is at this point.

I wouldn't recommend this for 3.7 or anything, but if you want to use
it as a private patch I can't see anything wrong with it. I think the
only caller that calls register_die_notifier within an oops is
crash_kexec anyway.

Cheers,
Don
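
P.S. For anyone who wants to poke at the control flow of the patch above
outside the kernel, here is a minimal userspace sketch. It is only a
model, not kernel code: vmalloc_sync_all() is stubbed out as a counter,
and register_die_notifier_model(), atomic_notifier_chain_register_stub()
and run_demo() are hypothetical names made up for this illustration.
Only the if (!oops_in_progress) guard mirrors the patch.

```c
#include <assert.h>

/* Userspace stand-ins only; nothing here is real kernel code. */
struct notifier_block { int priority; };

static int oops_in_progress;  /* models the kernel global of the same name */
static int sync_calls;        /* how often the stubbed sync was invoked */
static int registered;        /* how many notifiers were "registered" */

/* Stub: the real function walks page tables; here it only counts calls. */
static void vmalloc_sync_all(void)
{
	sync_calls++;
}

/* Hypothetical stand-in for atomic_notifier_chain_register(). */
static int atomic_notifier_chain_register_stub(struct notifier_block *nb)
{
	(void)nb;
	registered++;
	return 0;
}

/* Same control flow as the patched register_die_notifier(). */
static int register_die_notifier_model(struct notifier_block *nb)
{
	if (!oops_in_progress)
		vmalloc_sync_all();
	return atomic_notifier_chain_register_stub(nb);
}

/* Registers once in normal context and once during an "oops";
 * returns the number of sync calls that were made. */
static int run_demo(void)
{
	struct notifier_block nb = { 0 };
	int rc;

	sync_calls = 0;
	registered = 0;

	oops_in_progress = 0;
	rc = register_die_notifier_model(&nb);  /* normal path: sync runs */
	assert(rc == 0);

	oops_in_progress = 1;
	rc = register_die_notifier_model(&nb);  /* oops path: sync skipped */
	assert(rc == 0);

	return sync_calls;
}
```

Calling run_demo() from a main() registers twice but syncs only once,
which is the whole point of the guard: registration still happens on the
crash path, and only the sync step is skipped.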