> But sending signals from #MC context is definitely a bad idea. I think
> we had addressed this with irq_work at some point but my memory is very
> hazy.

We added code for recoverable errors to get out of the MC context
before trying to look up the page and send the signal.  Bottom of
do_machine_check():

	if (cfg->tolerant < 3) {
		if (no_way_out)
			mce_panic("Fatal machine check on current CPU", &m, msg);
		if (worst == MCE_AR_SEVERITY) {
			/* schedule action before return to userland */
			mce_save_info(m.addr, m.mcgstatus & MCG_STATUS_RIPV);
			set_thread_flag(TIF_MCE_NOTIFY);
		} else if (kill_it) {
			force_sig(SIGBUS, current);
		}
	}

That TIF_MCE_NOTIFY prevents the return to user mode, and we end up in
mce_notify_process().

The "force_sig()" there is legacy code - and perhaps should just move off
to mce_notify_process() too (need to save "worst" so it will know what to
do).

-Tony
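
For anyone following along, the deferral pattern described above (record
the minimum while in the restricted context, set a per-task flag, and do
the real work on the way back to user mode) can be modeled in plain
userspace C. This is only a rough sketch under stated assumptions:
struct task, handler_tail() and return_to_user() are hypothetical names
made up for illustration, not kernel APIs, and "sending the signal" is
reduced to bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical severity grades; not the kernel's real enum. */
enum severity { SEV_NONE, SEV_KILL, SEV_ACTION_REQUIRED };

/* Hypothetical per-task state standing in for thread flags and saved info. */
struct task {
	bool notify_pending;  /* models TIF_MCE_NOTIFY */
	uint64_t saved_addr;  /* models the address stashed by mce_save_info() */
	bool saved_ripv;      /* models the saved RIPV bit */
	int signals_sent;     /* stand-in for actually delivering SIGBUS */
};

/*
 * Models the tail of the handler. It runs in the restricted context,
 * so it only records state and sets the flag; it sends nothing.
 * Other severities are outside the scope of this sketch.
 */
static void handler_tail(struct task *t, enum severity worst,
			 uint64_t addr, bool ripv)
{
	assert(t != NULL);
	if (worst == SEV_ACTION_REQUIRED) {
		t->saved_addr = addr;
		t->saved_ripv = ripv;
		t->notify_pending = true;
	}
}

/*
 * Models the return-to-user path. Here it is safe to do the heavier
 * work, so the pending flag is consumed and the "signal" is sent.
 * Returns true if deferred work was processed.
 */
static bool return_to_user(struct task *t)
{
	assert(t != NULL);
	if (!t->notify_pending)
		return false;
	t->notify_pending = false;
	t->signals_sent++;
	return true;
}
```

Calling handler_tail() with SEV_ACTION_REQUIRED leaves signals_sent
untouched; the counter only moves once return_to_user() runs, which is
the whole point of deferring the work.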