On Thu, Apr 09, 2015 at 06:57:38AM +0000, Naoya Horiguchi wrote:
> Yes, I did see it at first, so I did two tweaks for the testing:
>
> 1) to fix qemu code. I think that current mce injection code of qemu is buggy,
> because when we try to inject MCE in broadcast mode, all injections other than
> the first one are done with MCG_STATUS_MCIP (see
> cpu_x86_inject_mce()@target-i386/helper.c.)
> It looks to me a bug because this means that every (broadcast mode) MCE
> injection causes triple-fault, which seems not mimicking the real HW behavior.
>
> 2) to insert the delay (for a few seconds) into kdump_nmi_callback() before
> disable_local_APIC(). This is because MCE interrupt is delivered to CPUs in
> different manners in qemu and in bare metal. Bare metals do respond to MCE
> interrupts after disable_local_APIC(), but qemu not.

Lemme take a look at that.

> Unfortunately our testing (~15000 times kdump/reboot cycles) with the debug
> kernel on bare metals didn't reproduce the problem yet, but I believe that
> the above testing on qemu should hit a target.

If only APEI EINJ could be taught to do delayed injection, regardless of OS
kernel running. Tony, is something like that even possible at all?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.