On Wed, Nov 19, 2014 at 11:34:10PM +0000, Luck, Tony wrote:
> The SDM has this to say about EN=0 (in section 15.10.4.1 of volume 3B):
>
>   When the EN flag is zero but the VAL and UC flags are one in
>   the IA32_MCi_STATUS register, the reported uncorrected error
>   in this bank is not enabled. As uncorrected errors with the
>   EN flag = 0 are not the source of machine check exceptions,
>   the MCE handler should log and clear non-enabled errors when
>   the S bit is set and should continue searching for enabled
>   errors from the other IA32_MCi_STATUS registers. Note that
>   when IA32_MCG_CAP [24] is 0, any uncorrected error condition
>   (VAL =1 and UC=1) including the one with the EN flag cleared
>   are fatal and the handler must signal the operating system to
>   reset the system. For the errors that do not generate machine
>   check exceptions, the EN flag has no meaning.
>
> Note the "should log and clear". We just clear ... just need to shuffle some
> code in mce.c to add the logging.
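
For readers following along, the EN=0 rule quoted from the SDM above can be sketched as a small decision function. This is a standalone userspace illustration, not the code in mce.c: the bit positions are the documented ones for IA32_MCi_STATUS and IA32_MCG_CAP, but `enum bank_action` and `classify_bank()` are hypothetical names made up for this sketch, and the "S bit clear" branch is an assumption because the quoted paragraph does not say what to do there.

```c
#include <stdint.h>

/* Documented bit positions in IA32_MCi_STATUS and IA32_MCG_CAP. */
#define MCI_STATUS_VAL (1ULL << 63) /* register holds a valid error */
#define MCI_STATUS_UC  (1ULL << 61) /* error was not corrected */
#define MCI_STATUS_EN  (1ULL << 60) /* error reporting was enabled */
#define MCI_STATUS_S   (1ULL << 56) /* S flag, defined when MCG_SER_P is set */
#define MCG_SER_P      (1ULL << 24) /* IA32_MCG_CAP[24] */

/* Hypothetical result type, used only in this sketch. */
enum bank_action {
    BANK_EMPTY,         /* VAL=0: nothing logged in this bank */
    BANK_NORMAL_PATH,   /* not the UC=1/EN=0 case; usual handling applies */
    BANK_FATAL,         /* UC=1 and IA32_MCG_CAP[24]=0: must reset */
    BANK_LOG_AND_CLEAR, /* UC=1, EN=0, S=1: log, clear, keep scanning */
    BANK_LEAVE          /* UC=1, EN=0, S=0: ASSUMPTION, left untouched */
};

/* Hypothetical helper: classify one bank per the quoted SDM paragraph. */
static enum bank_action classify_bank(uint64_t status, uint64_t mcg_cap)
{
    if (!(status & MCI_STATUS_VAL))
        return BANK_EMPTY;

    /* Only the "uncorrected but not enabled" case is covered here. */
    if (!(status & MCI_STATUS_UC) || (status & MCI_STATUS_EN))
        return BANK_NORMAL_PATH;

    /* Without IA32_MCG_CAP[24], any VAL=1/UC=1 error is fatal. */
    if (!(mcg_cap & MCG_SER_P))
        return BANK_FATAL;

    /* Non-enabled error with S set: log and clear, then continue. */
    if (status & MCI_STATUS_S)
        return BANK_LOG_AND_CLEAR;

    /* Not spelled out in the quoted text; this sketch leaves it alone. */
    return BANK_LEAVE;
}
```

A caller would run this over each bank's status value and keep scanning the remaining banks after a `BANK_LOG_AND_CLEAR` result, which is the step where the logging discussed in this thread would be added.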

Sure, we can log those.

> But we still need something like Rui's patch - calling mcelog()
> doesn't ensure that we see something on the console about possible
> cause of the problem.

So you're saying we should drain the mcelog buffer to the console in
such situations before we panic?

If so, there's drain_mcelog_buffer() which could be changed to call
print_mce() instead of going to the x86_mce_decoder_chain.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--