On Wed, Nov 11, 2015 at 02:01:51PM -0800, Tony Luck wrote:
> We used to have a special ring buffer for deferred errors that
> was used to mark problem pages. We replaced that with a genpool.
> Then later converted mce_log() to also use the same genpool. As
> a result we end up adding all deferred errors to the genpool twice.
>
> Rearrange this code. Make sure to set the m.severity and m.usable_addr
> fields for deferred errors. Then if flags and mca_cfg.dont_log_ce mean
> we call mce_log() we are done, because that will add this entry to the
> genpool.
>
> If we skipped mce_log(), then we still want to take action for the
> deferred error, so add to the genpool.
>
> Changed the name of the boolean "error_logged" to "error_seen", we
> should set it whether or not we logged an error because the return
> value from machine_check_poll() is used to decide whether storms
> have subsided or not.
>
> Reported-by: Chen, Gong <gong.chen.linux.intel.com>
> Signed-off-by: Tony Luck <tony.l...@intel.com>
> ---
>  arch/x86/kernel/cpu/mcheck/mce.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)

Applied, thanks.

Btw, looking at that mce.usable_addr, it doesn't make a whole lotta
sense to me and we can use mce_usable_address() directly instead and use
the byte in struct mce for something more important.

So how about I kill it (diff ontop of yours):

---
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index 03429da2fa80..2184943341bf 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -16,7 +16,7 @@ struct mce {
 	__u8  cpuvendor;	/* cpu vendor as encoded in system.h */
 	__u8  inject_flags;	/* software inject flags */
 	__u8  severity;
-	__u8  usable_addr;
+	__u8  pad;
 	__u32 cpuid;	/* CPUID 1 EAX */
 	__u8  cs;		/* code segment */
 	__u8  bank;	/* machine check bank */
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 6531cb46803c..fb8b1db7b150 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -484,7 +484,7 @@ static int srao_decode_notifier(struct notifier_block *nb, unsigned long val,
 	if (!mce)
 		return NOTIFY_DONE;
 
-	if (mce->usable_addr && (mce->severity == MCE_AO_SEVERITY)) {
+	if (mce_usable_address(mce) && (mce->severity == MCE_AO_SEVERITY)) {
 		pfn = mce->addr >> PAGE_SHIFT;
 		memory_failure(pfn, MCE_VECTOR, 0);
 	}
@@ -610,12 +610,9 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 
 		severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
 
-		if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m)) {
-			if (m.status & MCI_STATUS_ADDRV) {
+		if (severity == MCE_DEFERRED_SEVERITY && memory_error(&m))
+			if (m.status & MCI_STATUS_ADDRV)
 				m.severity = severity;
-				m.usable_addr = mce_usable_address(&m);
-			}
-		}
 
 		/*
 		 * Don't get the IP here because it's unlikely to
@@ -623,7 +620,7 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 		 */
 		if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
 			mce_log(&m);
-		else if (m.usable_addr) {
+		else if (mce_usable_address(&m)) {
 			/*
 			 * Although we skipped logging this, we still want
 			 * to take action. Add to the pool so the registered
@@ -1091,7 +1088,6 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 
 		/* assuming valid severity level != 0 */
 		m.severity = severity;
-		m.usable_addr = mce_usable_address(&m);
 
 		mce_log(&m);
 
-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
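
For readers following along outside the kernel tree, here is a minimal userspace sketch of why the cached byte is redundant: whether the address is usable can be derived on demand from the status and misc values already stored in the record. Everything below is an assumption made for illustration. struct mce_sketch and usable_address() are hypothetical names, the bit definitions are believed to mirror the kernel's asm/mce.h but should be checked against the tree, and PAGE_SHIFT is fixed at 12. The real mce_usable_address() may apply different or additional checks.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed to mirror the kernel's definitions; verify against
 * arch/x86/include/asm/mce.h before relying on them.
 */
#define MCI_STATUS_MISCV	(1ULL << 59)	/* MISC register contents valid */
#define MCI_STATUS_ADDRV	(1ULL << 58)	/* ADDR register contents valid */
#define MCI_MISC_ADDR_LSB(m)	((m) & 0x3f)	/* least significant valid address bit */
#define MCI_MISC_ADDR_MODE(m)	(((m) >> 6) & 7)
#define MCI_MISC_ADDR_PHYS	2		/* physical address mode */
#define PAGE_SHIFT		12		/* assumption: 4K pages */

/* Hypothetical cut-down record; NOT the real struct mce. */
struct mce_sketch {
	uint64_t status;
	uint64_t misc;
	uint64_t addr;
};

/*
 * Derive "is the address usable?" from fields the record already
 * carries, so no separate cached flag has to be kept in sync.
 */
static bool usable_address(const struct mce_sketch *m)
{
	/* Both the MISC and ADDR registers must be flagged valid. */
	if (!(m->status & MCI_STATUS_MISCV) || !(m->status & MCI_STATUS_ADDRV))
		return false;

	/* The address must be precise to at least page granularity. */
	if (MCI_MISC_ADDR_LSB(m->misc) > PAGE_SHIFT)
		return false;

	/* Only physical addresses can be turned into a pfn. */
	if (MCI_MISC_ADDR_MODE(m->misc) != MCI_MISC_ADDR_PHYS)
		return false;

	return true;
}
```

A caller would then test usable_address(&m) at the point of use, as the hunks above do with the real helper, and compute the pfn as m.addr >> PAGE_SHIFT only when it returns true. The trade-off is a few extra bit tests per check in exchange for one freed byte in the record and no risk of the flag going stale.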