Commit-ID:  fead35c68926682c90c995f22b48f1c8d78865c1
Gitweb:     http://git.kernel.org/tip/fead35c68926682c90c995f22b48f1c8d78865c1
Author:     Yazen Ghannam <yazen.ghan...@amd.com>
AuthorDate: Sat, 30 Apr 2016 14:33:57 +0200
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Tue, 3 May 2016 08:24:17 +0200

x86/mce: Detect local MCEs properly

Check the MCG_STATUS_LMCES bit on Intel to verify that the current MCE
is local. On AMD, MCEs are always local.
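
For illustration only (this is not part of the patch), the resulting
decision can be sketched as standalone C. The sketch_* names, the vendor
enum and the self-check are hypothetical stand-ins; the kernel code uses
m.cpuvendor, X86_VENDOR_INTEL and MCG_STATUS_LMCES, and mce_start() only
runs when the MCE is not local.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's MCG_STATUS_LMCES (bit 3 of MCG_STATUS). */
#define SKETCH_MCG_STATUS_LMCES (1ULL << 3)

/* Hypothetical vendor tags; the kernel uses X86_VENDOR_* instead. */
enum sketch_vendor { SKETCH_VENDOR_INTEL, SKETCH_VENDOR_AMD };

/* Default to "local"; only Intel consults the LMCES bit. */
bool sketch_mce_is_local(enum sketch_vendor vendor, uint64_t mcgstatus)
{
	bool lmce = true;

	if (vendor == SKETCH_VENDOR_INTEL)
		lmce = (mcgstatus & SKETCH_MCG_STATUS_LMCES) != 0;

	return lmce;
}

/* The rendezvous with the other CPUs is needed only for non-local MCEs. */
bool sketch_needs_rendezvous(enum sketch_vendor vendor, uint64_t mcgstatus)
{
	return !sketch_mce_is_local(vendor, mcgstatus);
}

/* Small self-check of the three cases described in the commit message. */
void sketch_selfcheck(void)
{
	assert(sketch_mce_is_local(SKETCH_VENDOR_AMD, 0));
	assert(!sketch_mce_is_local(SKETCH_VENDOR_INTEL, 0));
	assert(sketch_mce_is_local(SKETCH_VENDOR_INTEL, SKETCH_MCG_STATUS_LMCES));
}
```

The sketch normalizes the masked bit to a bool; the patch itself stores
the masked value in an int and only tests it for zero.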

Signed-off-by: Yazen Ghannam <yazen.ghan...@amd.com>
[ Massaged it a bit. Reflowed comments. Shut up -Wmaybe-uninitialized. ]
Signed-off-by: Borislav Petkov <b...@suse.de>
Cc: Andy Lutomirski <l...@amacapital.net>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Brian Gerst <brge...@gmail.com>
Cc: Denys Vlasenko <dvlas...@redhat.com>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Linus Torvalds <torva...@linux-foundation.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Tony Luck <tony.l...@intel.com>
Cc: linux-edac <linux-e...@vger.kernel.org>
Link: http://lkml.kernel.org/r/1462019637-16474-8-git-send-email...@alien8.de
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 33 ++++++++++++++++++++-------------
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index c356f47..aeda446 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1038,11 +1038,12 @@ void do_machine_check(struct pt_regs *regs, long error_code)
        int i;
        int worst = 0;
        int severity;
+
        /*
         * Establish sequential order between the CPUs entering the machine
         * check handler.
         */
-       int order;
+       int order = -1;
        /*
         * If no_way_out gets set, there is no safe way to recover from this
         * MCE.  If mca_cfg.tolerant is cranked up, we'll try anyway.
@@ -1056,7 +1057,12 @@ void do_machine_check(struct pt_regs *regs, long error_code)
        DECLARE_BITMAP(toclear, MAX_NR_BANKS);
        DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
        char *msg = "Unknown";
-       int lmce = 0;
+
+       /*
+        * MCEs are always local on AMD. Same is determined by MCG_STATUS_LMCES
+        * on Intel.
+        */
+       int lmce = 1;
 
        /* If this CPU is offline, just bail out. */
        if (cpu_is_offline(smp_processor_id())) {
@@ -1095,19 +1101,20 @@ void do_machine_check(struct pt_regs *regs, long error_code)
                kill_it = 1;
 
        /*
-        * Check if this MCE is signaled to only this logical processor
+        * Check if this MCE is signaled to only this logical processor,
+        * on Intel only.
         */
-       if (m.mcgstatus & MCG_STATUS_LMCES)
-               lmce = 1;
-       else {
-               /*
-                * Go through all the banks in exclusion of the other CPUs.
-                * This way we don't report duplicated events on shared banks
-                * because the first one to see it will clear it.
-                * If this is a Local MCE, then no need to perform rendezvous.
-                */
+       if (m.cpuvendor == X86_VENDOR_INTEL)
+               lmce = m.mcgstatus & MCG_STATUS_LMCES;
+
+       /*
+        * Go through all banks in exclusion of the other CPUs. This way we
+        * don't report duplicated events on shared banks because the first one
+        * to see it will clear it. If this is a Local MCE, then no need to
+        * perform rendezvous.
+        */
+       if (!lmce)
                order = mce_start(&no_way_out);
-       }
 
        for (i = 0; i < cfg->banks; i++) {
                __clear_bit(i, toclear);
