+static void machine_check_under_kdump(struct pt_regs *regs, long error_code)
+{
+       if (mca_cfg.kdump_cpu == smp_processor_id())
+               pr_emerg("MCE triggered when kdumping. If you are lucky enough, you will have a kdump. Otherwise, this is a dying message.\n");

I'm worried about the SRAR case here.  Your code just returns, which will 
trigger the same machine check again. The system will spin forever printing 
this message.

I think you have to look at MCG_STATUS and scan the machine check banks to make 
a choice.  There are some simple cases:

  MCG_STATUS.RIPV=0 -> cannot return (where will the cpu go - you have no idea!)
  SRAO -> safe to just return
  SRAR -> should not return

But the rest may require some thought.  If there is a PCC=1 error, then you may 
end up with a corrupt dump. Perhaps this case will already be covered by 
RIPV==0?
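
To make the simple cases above concrete, here is a rough sketch of the kind of
check I mean. It is plain C that compiles in userspace, not a tested patch. The
bit positions follow the Intel SDM; every name prefixed with ex_/EX_ is
hypothetical and only exists for this example. Treating PCC=1 as "do not
return" and letting UCNA fall through to "return" are assumptions on my part,
since those are exactly the cases that still need some thought.

```c
#include <stdint.h>

/* Bit positions per the Intel SDM; EX_ names are local to this example. */
#define EX_MCG_STATUS_RIPV (1ULL << 0)   /* restart IP valid */
#define EX_MCI_STATUS_VAL  (1ULL << 63)  /* bank holds a valid error */
#define EX_MCI_STATUS_UC   (1ULL << 61)  /* uncorrected error */
#define EX_MCI_STATUS_PCC  (1ULL << 57)  /* processor context corrupt */
#define EX_MCI_STATUS_S    (1ULL << 56)  /* signaled via machine check */
#define EX_MCI_STATUS_AR   (1ULL << 55)  /* action required */

enum ex_kdump_mce_action {
	EX_MCE_RETURN,     /* safe to return and keep dumping */
	EX_MCE_NO_RETURN,  /* returning would re-trigger or go nowhere */
};

/*
 * Hypothetical helper: decide whether the handler may return, given the
 * MCG_STATUS value and the MCi_STATUS values read from each bank.
 */
static enum ex_kdump_mce_action
ex_kdump_mce_decide(uint64_t mcg_status, const uint64_t *bank_status,
		    int nbanks)
{
	int i;

	/* RIPV=0: there is no valid address to return to. */
	if (!(mcg_status & EX_MCG_STATUS_RIPV))
		return EX_MCE_NO_RETURN;

	for (i = 0; i < nbanks; i++) {
		uint64_t s = bank_status[i];

		/* Skip empty banks and corrected errors. */
		if (!(s & EX_MCI_STATUS_VAL) || !(s & EX_MCI_STATUS_UC))
			continue;

		/* Assumption: corrupt context means we should not return. */
		if (s & EX_MCI_STATUS_PCC)
			return EX_MCE_NO_RETURN;

		/* SRAR (S=1, AR=1): returning re-executes the faulting access. */
		if ((s & EX_MCI_STATUS_S) && (s & EX_MCI_STATUS_AR))
			return EX_MCE_NO_RETURN;

		/* SRAO (S=1, AR=0) and, by assumption, UCNA: keep scanning. */
	}

	return EX_MCE_RETURN;
}
```

What to do in the "no return" case (panic, halt this CPU, or something else) is
a separate question that the sketch deliberately leaves open.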

-Tony