On Mon, Jan 5, 2015 at 4:44 PM, Luck, Tony <tony.l...@intel.com> wrote:
> We now switch to the kernel stack when a machine check interrupts
> during user mode.  This means that we can perform recovery actions
> in the tail of do_machine_check().
>
> Signed-off-by: Tony Luck <tony.l...@intel.com>
>
> ---
> On top of Andy's x86/paranoid branch
> Andy: Should I really move that:
>         pr_err("Uncorrected hardware memory error ...
> inside the ist_begin_non_atomic() section?
>

I think I like it as is.

[...]

> @@ -1220,6 +1177,26 @@ void do_machine_check(struct pt_regs *regs, long error_code)
>         mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
>  out:
>         sync_core();
> +
> +       if (recover_paddr == ~0ull)
> +               goto done;
> +
> +       pr_err("Uncorrected hardware memory error in user-access at %llx",
> +                recover_paddr);

printk is safe from IRQ context, so this should be okay unless we've
totally screwed up.  And, if we totally screwed up, seeing this before
the BUGs in ist_begin_non_atomic would be nice.

> +       /*
> +        * We must call memory_failure() here even if the current process is
> +        * doomed. We still need to mark the page as poisoned and alert any
> +        * other users of the page.
> +        */
> +       ist_begin_non_atomic(regs);
> +       local_irq_enable();
> +       if (memory_failure(recover_paddr >> PAGE_SHIFT, MCE_VECTOR, flags) < 0) {
> +               pr_err("Memory error not recovered");
> +               force_sig(SIGBUS, current);
> +       }
> +       local_irq_disable();
> +       ist_end_non_atomic();
> +done:
>         ist_exit(regs, prev_state);
>  }
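
For anyone following along, the control flow of the new tail can be modelled in plain userspace C. This is only a rough sketch, not the kernel code: every sketch_* name is hypothetical, the 4 KiB page size is an assumption, the callback merely stands in for memory_failure(), and the IST/IRQ bracketing (ist_begin_non_atomic(), local_irq_enable() and friends) is left out entirely.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_PAGE_SHIFT 12        /* assumption: 4 KiB pages */
#define SKETCH_NO_ADDR    (~0ull)   /* sentinel: nothing to recover */

/* Hypothetical stand-in for memory_failure(): < 0 means "not recovered". */
typedef int (*sketch_failure_fn)(unsigned long long pfn);

enum sketch_result { SKETCH_NOTHING, SKETCH_RECOVERED, SKETCH_SIGBUS };

/* Last page frame number handed to the handler, kept for inspection. */
static unsigned long long sketch_last_pfn;

/* Models the tail of the handler: skip if no address was recorded,
 * otherwise report the error and pass the page frame to the handler. */
static enum sketch_result sketch_tail(unsigned long long recover_paddr,
                                      sketch_failure_fn handler)
{
    assert(handler != NULL);

    if (recover_paddr == SKETCH_NO_ADDR)
        return SKETCH_NOTHING;          /* the "goto done" path */

    fprintf(stderr, "Uncorrected hardware memory error in user-access at %llx\n",
            recover_paddr);

    /* Physical address to page frame number, as in the patch. */
    sketch_last_pfn = recover_paddr >> SKETCH_PAGE_SHIFT;

    if (handler(sketch_last_pfn) < 0) {
        fprintf(stderr, "Memory error not recovered\n");
        return SKETCH_SIGBUS;           /* where force_sig(SIGBUS, ...) sits */
    }
    return SKETCH_RECOVERED;
}

/* Two trivial handlers so both outcomes can be exercised. */
static int sketch_handler_ok(unsigned long long pfn)   { (void)pfn; return 0; }
static int sketch_handler_fail(unsigned long long pfn) { (void)pfn; return -1; }
```

Calling sketch_tail(SKETCH_NO_ADDR, sketch_handler_ok) takes the early-exit path, while a real address with sketch_handler_fail ends on the SIGBUS path. The point is just that the page is always handed to the handler when an address was recorded, regardless of what happens to the current task afterwards.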

For the context-related bits:

Reviewed-by: Andy Lutomirski <l...@amacapital.net>

Should I stick this in my -next branch so it can stew?

--Andy


-- 
Andy Lutomirski
AMA Capital Management, LLC