On Sun, Sep 6, 2020 at 2:21 PM Borislav Petkov <b...@alien8.de> wrote:
>
> Hi,
>
> Ingo and I talked about this thing this morning and tglx has had it on
> his to-fix list too so here's a first attempt at it.
>
> Below is just a brain dump of what we talked about so let's start with
> it and see where it would take us.
>
> Thx.
>
> ---
>
> From: Borislav Petkov <b...@suse.de>
>
> ... without any exception handling and tracing.
>
> If an exception needs to be handled while reading an MSR - which is in
> most of the cases caused by a #GP on a non-existent MSR - then this
> is most likely the incarnation of a BIOS or a hardware bug. Such bug
> violates the architectural guarantee that MSR banks are present with all
> MSRs belonging to them.
>
> The proper fix belongs in the hardware/firmware - not in the kernel.
>
> Handling exceptions while in #MC and while an NMI is being handled would
> cause the nasty NMI nesting issue because of the shortcoming of IRET
> of reenabling NMIs when executed. And the machine is in an #MC context
> already so <Deity> be at its side.
>
> Tracing MSR accesses while in #MC is another no-no due to tracing being
> inherently a bad idea in atomic context:
>
>   vmlinux.o: warning: objtool: do_machine_check()+0x4a: call to mce_rdmsrl() leaves .noinstr.text section
>
> so remove all that "additional" functionality from mce_rdmsrl() and
> concentrate on solely reading the MSRs.
>
> Signed-off-by: Borislav Petkov <b...@suse.de>
> Cc: Ingo Molnar <mi...@kernel.org>
> ---
>  arch/x86/kernel/cpu/mce/core.c | 18 +++++++-----------
>  1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 0ba24dfffdb2..14ebdf3e22f3 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -376,7 +376,7 @@ static int msr_to_offset(u32 msr)
>  /* MSR access wrappers used for error injection */
>  static u64 mce_rdmsrl(u32 msr)
>  {
> -        u64 v;
> +        DECLARE_ARGS(val, low, high);
>
>          if (__this_cpu_read(injectm.finished)) {
>                  int offset = msr_to_offset(msr);
> @@ -386,17 +386,13 @@ static u64 mce_rdmsrl(u32 msr)
>                  return *(u64 *)((char *)this_cpu_ptr(&injectm) + offset);
>          }
>
> -        if (rdmsrl_safe(msr, &v)) {
> -                WARN_ONCE(1, "mce: Unable to read MSR 0x%x!\n", msr);
> -                /*
> -                 * Return zero in case the access faulted. This should
> -                 * not happen normally but can happen if the CPU does
> -                 * something weird, or if the code is buggy.
> -                 */
> -                v = 0;
> -        }
> +        /*
> +         * RDMSR on MCA MSRs should not fault. If they do, this is very much an
> +         * architectural violation and needs to be reported to hw vendor.
> +         */
> +        asm volatile("rdmsr" : EAX_EDX_RET(val, low, high) : "c" (msr));
I don't like this. Plain rdmsrl() will at least print a useful error if it fails. Perhaps we should add a read_msr_panic() variant that panics on failure? Or, if this is the only such case, we could use rdmsrl_safe(), print a descriptive error, and panic on failure. Roughly something like the sketch below.
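To be clear, this is only a strawman sketch of what I mean: read_msr_panic() is a made-up name, not an existing helper, the error string is invented, and it still goes through the exception handling behind rdmsrl_safe(), which is exactly what your patch is trying to avoid in the #MC path.

#include <linux/kernel.h>
#include <asm/msr.h>

/* Read an MSR and panic if the access faults instead of returning garbage. */
static u64 read_msr_panic(u32 msr)
{
        u64 v;

        /* rdmsrl_safe() returns non-zero if the RDMSR raised an exception. */
        if (rdmsrl_safe(msr, &v))
                panic("Unable to read MSR 0x%x\n", msr);

        return v;
}

At least that way the offending MSR number ends up in the panic message instead of the machine silently continuing with a zeroed or bogus value.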