On Tue, Nov 27, 2018 at 7:32 AM Sean Christopherson <sean.j.christopher...@intel.com> wrote:
>
> On Thu, Nov 22, 2018 at 09:41:19AM +0100, Ingo Molnar wrote:
> >
> > * Andy Lutomirski <l...@kernel.org> wrote:
> >
> > > One of Linus' favorite hobbies seems to be looking at OOPSes and
> > > decoding the error code in his head. This is not one of my favorite
> > > hobbies :)
> > >
> > > Teach the page fault OOPS hander to decode the error code. If it's
> > > a !USER fault from user mode, print an explicit note to that effect
> > > and print out the addresses of various tables that might cause such
> > > an error.
> > >
> > > With this patch applied, if I intentionally point the LDT at 0x0 and
> > > run the x86 selftests, I get:
> > >
> > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> > > HW error: normal kernel read fault
> > > This was a system access from user code
> > > IDT: 0xfffffe0000000000 (limit=0xfff) GDT: 0xfffffe0000001000 (limit=0x7f)
> > > LDTR: 0x50 -- base=0x0 limit=0xfff7
> > > TR: 0x40 -- base=0xfffffe0000003000 limit=0x206f
> > > PGD 800000000456e067 P4D 800000000456e067 PUD 4623067 PMD 0
> > > SMP PTI
> > > CPU: 0 PID: 153 Comm: ldt_gdt_64 Not tainted 4.19.0+ #1317
> > > Hardware name: ...
> > > RIP: 0033:0x401454
> >
> > I've applied your series, with one small edit, the following message:
> >
> > > HW error: normal kernel read fault
> >
> > will IMHO confuse the heck out of users, thinking that their hardware is
> > broken...
> >
> > Yes, the message is accurate, in MM pagefault language it's indeed the HW
> > error code, but it's a language very few people speak.
> >
> > So I edited it over to say '#PF error code'. I also applied a few other
> > minor cleanups - see the changelog below.
>
> I responded to the original thread a hair too late...
>
> What about something like this instead of manually handling the case
> where error_code==0 so that we get e.g. "[KERNEL] [READ]" instead of
> "normal kernel read fault"? Getting "[PROT] [KERNEL] [READ]" seems
> useful.
>
> IMO "[normal kernel read fault]" followed by "This was a system access
> from user code" is still confusing.
>
> ---
> 8b29ee4351d5c625aa9ca2765f8da5e Mon Sep 17 00:00:00 2001
> From: Sean Christopherson <sean.j.christopher...@intel.com>
> Date: Tue, 27 Nov 2018 07:09:57 -0800
> Subject: [PATCH] x86/fault: Print "KERNEL" and "READ" for #PF error codes
>
> ...and explicitly state that it's a "code" that's being printed.
>
> Cc: Andy Lutomirski <l...@kernel.org>
> Cc: Borislav Petkov <b...@alien8.de>
> Cc: Dave Hansen <dave.han...@linux.intel.com>
> Cc: H. Peter Anvin <h...@zytor.com>
> Cc: Linus Torvalds <torva...@linux-foundation.org>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Rik van Riel <r...@surriel.com>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Yu-cheng Yu <yu-cheng...@intel.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: Ingo Molnar <mi...@kernel.org>
> Signed-off-by: Sean Christopherson <sean.j.christopher...@intel.com>
> ---
>  arch/x86/mm/fault.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 2ff25ad33233..510e263c256b 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -660,8 +660,10 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
>         err_str_append(error_code, err_txt, X86_PF_RSVD, "[RSVD]" );
>         err_str_append(error_code, err_txt, X86_PF_INSTR, "[INSTR]");
>         err_str_append(error_code, err_txt, X86_PF_PK, "[PK]" );
> -
> -       pr_alert("#PF error: %s\n", error_code ? err_txt : "[normal kernel read fault]");
> +       err_str_append(~error_code, err_txt, X86_PF_USER, "[KERNEL]");
> +       err_str_append(~error_code, err_txt, X86_PF_WRITE | X86_PF_INSTR,
> +                      "[READ]");
> +       pr_alert("#PF error code: %s\n", err_txt);
>
Seems generally nice, but I would suggest making the bit-not-set name be another parameter to err_str_append(). I'm also slightly uneasy about making "KERNEL" look like a bit, but I guess it doesn't bother me too much. Want to send a real patch?
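
To make the suggestion concrete, here is a rough userspace sketch of what passing the bit-not-set name as another parameter could look like. It is not the patch that was sent or applied. The extra `len`, `set_txt` and `clear_txt` parameters and the `pf_decode()` wrapper are hypothetical names made up for illustration, and snprintf() stands in for whatever string helper the kernel code would really use. One deliberate difference from the diff quoted above: there, `~error_code` is tested against `X86_PF_WRITE | X86_PF_INSTR`, which matches whenever either bit is clear, whereas this sketch prints "[READ]" only when both bits are clear.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Architectural bit positions of the x86 #PF error code. */
#define X86_PF_PROT   (1UL << 0)
#define X86_PF_WRITE  (1UL << 1)
#define X86_PF_USER   (1UL << 2)
#define X86_PF_RSVD   (1UL << 3)
#define X86_PF_INSTR  (1UL << 4)
#define X86_PF_PK     (1UL << 5)

#define ERR_TXT_LEN 64  /* hypothetical buffer size, large enough for all names */

/*
 * Hypothetical variant of err_str_append(): append set_txt if any bit in
 * mask is set in error_code, otherwise append clear_txt. Either name may
 * be NULL, meaning "print nothing for that state".
 */
static void err_str_append(unsigned long error_code, char *buf, size_t len,
                           unsigned long mask,
                           const char *set_txt, const char *clear_txt)
{
    const char *txt = (error_code & mask) ? set_txt : clear_txt;
    size_t used;

    assert(buf != NULL && len > 0);
    used = strlen(buf);
    assert(used < len);

    if (txt == NULL)
        return;

    /* Separate entries with one space; snprintf() never overruns buf. */
    snprintf(buf + used, len - used, "%s%s", used ? " " : "", txt);
}

/* Hypothetical wrapper: build the whole decoded string for one error code. */
static void pf_decode(unsigned long error_code, char *buf, size_t len)
{
    buf[0] = '\0';
    err_str_append(error_code, buf, len, X86_PF_PROT,  "[PROT]",  NULL);
    err_str_append(error_code, buf, len, X86_PF_USER,  "[USER]",  "[KERNEL]");
    err_str_append(error_code, buf, len, X86_PF_WRITE, "[WRITE]", NULL);
    /* "[READ]" only when the access was neither a write nor an instruction fetch. */
    err_str_append(error_code, buf, len, X86_PF_WRITE | X86_PF_INSTR,
                   NULL, "[READ]");
    err_str_append(error_code, buf, len, X86_PF_RSVD,  "[RSVD]",  NULL);
    err_str_append(error_code, buf, len, X86_PF_INSTR, "[INSTR]", NULL);
    err_str_append(error_code, buf, len, X86_PF_PK,    "[PK]",    NULL);
}
```

With those assumptions, an error code of 0 decodes to "[KERNEL] [READ]" and X86_PF_PROT alone decodes to "[PROT] [KERNEL] [READ]", which are the two strings mentioned in the quoted mail. Since "KERNEL" and "READ" are supplied as the clear-state names of real bits, no inverted error code has to be passed in.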