fpu__drop() has an explicit fwait which, under some conditions, can
trigger a fixable FPU exception while in kernel mode. Thus, we should
attempt to fix up the exception first, and call notify_die() only if the
fixup fails, just as do_general_protection() does. The original call
sequence incorrectly triggers KDB entry on debug kernels under particular
FPU-intensive workloads. This issue was privately observed, fixed, and
tested on 4.9.98; this patch brings the fix upstream.

Signed-off-by: Siarhei Liakh <siarhei.li...@concurrent-rt.com>
---

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index a535dd6..68d77a3 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -835,16 +835,18 @@ static void math_error(struct pt_regs *regs, int error_code, int trapnr)
        char *str = (trapnr == X86_TRAP_MF) ? "fpu exception" :
                                                "simd exception";
 
-       if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, SIGFPE) == NOTIFY_STOP)
-               return;
        cond_local_irq_enable(regs);
 
        if (!user_mode(regs)) {
-               if (!fixup_exception(regs, trapnr)) {
-                       task->thread.error_code = error_code;
-                       task->thread.trap_nr = trapnr;
+               if (fixup_exception(regs, trapnr))
+                       return;
+
+               task->thread.error_code = error_code;
+               task->thread.trap_nr = trapnr;
+
+               if (notify_die(DIE_TRAP, str, regs, error_code,
+                                       trapnr, SIGFPE) != NOTIFY_STOP)
                        die(str, regs, error_code);
-               }
                return;
        }
 
