The remaining callers of kernel_fpu_begin() in 64-bit kernels don't use 387 instructions, so there's no need to sanitize FCW. Skip it to recover the performance we lost.

Reported-by: Krzysztof Olędzki <o...@ans.pl>
Signed-off-by: Andy Lutomirski <l...@kernel.org>
---
 arch/x86/include/asm/fpu/api.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index e95a06845443..6e826796a734 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -40,7 +40,19 @@ extern void fpregs_mark_activate(void);
 /* Code that is unaware of kernel_fpu_begin_mask() can use this */
 static inline void kernel_fpu_begin(void)
 {
+#ifdef CONFIG_X86_64
+	/*
+	 * Any 64-bit code that uses 387 instructions must explicitly request
+	 * KFPU_387.
+	 */
+	kernel_fpu_begin_mask(KFPU_XYZMM);
+#else
+	/*
+	 * 32-bit kernel code may use 387 operations as well as SSE2, etc,
+	 * as long as it checks that the CPU has the required capability.
+	 */
 	kernel_fpu_begin_mask(KFPU_387 | KFPU_XYZMM);
+#endif
 }
 
 /*
-- 
2.29.2
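
For illustration only, here is a userspace sketch of what the new comment asks of a 64-bit caller that does use 387 instructions: it has to call kernel_fpu_begin_mask() with KFPU_387 itself rather than rely on kernel_fpu_begin(). This is a mock, not kernel code. The bit values of KFPU_387 and KFPU_XYZMM, the last_mask variable, the stub body of kernel_fpu_begin_mask() and the example_387_user_begin() helper are all assumptions made up for the sketch.

```c
/* Userspace mock of the patched helper; none of this is real kernel code. */

/* Model a 64-bit build so the CONFIG_X86_64 branch below is taken. */
#define CONFIG_X86_64 1

/*
 * Assumed bit values, chosen for illustration. The real definitions
 * live in arch/x86/include/asm/fpu/api.h.
 */
#define KFPU_387	(1u << 0)
#define KFPU_XYZMM	(1u << 1)

/* Hypothetical: records which state the mock was asked to sanitize. */
static unsigned int last_mask;

/* Stub standing in for the kernel's kernel_fpu_begin_mask(). */
static void kernel_fpu_begin_mask(unsigned int kfpu_mask)
{
	last_mask = kfpu_mask;
}

/* Mirrors the helper as changed by this patch. */
static void kernel_fpu_begin(void)
{
#ifdef CONFIG_X86_64
	/* 64-bit: FCW is not sanitized unless the caller asks for it. */
	kernel_fpu_begin_mask(KFPU_XYZMM);
#else
	/* 32-bit: callers may use 387 operations, so sanitize both. */
	kernel_fpu_begin_mask(KFPU_387 | KFPU_XYZMM);
#endif
}

/* Hypothetical 64-bit caller that uses 387 instructions and opts in. */
static void example_387_user_begin(void)
{
	kernel_fpu_begin_mask(KFPU_387 | KFPU_XYZMM);
}
```

With CONFIG_X86_64 defined as above, kernel_fpu_begin() leaves last_mask at KFPU_XYZMM only, while example_387_user_begin() sets both bits. Removing the CONFIG_X86_64 define models the 32-bit branch, where kernel_fpu_begin() requests both.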