On February 14, 2019 2:14:29 AM PST, Peter Zijlstra <pet...@infradead.org> wrote:
>On Wed, Feb 13, 2019 at 02:49:47PM -0800, Andy Lutomirski wrote:
>
>> Do we need to backport this thing?
>
>Possibly, just to be safe.
>
>> The problem can’t be too widespread or we would have heard of it before.
>
>Yes, so far we've been lucky.
>
>---
>Subject: sched/x86: Save [ER]FLAGS on context switch
>From: Peter Zijlstra <pet...@infradead.org>
>Date: Thu Feb 14 10:30:52 CET 2019
>
>Effectively revert commit:
>
>  2c7577a75837 ("sched/x86_64: Don't save flags on context switch")
>
>Specifically because SMAP uses FLAGS.AC which invalidates the claim
>that the kernel has clean flags.
>
>In particular; while preemption from interrupt return is fine (the
>IRET frame on the exception stack contains FLAGS) it breaks any code
>that does synchonous scheduling, including preempt_enable().
>
>This has become a significant issue ever since commit:
>
>  5b24a7a2aa20 ("Add 'unsafe' user access functions for batched accesses")
>
>provided for means of having 'normal' C code between STAC / CLAC,
>exposing the FLAGS.AC state. So far this hasn't led to trouble,
>however fix it before it comes apart.
>
>Fixes: 5b24a7a2aa20 ("Add 'unsafe' user access functions for batched accesses")
>Acked-by: Andy Lutomirski <l...@amacapital.net>
>Reported-by: Julien Thierry <julien.thie...@arm.com>
>Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
>---
> arch/x86/entry/entry_32.S        | 2 ++
> arch/x86/entry/entry_64.S        | 2 ++
> arch/x86/include/asm/switch_to.h | 1 +
> 3 files changed, 5 insertions(+)
>
>--- a/arch/x86/entry/entry_32.S
>+++ b/arch/x86/entry/entry_32.S
>@@ -650,6 +650,7 @@ ENTRY(__switch_to_asm)
> 	pushl	%ebx
> 	pushl	%edi
> 	pushl	%esi
>+	pushfl
>
> 	/* switch stack */
> 	movl	%esp, TASK_threadsp(%eax)
>@@ -672,6 +673,7 @@ ENTRY(__switch_to_asm)
> #endif
>
> 	/* restore callee-saved registers */
>+	popfl
> 	popl	%esi
> 	popl	%edi
> 	popl	%ebx
>--- a/arch/x86/entry/entry_64.S
>+++ b/arch/x86/entry/entry_64.S
>@@ -291,6 +291,7 @@ ENTRY(__switch_to_asm)
> 	pushq	%r13
> 	pushq	%r14
> 	pushq	%r15
>+	pushfq
>
> 	/* switch stack */
> 	movq	%rsp, TASK_threadsp(%rdi)
>@@ -313,6 +314,7 @@ ENTRY(__switch_to_asm)
> #endif
>
> 	/* restore callee-saved registers */
>+	popfq
> 	popq	%r15
> 	popq	%r14
> 	popq	%r13
>--- a/arch/x86/include/asm/switch_to.h
>+++ b/arch/x86/include/asm/switch_to.h
>@@ -40,6 +40,7 @@ asmlinkage void ret_from_fork(void);
>  * order of the fields must match the code in __switch_to_asm().
>  */
> struct inactive_task_frame {
>+	unsigned long flags;
> #ifdef CONFIG_X86_64
> 	unsigned long r15;
> 	unsigned long r14;
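
For concreteness, here is a minimal sketch of the kind of "'normal' C code
between STAC / CLAC" the commit message refers to, i.e. the batched user
access pattern from 5b24a7a2aa20. This is my own illustration, not code from
the patch, and the exact user_access_begin() signature has varied between
kernel versions:

#include <linux/types.h>
#include <linux/uaccess.h>

/*
 * Illustrative only: store two words to userspace inside one
 * user_access_begin()/user_access_end() region.  The begin/end pair
 * executes STAC/CLAC, so EFLAGS.AC stays set across the plain C code
 * in between.
 */
static int put_two_user_words(u32 __user *uptr, u32 a, u32 b)
{
	if (!user_access_begin(uptr, 2 * sizeof(u32)))	/* STAC */
		return -EFAULT;

	unsafe_put_user(a, &uptr[0], efault);	/* plain stores, no STAC/CLAC */
	unsafe_put_user(b, &uptr[1], efault);

	/*
	 * Any synchronous reschedule in here -- say a preempt_enable()
	 * that drops the last preempt count -- reaches __switch_to_asm()
	 * with AC still set.  Without the pushf/popf added above, the
	 * next task inherits AC (defeating SMAP) and this task may resume
	 * with AC clear (faulting on its remaining user accesses).
	 */

	user_access_end();				/* CLAC */
	return 0;

efault:
	user_access_end();
	return -EFAULT;
}

A #PF taken on one of those stores goes through the exception path, which
saves and restores FLAGS via the IRET frame, so only the synchronous
scheduling case is at issue.
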
This implies we invoke schedule -- a restricted operation (consider might_sleep()) -- during execution of STAC-enabled code, but *not* as an exception or interrupt, since those preserve the flags.

I have serious concerns about this. This is more or less saying that we have left an unbounded window with AC set, and have let AC escape the STAC/CLAC region. Does *anyone* see a need to allow this?

I got a question about this at LPC, and once all the layers had been unwound, what that person was trying to do was so far down the wrong track for a root problem that actually has a very simple solution.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.