Michael Ellerman's on February 8, 2019 11:04 am:
> Nicholas Piggin <npig...@gmail.com> writes:
>> Russell Currey's on February 6, 2019 4:28 pm:
>>> Without restoring the IAMR after idle, execution prevention on POWER9
>>> with Radix MMU is overwritten and the kernel can freely execute userspace
>>> without faulting.
>>>
>>> This is necessary when returning from any stop state that modifies user
>>> state, as well as hypervisor state.
>>>
>>> To test how this fails without this patch, load the lkdtm driver and
>>> do the following:
>>>
>>>     echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT
>>>
>>> which won't fault, then boot the kernel with powersave=off, where it
>>> will fault. Applying this patch will fix this.
>>>
>>> Fixes: 3b10d0095a1e ("powerpc/mm/radix: Prevent kernel execution of user space")
>>> Cc: <sta...@vger.kernel.org>
>>> Signed-off-by: Russell Currey <rus...@russell.cc>
>>
>> Good catch and debugging. This really should be a quirk, we don't want
>> to have to restore this thing on a thread switch.
>
> I'm not sure I follow. We don't context switch it on Radix, but we do
> on hash if pkeys are enabled.

Badly worded, I mean a hardware quirk. It should follow thread switches.
Still, avoiding it for the no-loss case is better than nothing. We can
just revisit it as an optimization if future hardware does not require
the restore.

>> Can we put it under a CONFIG option if we're not using IAMR?
>
> We'll always be using it with Radix, and we might be using it for pkeys
> on hash, unless pkeys are compiled out. But I don't really expect anyone
> to be running with pkeys compiled out.
>
> So I think the only case we could optimise is that we're on hash and the
> current thread has an IAMR of 0, then we could just not restore
> (assuming we come out of idle with IAMR=0).
>
> But maybe I'm not understanding.

Nah it sounds like more trouble than it's worth in that case.

Thanks,
Nick