On 9/17/22 03:16, Christophe Leroy wrote: > Le 16/09/2022 à 07:05, Samuel Holland a écrit : >> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible >> to switch away from a task inside copy_{from,to}_user. This left the CPU >> with userspace access enabled until after the next IRQ or privilege >> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when >> switching back to the original task, the userspace access would fault: > > This is not supposed to happen. You never switch away from a task > magically. Task switch will always happen in an interrupt, that means > copy_{from,to}_user() get interrupted.
That makes sense, the interrupt handler is responsible for saving the KUAP status. It looks like neither DEFINE_INTERRUPT_HANDLER_RAW nor any of its users (performance_monitor_exception(), do_slb_fault()) do that. Yet they still call one of the interrupt_return variants, which restores the status from the stack. > Whenever an interrupt is taken, kuap_save_amr_and_lock() macro is used > to save KUAP status into the stack then lock KUAP access. At interrupt > exit, kuap_kernel_restore() macro or function is used to restore KUAP > access from the stack. At the time the task switch happens, KUAP access > is expected to be locked. During task switch, the stack is switched so > the KUAP status is taken back from the new task's stack. What if another task calls schedule() from kernel process context, and the scheduler switches to a task that had been preempted inside copy_{from,to}_user()? Then there is no interrupt involved, and I don't see where kuap_kernel_restore() would get called. > Your fix suggests that there is some path where the KUAP status is not > properly saved and/or restored. Did you try running with > CONFIG_PPC_KUAP_DEBUG ? It should warn whenever a KUAP access is left > unlocked. > >> >> Kernel attempted to write user page (3fff7ab68190) - exploit attempt? >> (uid: 65536) >> ------------[ cut here ]------------ >> Bug: Write fault blocked by KUAP! >> WARNING: CPU: 56 PID: 4939 at arch/powerpc/mm/fault.c:228 >> ___do_page_fault+0x7b4/0xaa0 >> CPU: 56 PID: 4939 Comm: git Tainted: G W >> 5.19.8-00005-gba424747260d #1 >> NIP: c0000000000555e4 LR: c0000000000555e0 CTR: c00000000079d9d0 >> REGS: c00000008f507370 TRAP: 0700 Tainted: G W >> (5.19.8-00005-gba424747260d) >> MSR: 9000000000021033 <SF,HV,ME,IR,DR,RI,LE> CR: 28042222 XER: 20040000 >> CFAR: c000000000123780 IRQMASK: 3 >> NIP [c0000000000555e4] ___do_page_fault+0x7b4/0xaa0 >> LR [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0 >> Call Trace: >> [c00000008f507610] [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0 >> (unreliable) >> [c00000008f5076c0] [c000000000055938] do_page_fault+0x68/0x130 >> [c00000008f5076f0] [c000000000008914] data_access_common_virt+0x194/0x1f0 >> --- interrupt: 300 at __copy_tofrom_user_base+0x9c/0x5a4 > > ... > >> >> Fix this by saving and restoring the kernel-side AMR/IAMR values when >> switching tasks. > > As explained above, KUAP access should be locked at that time, so saving > and restoring it should not have any effect. If it does, it means > something goes wrong somewhere else. > >> >> Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU") >> Signed-off-by: Samuel Holland <sam...@sholland.org> >> --- >> I have no idea if this is the right change to make, and it could be >> optimized, but my system has been stable with this patch for 5 days now. >> >> Without the patch, I hit the bug every few minutes when my load average >> is <1, and I hit it immediately if I try to do a parallel kernel build. > > Great, then can you make a try with CONFIG_PPC_KUAP_DEBUG ? Yes, I will try this out in the next few days. Regards, Samuel