On Tue, 2025-04-08 at 19:49 +0100, Mark Brown wrote:
> +int arch_shstk_validate_clone(struct task_struct *t,
> +			      struct vm_area_struct *vma,
> +			      struct page *page,
> +			      struct kernel_clone_args *args)
> +{
> +	/*
> +	 * SSP is aligned, so reserved bits and mode bit are a zero, just mark
> +	 * the token 64-bit.
> +	 */
> +	void *maddr = kmap_local_page(page);
> +	int offset;
> +	unsigned long addr, ssp;
> +	u64 expected;
> +
> +	if (!features_enabled(ARCH_SHSTK_SHSTK))
> +		return 0;
> +
> +	ssp = args->shadow_stack_pointer;
> +	addr = ssp - SS_FRAME_SIZE;
> +	expected = ssp | BIT(0);
> +	offset = offset_in_page(addr);
> +
> +	if (!cmpxchg_to_user_page(vma, page, addr, (unsigned long *)(maddr + offset),
> +				  expected, 0))
> +		return -EINVAL;
> +	set_page_dirty_lock(page);
> +
> +	return 0;
> +}
> +

First of all, sorry for not contributing on this since v9. I've had an
unusually large project conflict (TDX), combined with my test HW dying.

I tested v15 on x86 and saw a couple of problems:

1. I think kmap_local_page() is supposed to be paired with
   kunmap_local(). But shstk is not supported on highmem systems, so
   let's just use page_address().
2. Some off-by-one (frame) errors that cause the clone3 test to fail on
   x86.

Both are fixed in the diff below, but in debugging the off-by-one errors
I've realized this implementation wastes a shadow stack frame.

On x86, when that token is consumed normally, it would have:
SSP = token_addr + 8

I always assumed the HW token consumption behavior was meant to save a
frame on the shadow stack. Once the token is consumed it is useless, so
the frame might as well be reused for the next push. But the clone3
behavior is different from the normal token consumption logic. Instead
it will have SSP *at* the token, so the next call will push below it and
leave the zeroed frame as wasted space.

Do we want this? On arm there is SHADOW_STACK_SET_MARKER, which leaves a
marker token. But clone3 will also leave behind a zero frame from the
CMPXCHGed token. So if you use SHADOW_STACK_SET_MARKER you get two
marker tokens. And on x86 you will get one for clone3 but not for
others, until x86 implements SHADOW_STACK_SET_MARKER. At that point x86
has to either diverge from arm (bad) or also have the double marker
frame.

The diff below fixes x86 functionally, but what do you think of the
wasted frame?

One fix would be to change shadow_stack_pointer to shadow_stack_token,
and then have each arch consume it in the normal HW way, leaving the new
thread with:
SSP = clone_args->shadow_stack_token + 8

diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 056e2c9ec305..2b0f84ae4367 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -200,20 +200,19 @@ int arch_shstk_validate_clone(struct task_struct *t,
 	 * SSP is aligned, so reserved bits and mode bit are a zero, just mark
 	 * the token 64-bit.
 	 */
-	void *maddr = kmap_local_page(page);
+	void *maddr = page_address(page);
 	int offset;
-	unsigned long addr, ssp;
+	unsigned long ssp;
 	u64 expected;
 
 	if (!features_enabled(ARCH_SHSTK_SHSTK))
 		return 0;
 
 	ssp = args->shadow_stack_pointer;
-	addr = ssp - SS_FRAME_SIZE;
-	expected = ssp | BIT(0);
-	offset = offset_in_page(addr);
+	expected = (ssp + SS_FRAME_SIZE) | BIT(0);
+	offset = offset_in_page(ssp);
 
-	if (!cmpxchg_to_user_page(vma, page, addr, (unsigned long *)(maddr + offset),
+	if (!cmpxchg_to_user_page(vma, page, ssp, (unsigned long *)(maddr + offset),
 				  expected, 0))
 		return -EINVAL;
 	set_page_dirty_lock(page);
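
To make the wasted frame concrete, here is a rough userspace toy model
(not kernel code) of the two consumption behaviors discussed above. It
assumes the x86 token format of (token_addr + 8) | BIT(0). Every name in
it (toy_shstk, toy_consume_hw(), toy_consume_clone3(), and so on) is
hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_SIZE	8u	/* stands in for SS_FRAME_SIZE on x86-64 */
#define TOKEN_MODE_64	1ull	/* bit 0 marks the token as 64-bit */
#define NFRAMES		8

/* Toy shadow stack: frames[] models memory starting at address base. */
struct toy_shstk {
	uint64_t base;
	uint64_t frames[NFRAMES];
	uint64_t ssp;
};

static uint64_t *toy_slot(struct toy_shstk *s, uint64_t addr)
{
	return &s->frames[(addr - s->base) / FRAME_SIZE];
}

/* Userspace writes a token at token_addr: (token_addr + 8) | mode bit. */
static void toy_place_token(struct toy_shstk *s, uint64_t token_addr)
{
	*toy_slot(s, token_addr) = (token_addr + FRAME_SIZE) | TOKEN_MODE_64;
}

/* Check the token and zero it, loosely modelling the cmpxchg to 0. */
static int toy_check_token(struct toy_shstk *s, uint64_t token_addr)
{
	uint64_t expected = (token_addr + FRAME_SIZE) | TOKEN_MODE_64;

	if (*toy_slot(s, token_addr) != expected)
		return -1;
	*toy_slot(s, token_addr) = 0;
	return 0;
}

/* Normal HW-style consumption: SSP ends up just above the token. */
static int toy_consume_hw(struct toy_shstk *s, uint64_t token_addr)
{
	if (toy_check_token(s, token_addr))
		return -1;
	s->ssp = token_addr + FRAME_SIZE;
	return 0;
}

/* clone3 behavior as described above: SSP is left *at* the token. */
static int toy_consume_clone3(struct toy_shstk *s, uint64_t token_addr)
{
	if (toy_check_token(s, token_addr))
		return -1;
	s->ssp = token_addr;
	return 0;
}

/* Model a call: push a return address, return the slot address used. */
static uint64_t toy_push(struct toy_shstk *s, uint64_t ret)
{
	s->ssp -= FRAME_SIZE;
	*toy_slot(s, s->ssp) = ret;
	return s->ssp;
}

/* Walk both paths with a token at 0x1020 on a stack based at 0x1000. */
static void toy_demo(void)
{
	struct toy_shstk hw = { .base = 0x1000 };
	struct toy_shstk c3 = { .base = 0x1000 };

	toy_place_token(&hw, 0x1020);
	assert(toy_consume_hw(&hw, 0x1020) == 0);
	/* The first push reuses the consumed token's frame. */
	assert(toy_push(&hw, 0xabc) == 0x1020);

	toy_place_token(&c3, 0x1020);
	assert(toy_consume_clone3(&c3, 0x1020) == 0);
	/* The first push lands one frame lower... */
	assert(toy_push(&c3, 0xabc) == 0x1018);
	/* ...and the zeroed token frame stays behind as wasted space. */
	assert(*toy_slot(&c3, 0x1020) == 0);
}
```

In this model the shadow_stack_token proposal corresponds to the
toy_consume_hw() path, so the first push in the new thread reuses the
token's frame instead of leaving a zero frame behind.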