From: Linus Torvalds > Sent: 21 November 2024 22:16 > > On Thu, 21 Nov 2024 at 13:40, Josh Poimboeuf <jpoim...@kernel.org> wrote: > > > > The profile is showing futex_get_value_locked(): > > Ahh. > > > That has several callers, so we can probably just use get_user() there? > > Yeah, that's the simplest thing. That thing isn't even some inline > function, so the real cost is the call. > > That said, exactly because it's not inlined, and calls are expensive, > and this is apparently really critical, we can just do it with the > full "unsafe_get_user()" model. > > It's not so complicated. The attached patch is untested, but I did > check that it generates almost perfect code: > > mov %gs:0x0,%rax # current > incl 0x1a9c(%rax) # current->pagefault_disable++ > movabs $0x123456789abcdef,%rcx # magic virtual address size > cmp %rsi,%rcx # address masking > sbb %rcx,%rcx > or %rsi,%rcx > stac # enable user space acccess > mov (%rcx),%ecx # get the value > clac # disable user space access > decl 0x1a9c(%rax) # current->pagefault_disable-- > mov %ecx,(%rdi) # save the value > xor %eax,%eax # return 0 > ret > > (with the error case for the page fault all out-of-line).
I presume you are assuming an earlier access_ok() call? IIRC all x86 from 286 onwards fault accesses that cross the ~0 to 0 boundary. So you don't need to have called access_ok() prior to the above for non-byte accesses. Even for byte accesses (and with ~0 being a valid address) do the address mask before pagefault_disable++ and the error test is 'jc label' after the 'cmp'. Don't you also get better code for an 'asm output with goto' form? While it requires the caller have a 'label: return -EFAULT;' somewhere that is quite common in kernel code. > So this should be _faster_ than the old __get_user(), because while > the address masking is not needed, it's cheaper than the function call > used to be and the error handling is better. Perhaps get_user() should be the function call and __get_user() inlined. Both including address validation but different calling conventions? (After fixing the code that doesn't want address masking.) ... > Now, we could possibly say "just remove the fence in __get_user() > entirely", but that would involve moving it to access_ok(). How valid would it be to assume an explicit access_ok() was far enough away from the get_user() that you couldn't speculate all the way to something that used the read value to perform another kernel access? > And then it wouldn't actually speed anything up (except the horrid > architecture-specific kernel uses that then don't call access_ok() at > all - and we don't care about *those*). Find and kill :-) David > > Linus - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)