On Wed, Apr 14, 2021 at 9:55 AM David Laight <david.lai...@aculab.com> wrote: > > From: Arjun Roy > > Sent: 13 April 2021 23:04 > > > > On Tue, Apr 13, 2021 at 2:19 PM David Laight <david.lai...@aculab.com> > > wrote: > > > > > > > If we're special-casing 64-bit architectures anyways - unrolling the > > > > 32B copy_from_user() for struct rseq_cs appears to be roughly 5-10% > > > > savings on x86-64 when I measured it (well, in a microbenchmark, not > > > > in rseq_get_rseq_cs() directly). Perhaps that could be an additional > > > > avenue for improvement here. > > > > > > The killer is usually 'user copy hardening'. > > > It significantly slows down sendmsg() and recvmsg(). > > > I've got measurable performance improvements by > > > using __copy_from_user() when the buffer since has > > > already been checked - but isn't a compile-time constant. > > > > > > There is also scope for using _get_user() when reading > > > iovec[] (instead of copy_from_user()) and doing all the > > > bound checks (etc) in the loop. > > > That gives a measurable improvement for writev("/dev/null"). > > > I must sort those patches out again. > > > > > > David > > > > > > > In this case I mean replacing copy_from_user(rseq_cs, urseq_cs, > > sizeof(*rseq_cs)) with 4 (x8B=32 total) unsafe_get_user() (wrapped in > > user_read_access_begin/end()) which I think would just bypass user > > copy hardening (as far as I can tell). > > Yes that is one advantage over any of the get_user() calls. > You also lose all the 'how shall we optimise this' checks > in copy_from_user(). > > Repeated unsafe_get_user() calls are crying out for an optimisation. > You get something like: > failed = 0; > copy(); > if (failed) goto error; > copy(); > if (failed) goto error; > Where 'failed' is set by the fault handler. > > This could be optimised to: > failed = 0; > copy(); > copy(); > if (failed) goto error; > Even if it faults on every invalid address it probably > doesn't matter - no one cares about that path.
On which arch are you looking at ? On x86_64 at least, code generation is just perfect. Not even a conditional jmp, it is all handled by exceptions (if any) stac copy(); copy(); clac <out_of_line> efault_end: do error recovery. > > I've not really looked at how it could be achieved though. > > It might be that the 'asm goto with outputs' variant > manages to avoid the compare and jump. > > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 > 1PT, UK > Registration No: 1397386 (Wales)