Le 30/04/2020 à 07:39, Christoph Hellwig a écrit :
On Thu, Apr 30, 2020 at 08:39:00AM +0800, Jeremy Kerr wrote:
Hi Christophe,

Just use the proper non __-prefixed get/put_user variants where
that is not done yet.

But it means you are doing the access_ok() check everytime, which is
what is to be avoided by doing the access_ok() once then using the
__-prefixed variant.

5 out of 8 of these are just a access_ok(); simple_read_from_buffer().

For the cases where it's multiple __put/get_user()s, the max will be 5.
(for the mbox access). Is that worth optimising the access_ok() checks?

access_ok is just trivial comparism to the segment limit, I don't
think it has a relavant performance impact.


I think it has an impact. See the difference between the two following trivial functions:

int test1(unsigned long val, unsigned long *addr)
{
        return __put_user(val, addr);
}

int test2(unsigned long val, unsigned long *addr)
{
        return put_user(val, addr);
}


00000000 <test1>:
   0:   39 20 00 00     li      r9,0
   4:   90 64 00 00     stw     r3,0(r4)
   8:   7d 23 4b 78     mr      r3,r9
   c:   4e 80 00 20     blr

00000000 <test2>:
   0:   81 22 04 38     lwz     r9,1080(r2)
   4:   7c 6a 1b 78     mr      r10,r3
   8:   7f 89 20 40     cmplw   cr7,r9,r4
   c:   41 9c 00 24     blt     cr7,30 <test2+0x30>
  10:   7d 24 48 50     subf    r9,r4,r9
  14:   38 60 ff f2     li      r3,-14
  18:   2b 89 00 02     cmplwi  cr7,r9,2
  1c:   4c 9d 00 20     blelr   cr7
  20:   39 20 00 00     li      r9,0
  24:   91 44 00 00     stw     r10,0(r4)
  28:   7d 23 4b 78     mr      r3,r9
  2c:   4e 80 00 20     blr
  30:   38 60 ff f2     li      r3,-14
  34:   4e 80 00 20     blr


It looks like GCC is smart enough to read the limit in task struct only once when we have two consecutive put_user() but there is still some difference:

int test3(unsigned long val, unsigned long *addr)
{
        return put_user(val, addr) ? : put_user(val, addr + 1);
}

int test4(unsigned long val, unsigned long *addr)
{
        if (!access_ok(addr, sizeof(*addr)))
                return -EFAULT;

        return __put_user(val, addr) ? : __put_user(val, addr + 1);
}

00000000 <test3>:
   0:   81 42 04 38     lwz     r10,1080(r2)
   4:   7f 8a 20 40     cmplw   cr7,r10,r4
   8:   41 9c 00 48     blt     cr7,50 <test3+0x50>
   c:   7d 04 50 50     subf    r8,r4,r10
  10:   39 20 ff f2     li      r9,-14
  14:   2b 88 00 02     cmplwi  cr7,r8,2
  18:   40 9d 00 30     ble     cr7,48 <test3+0x48>
  1c:   39 20 00 00     li      r9,0
  20:   90 64 00 00     stw     r3,0(r4)
  24:   2f 89 00 00     cmpwi   cr7,r9,0
  28:   40 9e 00 20     bne     cr7,48 <test3+0x48>
  2c:   38 84 00 04     addi    r4,r4,4
  30:   7f 8a 20 40     cmplw   cr7,r10,r4
  34:   41 9c 00 1c     blt     cr7,50 <test3+0x50>
  38:   7d 44 50 50     subf    r10,r4,r10
  3c:   2b 8a 00 02     cmplwi  cr7,r10,2
  40:   40 9d 00 10     ble     cr7,50 <test3+0x50>
  44:   90 64 00 00     stw     r3,0(r4)
  48:   7d 23 4b 78     mr      r3,r9
  4c:   4e 80 00 20     blr
  50:   39 20 ff f2     li      r9,-14
  54:   4b ff ff f4     b       48 <test3+0x48>

Disassembly of section .text.test4:

00000000 <test4>:
   0:   81 22 04 38     lwz     r9,1080(r2)
   4:   7f 89 20 40     cmplw   cr7,r9,r4
   8:   41 9c 00 34     blt     cr7,3c <test4+0x3c>
   c:   7d 44 48 50     subf    r10,r4,r9
  10:   39 20 ff f2     li      r9,-14
  14:   2b 8a 00 06     cmplwi  cr7,r10,6
  18:   40 9d 00 1c     ble     cr7,34 <test4+0x34>
  1c:   39 20 00 00     li      r9,0
  20:   90 64 00 00     stw     r3,0(r4)
  24:   2f 89 00 00     cmpwi   cr7,r9,0
  28:   40 9e 00 0c     bne     cr7,34 <test4+0x34>
  2c:   38 84 00 04     addi    r4,r4,4
  30:   90 64 00 00     stw     r3,0(r4)
  34:   7d 23 4b 78     mr      r3,r9
  38:   4e 80 00 20     blr
  3c:   39 20 ff f2     li      r9,-14
  40:   4b ff ff f4     b       34 <test4+0x34>

Reply via email to