On 20/01/2020 at 16:19, Segher Boessenkool wrote:
On Mon, Jan 20, 2020 at 02:56:00PM +0000, Christophe Leroy wrote:
Nice!  Much better.

It should be tested on more representative hardware, too, but this looks
promising alright :-)

mpc832x (e300c2 core) at 333 MHz:

Before:

gettimeofday:    vdso: 235 nsec/call
clock-gettime-realtime:    vdso: 244 nsec/call

With the series:

gettimeofday:    vdso: 271 nsec/call
clock-gettime-realtime:    vdso: 281 nsec/call

Those are important, and degrade ~15%.  That is acceptable IMO, but do
you see a way to optimise this (later)?

Not easy I think.

First, we have the unavoidable ASM entry function. It can't be dropped, because the CR[SO] bit has to be set on error and cleared on success, and that can't be done in C.
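To make the constraint concrete, here is a rough C model of what the ASM wrapper has to do after the C function returns. It is only a sketch: `struct ppc_ret` and `wrap_result()` are hypothetical names, the `so` field stands in for CR[SO], and negating the error value is an assumption borrowed from the PowerPC syscall convention (positive error code in r3 with SO set). C has no way to return a condition register bit, which is why the real thing stays in ASM.

```c
#include <stdbool.h>

/* Hypothetical model of the register state seen by the caller. */
struct ppc_ret {
	long r3;   /* return value register */
	bool so;   /* stands in for CR[SO] */
};

/* Model of the wrapper's epilogue: c_ret is what the C function returned. */
static struct ppc_ret wrap_result(long c_ret)
{
	struct ppc_ret r;

	if (c_ret < 0) {
		/* Error: flag it in SO; positive errno in r3 (assumed). */
		r.r3 = -c_ret;
		r.so = true;
	} else {
		/* Success: SO must be cleared explicitly. */
		r.r3 = c_ret;
		r.so = false;
	}
	return r;
}
```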

In our ASM VDSO, fixed shifts are used, while in the generic C VDSO the shift amounts are variable and read from the VDSO data.
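The difference can be sketched in C as below. This is an illustration only, not the actual kernel code: `struct vdso_data_sketch`, `FIXED_SHIFT` and its value of 20, and both function names are made up for the example. With a constant shift the compiler can pick the instructions at build time; with a shift loaded from memory, a 64-bit shift on a 32-bit core needs an extra load plus a variable-shift sequence.

```c
#include <stdint.h>

/* Hypothetical stand-in for the scaling fields kept in the VDSO data. */
struct vdso_data_sketch {
	uint32_t mult;
	uint32_t shift;
};

#define FIXED_SHIFT 20	/* assumed value, for illustration only */

/* Fixed shift: the amount is known at compile time. */
static uint64_t scale_fixed(uint64_t cycles, uint32_t mult)
{
	return (cycles * mult) >> FIXED_SHIFT;
}

/* Generic shift: the amount is read from the data at run time. */
static uint64_t scale_generic(const struct vdso_data_sketch *vd,
			      uint64_t cycles)
{
	return (cycles * vd->mult) >> vd->shift;
}
```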

And there is still some funny code generated by GCC (8.1), like:

 620:   7d 29 3c 30     srw     r9,r9,r7
 624:   21 87 00 20     subfic  r12,r7,32
 628:   7d 07 3c 31     srw.    r7,r8,r7
 62c:   7d 08 60 30     slw     r8,r8,r12
 630:   7d 0b 4b 78     or      r11,r8,r9
 634:   39 40 00 00     li      r10,0
 638:   40 82 00 84     bne     6bc <__c_kernel_clock_gettime+0x114>
 63c:   81 23 00 24     lwz     r9,36(r3)
 640:   81 05 00 00     lwz     r8,0(r5)
...
 6bc:   7d 69 5b 78     mr      r9,r11
 6c0:   7c ea 3b 78     mr      r10,r7
 6c4:   7d 2b 4b 78     mr      r11,r9
 6c8:   4b ff ff 74     b       63c <__c_kernel_clock_gettime+0x94>

This branch to 6bc is totally useless:
- copying r11 into r9 is pointless, as r9 is overwritten at 63c
- copying r9 back into r11 is pointless, as r11 has not been modified in between
- loading r10 with 0 and then overwriting it with r7 when r7 is not 0 is pointless as well; the result of srw. could have been put directly in r10.
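To double-check the reasoning, here is a small C model of the sequence. It is a sketch under assumptions: `srw()` and `slw()` are hypothetical helpers approximating the PowerPC instructions (amounts of 32..63 give 0), and reading r8/r9 as the high/low words of a 64-bit value is my interpretation of the listing. The first function follows the generated code step by step, the second is the straight-line version without the branch.

```c
#include <stdint.h>

/* Approximate srw/slw: low 6 bits of the amount, 32..63 yields 0. */
static uint32_t srw(uint32_t v, uint32_t n)
{
	n &= 63;
	return n > 31 ? 0 : v >> n;
}

static uint32_t slw(uint32_t v, uint32_t n)
{
	n &= 63;
	return n > 31 ? 0 : v << n;
}

/* Values live in r10 (hi) and r11 (lo) when execution reaches 63c. */
struct pair {
	uint32_t hi, lo;
};

/* Mirrors the listing, including the out-of-line path at 6bc. */
static struct pair shift_as_generated(uint32_t r8, uint32_t r9, uint32_t r7)
{
	uint32_t r10, r11, r12;

	r9 = srw(r9, r7);	/* 620 */
	r12 = 32 - r7;		/* 624 */
	r7 = srw(r8, r7);	/* 628: srw. also sets CR0 */
	r8 = slw(r8, r12);	/* 62c */
	r11 = r8 | r9;		/* 630 */
	r10 = 0;		/* 634 */
	if (r7 != 0) {		/* 638: bne 6bc */
		r9 = r11;	/* 6bc: dead, r9 is reloaded at 63c */
		r10 = r7;	/* 6c0 */
		r11 = r9;	/* 6c4: r11 already holds this value */
	}
	return (struct pair){ r10, r11 };
}

/* Same result without the branch: srw. result goes straight to hi. */
static struct pair shift_simplified(uint32_t hi, uint32_t lo, uint32_t s)
{
	struct pair p;

	p.lo = slw(hi, 32 - s) | srw(lo, s);
	p.hi = srw(hi, s);
	return p;
}
```

Both functions return the same pair for shift amounts 0..31, since r10 ends up equal to the srw. result whether or not the branch is taken.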

Christophe
