Hi!

On Wed, Aug 21, 2019 at 02:59:59PM +0530, Santosh Sivaraj wrote:
> except for a couple of calls (1 or 2 nsec reduction), there are no
> improvements in the call times. Or is 10 nsec the minimum granularity??
>
> So I don't know if its even worth updating vdso64 except to keep vdso32 and
> vdso64 equal.

Calls are cheap, in principle...  It is the LR stuff that can make it
slower on some cores, and a lot of calling sequence stuff may have
considerable overhead of course.

> +.macro get_datapage ptr, tmp
> +	bcl	20,31,888f
> +888:
> +	mflr	\ptr
> +	addi	\ptr, \ptr, __kernel_datapage_offset - 888b
> +	lwz	\tmp, 0(\ptr)
> +	add	\ptr, \tmp, \ptr
> +.endm

(You can just write that as
	bcl 20,31,$+4
	mflr \ptr
etc.  Useless labels are useless :-) )

One thing you might want to do to improve performance is to do this without
the bcl etc., because you cannot really hide the LR latency of that.  But
that isn't very many ns either...  Superscalar helps, OoO helps, but it is
mostly just that >100MHz helps ;-)


Segher