On 23 September 2015 16:05:02 GMT+10:00, Michael Neuling <mi...@neuling.org> 
wrote:
>The 32 and 64 bit variants of __get_datapage() use a "bcl; mflr" to
>determine the loaded address of the VDSO. The current versions
>attempt to use the special bcl variant that avoids pushing to the
>link stack.
>
>Unfortunately they use bcl+8 rather than the required bcl+4. Hence the
>current code corrupts the link stack, which degrades performance
>(due to branch mispredictions).
>
>This patch moves us to bcl+4 by moving __kernel_datapage_offset
>out of __get_datapage().
>
>With this patch, running the below benchmark we get a bump in
>performance on POWER8 for gettimeofday() (which uses
>__get_datapage()).
>
>64bit gets ~4% improvement:
>  Without patch:
>    # ./tb
>    time = 0.180321
>  With patch:
>    # ./tb
>    time = 0.187408
>
>32bit gets ~9% improvement:
>  Without patch:
>    # ./tb
>    time = 0.276551
>  With patch:
>    # ./tb
>    time = 0.252767
>
>Testcase tb.c (stolen from Anton)
>  /* gcc -O2 tb.c -o tb */
>  #include <sys/time.h>
>  #include <stdio.h>
>
>  int main()
>  {
>         int i;
>
>         struct timeval tv_start, tv_end;
>
>         gettimeofday(&tv_start, NULL);
>
>         for(i = 0; i < 10000000; i++) {
>                 gettimeofday(&tv_end, NULL);
>         }
>
>         printf("time = %.6f\n", tv_end.tv_sec - tv_start.tv_sec +
>                (tv_end.tv_usec - tv_start.tv_usec) * 1e-6);
>
>         return 0;
>  }

You know where test cases are supposed to go.

I know it's not a pass/fail test, but it's still useful. If it's in the tree it 
will get run as part of automated test runs and we will have a record of the 
result over time.
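
As a rough sketch only, this is one way tb.c could be split into small helpers so a test runner can call it and log a per-call figure. The names elapsed_seconds(), per_call_ns() and bench_gettimeofday() are made up for illustration, and nothing here uses the actual selftest harness:

```c
/* Sketch only: helper names below are hypothetical, not from the patch. */
#include <stdio.h>
#include <sys/time.h>

/* Elapsed wall-clock time between two timeval samples, in seconds. */
static double elapsed_seconds(const struct timeval *start,
                              const struct timeval *end)
{
        return (double)(end->tv_sec - start->tv_sec) +
               (double)(end->tv_usec - start->tv_usec) * 1e-6;
}

/* Average cost of one call in nanoseconds, given the total in seconds. */
static double per_call_ns(double total_seconds, long iterations)
{
        return total_seconds * 1e9 / (double)iterations;
}

/* Time `iterations` back-to-back gettimeofday() calls, as tb.c does. */
static double bench_gettimeofday(long iterations)
{
        struct timeval tv_start, tv_end;
        long i;

        gettimeofday(&tv_start, NULL);
        tv_end = tv_start;      /* defined result even for 0 iterations */

        for (i = 0; i < iterations; i++)
                gettimeofday(&tv_end, NULL);

        return elapsed_seconds(&tv_start, &tv_end);
}

/* Run the benchmark and print the total plus the per-call average. */
static int run_benchmark(long iterations)
{
        double total = bench_gettimeofday(iterations);

        printf("time = %.6f\n", total);
        printf("per call = %.2f ns\n", per_call_ns(total, iterations));
        return 0;
}
```

Calling run_benchmark(10000000) from main() gives the same loop as tb.c, plus a per-call number that is easier to compare across runs.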

cheers
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.