Przemyslaw Czerpak wrote:
The only reason I see for binding stack preload to "no TLS" is that stack preload also uses an inlined Windows-like function to access TLS. But aren't these two separate features: stack preload and the TLS access method (compiler native or system API)?

When compiler native TLS is disabled and a file defines HB_STACK_PRELOAD
before including the Harbour header files, then each function which has
to access hb_stack buffers its address via HB_STACK_TLS_PRELOAD.
If possible, an inline assembler function is used to retrieve the stack
address, which is a little bit faster than a call to the OS TLS function
and even than native TLS support in some compilers (e.g. BCC).
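
In rough outline the pattern looks like this (an illustrative sketch with
made-up names, not the real Harbour macros):

#include <windows.h>

typedef struct { int iCallLevel; /* ... */ } MY_STACK;

static DWORD s_dwTlsIndex;   /* allocated once at startup with TlsAlloc() */

/* without preload: every access pays a TLS lookup
   (replaced by an inline assembler read where possible) */
#define MY_STACK_PTR      ( ( MY_STACK * ) TlsGetValue( s_dwTlsIndex ) )

/* with preload: one local pointer resolved at function entry */
#define MY_STACK_PRELOAD  MY_STACK * my_stack = MY_STACK_PTR
#define MY_STACK_FAST     ( my_stack )

static void vm_function( void )
{
   MY_STACK_PRELOAD;             /* single TLS lookup for this function */

   MY_STACK_FAST->iCallLevel++;  /* all further accesses are plain      */
   MY_STACK_FAST->iCallLevel--;  /* pointer dereferences                */
}

The point of the preload is that a function which touches the stack many
times pays the TLS lookup cost only once.
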
Compile the current SVN code without any additional switches and compare
the tstspeed.prg results to the previous ones.

Hi,


thanks for the explanation. I just wanted to run all the tests one after another to make the results comparable, because numbers obtained a few days ago may have been measured with a different OS memory/CPU load, so they can differ by a few seconds. I've used -DHB_USE_TLS to obtain the "previous" results (compiler native TLS, no stack preloading); that is the middle MT column below, while the last MT column is the current SVN code with stack preload.
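
As a side note, the difference between "compiler native TLS" and the
system API route is roughly this (an illustrative C sketch with made-up
names, not Harbour source):

#include <windows.h>

/* (a) compiler native TLS: declare the variable as thread-local and let
       the compiler/linker generate the per-thread access code */
#if defined( _MSC_VER )
   static __declspec( thread ) void * s_stack_native;
#else
   static __thread void * s_stack_native;   /* GCC spelling; other compilers differ */
#endif

/* (b) system API TLS: one slot from TlsAlloc() plus TlsGetValue()/
       TlsSetValue() calls on every access */
static DWORD s_dwStackSlot;

static void tls_init( void )
{
   s_dwStackSlot = TlsAlloc();
}

static void tls_example( void )
{
   s_stack_native = NULL;                /* native TLS access  */
   TlsSetValue( s_dwStackSlot, NULL );   /* system API access  */
}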

The results are:

10/01/08 10:03:30 Harbour 1.1.0dev (Rev. 9523), Windows XP 5.1.2600 Service Pack 2

ARR_LEN =         16                      ST      MT      MT
N_LOOPS =    1000000                            USE_TLS
empty loops overhead =                   0.19    0.30    0.28
CPU usage -> secondsCPU()

c:=L_C ->                                0.19    0.39    0.31
n:=L_N ->                                0.19    0.27    0.19
d:=L_D ->                                0.22    0.27    0.19
c:=M_C ->                                0.23    0.45    0.38
n:=M_N ->                                0.20    0.30    0.23
d:=M_D ->                                0.22    0.30    0.23
(sh) c:=F_C ->                           0.38    0.81    0.84
(sh) n:=F_N ->                           0.58    0.61    0.64
(sh) d:=F_D ->                           0.30    0.34    0.36
(ex) c:=F_C ->                           0.38    0.81    0.83
(ex) n:=F_N ->                           0.56    0.64    0.66
(ex) d:=F_D ->                           0.30    0.34    0.33
n:=o:GenCode ->                          0.45    0.81    0.78
n:=o[8] ->                               0.42    0.63    0.52
round(i/1000,2) ->                       0.63    0.92    0.81
str(i/1000) ->                           1.50    2.27    2.03
val(a3[i%ARR_LEN+1]) ->                  1.36    1.84    1.64
dtos(j+i%10000-5000) ->                  1.39    2.06    2.03
eval({||i%ARR_LEN}) ->                   0.69    1.03    0.89
eval({|x|x%ARR_LEN},i) ->                0.78    1.20    1.02
eval({|x|f1(x)},i) ->                    1.28    1.81    1.42
&('f1('+str(i)+')') ->                   7.66   15.13   13.14
eval([&('{|x|f1(x)}')]) ->               1.25    1.81    1.39
j := valtype(a)+valtype(i) ->            1.08    2.00    1.86
j := str(i%100,2) $ a2[i%ARR_LEN+1] ->   2.27    3.45    3.02
j := val(a2[i%ARR_LEN+1]) ->             1.55    2.11    1.91
j := a2[i%ARR_LEN+1] == s ->             1.06    1.70    1.50
j := a2[i%ARR_LEN+1] = s ->              1.11    1.69    1.58
j := a2[i%ARR_LEN+1] >= s ->             1.17    1.67    1.52
j := a2[i%ARR_LEN+1] < s ->              1.13    1.67    1.53
aadd(aa,{i,j,s,a,a2,t,bc}) ->            4.38    5.92    5.81
f0() ->                                  0.33    0.55    0.42
f1(i) ->                                 0.55    0.89    0.64
f2(c[8]) ->                              0.45    0.81    0.63
f2(c[40000]) ->                          0.47    0.80    0.64
f2(@c[40000]) ->                         0.36    0.64    0.47
f2(c[40000]); c2:=c ->                   0.69    1.19    1.00
f2(@c[40000]); c2:=c ->                  0.56    1.03    0.88
f3(a,a2,c,i,j,t,bc) ->                   1.16    2.05    1.70
f2(a2) ->                                0.44    0.81    0.66
s:=f4() ->                               2.13    2.56    2.47
s:=f5() ->                               0.84    1.38    1.27
ascan(a,i%ARR_LEN) ->                    0.73    1.25    1.06
ascan(a2,c+chr(i%64+64)) ->              2.47    3.67    3.33
ascan(a,{|x|x==i%ARR_LEN}) ->           10.95   13.48   11.66
=============================================================
total application time:                 64.95  100.00   89.30
total real time:                        65.89  100.97   90.36

The previous MT overhead was 54% (100.00 vs. 64.95 seconds of total application time); the current one is 37% (89.30 vs. 64.95).

One thing is not clear to me. You've committed exactly the same inlined TLS access as I used in my test, but your code does not GPF. Mine was GPFing because of wrongly generated CPU code.
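
For reference, the kind of inlined TLS access being discussed is roughly
the following on 32-bit Windows (an illustrative sketch, not necessarily
the committed code): FS points at the current thread's TEB, and the
TlsSlots[] array sits at offset 0xE10 in it, so a slot can be read with a
single indexed load instead of a TlsGetValue() call.

#include <windows.h>

static __inline__ void * my_tls_get( DWORD dwIndex )
{
   void * pValue;

   /* pValue = TEB->TlsSlots[ dwIndex ]; valid for slot indexes below 64 */
   __asm__ __volatile__(
      "movl %%fs:0xE10(,%1,4), %0"
      : "=r"( pValue )
      : "r"( dwIndex )
   );
   return pValue;
}

With GCC-style inline assembly, a wrong operand constraint or a missing
clobber is enough to make the compiler emit bad code around such a
statement, so two visually identical snippets can behave differently; that
is one possible explanation for the GPF, though only the generated
assembly can tell for sure.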


Best regards,
Mindaugas

_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour
