Hi Przemek,

I've run the tests; MT speed got ~5% better.
The MT overhead for MSVS 2008 on my system is 41%.

How much further do you think this could be improved
if we passed the HVM context to all Harbour
API functions?

                                              ST       ST       MT       MT       ST       ST       MT       MT       ST       MT
                                         DLALLOC      STD  DLALLOC      STD  DLALLOC      STD  DLALLOC      STD  DLALLOC  DLALLOC
                                                                                 TLS      TLS      TLS      TLS     CTLS     CTLS
======================================= -------- -------- -------- -------- -------- -------- -------- -------- -------- --------
c:=L_C ->                                   0.08     0.09     0.30     0.30     0.06     0.09     0.19     0.19     0.08     0.19
n:=L_N ->                                   0.06     0.09     0.16     0.19     0.05     0.06     0.08     0.08     0.05     0.06
d:=L_D ->                                   0.05     0.06     0.17     0.16     0.06     0.06     0.08     0.09     0.06     0.08
c:=M_C ->                                   0.09     0.09     0.31     0.31     0.09     0.11     0.19     0.20     0.09     0.20
n:=M_N ->                                   0.05     0.08     0.19     0.19     0.06     0.08     0.09     0.08     0.06     0.08
d:=M_D ->                                   0.08     0.06     0.20     0.20     0.09     0.06     0.09     0.08     0.09     0.09
(sh) c:=F_C ->                              0.23     0.55     0.67     0.64     0.23     0.53     0.50     0.53     0.22     0.53
(sh) n:=F_N ->                              0.34     0.38     0.50     0.45     0.34     0.38     0.36     0.34     0.33     0.38
(sh) d:=F_D ->                              0.19     0.17     0.31     0.28     0.19     0.17     0.19     0.17     0.16     0.20
(ex) c:=F_C ->                              0.22     0.48     0.66     0.63     0.25     0.50     0.50     0.53     0.19     0.53
(ex) n:=F_N ->                              0.36     0.36     0.52     0.45     0.34     0.38     0.34     0.34     0.33     0.41
(ex) d:=F_D ->                              0.19     0.17     0.30     0.28     0.19     0.19     0.19     0.17     0.16     0.20
n:=o:GenCode ->                             0.22     0.28     0.75     0.77     0.23     0.27     0.38     0.36     0.22     0.38
n:=o[8] ->                                  0.17     0.19     0.45     0.45     0.16     0.20     0.27     0.28     0.17     0.27
round(i/1000,2) ->                          0.38     0.41     0.94     0.91     0.38     0.41     0.39     0.44     0.38     0.45
str(i/1000) ->                              0.72     1.02     1.45     1.53     0.70     0.97     1.02     1.06     0.70     1.00
val(a3[i%ARR_LEN+1]) ->                     0.78     0.78     1.59     1.58     0.80     0.78     0.98     0.98     0.81     1.02
dtos(j+i%10000-5000) ->                     0.73     0.97     1.72     1.89     0.72     0.97     1.09     1.16     0.72     1.03
eval({||i%ARR_LEN}) ->                      0.39     0.39     1.11     1.14     0.41     0.36     0.55     0.53     0.41     0.50
eval({|x|x%ARR_LEN},i) ->                   0.41     0.41     1.20     1.20     0.41     0.41     0.56     0.56     0.44     0.56
eval({|x|f1(x)},i) ->                       0.53     0.56     2.39     1.97     0.52     0.56     0.73     0.70     0.53     0.70
&('f1('+str(i)+')') ->                      3.88     9.13     8.94     8.70     6.91     6.88     8.94     7.25     3.84     6.41
eval([&('{|x|f1(x)}')]) ->                  0.55     0.56     2.44     1.97     0.55     0.55     0.73     0.75     0.55     0.72
j := valtype(a)+valtype(i) ->               0.52     0.84     1.94     2.02     1.45     0.81     1.00     1.05     0.53     0.98
j := str(i%100,2) $ a2[i%ARR_LEN+1] ->      1.19     1.52     2.84     2.89     1.19     1.50     1.72     1.77     1.25     1.73
j := val(a2[i%ARR_LEN+1]) ->                0.84     0.84     1.86     1.81     0.84     0.84     1.06     1.05     0.86     1.06
j := a2[i%ARR_LEN+1] == s ->                0.53     0.56     1.39     1.38     0.53     0.55     0.84     0.84     0.52     0.83
j := a2[i%ARR_LEN+1] = s ->                 0.58     0.61     1.47     1.52     0.59     0.61     0.89     0.88     0.56     0.89
j := a2[i%ARR_LEN+1] >= s ->                0.58     0.59     1.50     1.50     0.56     0.59     0.89     0.88     0.55     0.86
j := a2[i%ARR_LEN+1] < s ->                 0.56     0.61     1.47     1.50     0.58     0.59     0.88     0.88     0.53     0.86
aadd(aa,{i,j,s,a,a2,t,bc}) ->               1.72     3.59     4.22     5.80     1.75     3.59     3.14     4.59     1.66     3.16
f0() ->                                     0.22     0.22     0.70     0.70     0.22     0.22     0.23     0.27     0.23     0.23
f1(i) ->                                    0.28     0.28     0.95     0.98     0.28     0.28     0.36     0.34     0.27     0.30
f2(c[8]) ->                                 0.25     0.23     0.89     0.86     0.20     0.23     0.39     0.44     0.27     0.38
f2(c[40000]) ->                             0.25     0.27     1.05     0.84     0.25     0.27     0.38     0.38     0.28     0.34
f2(@c[40000]) ->                            0.22     0.22     1.28     0.81     0.22     0.22     0.27     0.25     0.20     0.25
f2(c[40000]); c2:=c ->                      0.34     0.34     1.17     1.13     0.34     0.34     0.58     0.56     0.31     0.53
f2(@c[40000]); c2:=c ->                     0.31     0.31     1.17     1.11     0.31     0.33     0.45     0.44     0.30     0.42
f3(a,a2,c,i,j,t,bc) ->                      0.55     0.55     2.00     2.03     0.52     0.55     0.97     0.95     0.56     0.97
f2(a2) ->                                   0.25     0.23     0.89     0.86     0.25     0.25     0.38     0.38     0.27     0.38
s:=f4() ->                                  1.33     2.13     2.36     2.92     1.31     2.11     1.66     2.17     1.48     1.67
s:=f5() ->                                  0.42     0.70     1.53     1.52     0.73     0.69     0.78     0.81     0.42     0.75
ascan(a,i%ARR_LEN) ->                       0.50     0.50     1.23     1.19     0.73     0.52     0.66     0.66     0.53     0.63
ascan(a2,c+chr(i%64+64)) ->                 1.64     2.06     3.58     3.77     1.64     2.05     2.03     2.27     1.72     2.08
ascan(a,{|x|x==i%ARR_LEN}) ->               4.45     4.91    16.75    13.84     4.45     4.91     6.44     6.61     4.30     6.45
=================================================================================================================================
total application time:                    32.52    44.42    94.39    92.02    36.98    41.98    47.95    49.63    32.52    46.00
total real time:                           32.64    44.61    94.70    93.53    37.14    42.16    48.09    49.70    32.88    46.27

Brgds,
Viktor

On 2008.09.23., at 15:33, Przemyslaw Czerpak wrote:

On Tue, 23 Sep 2008, Szakáts Viktor wrote:

Hi Viktor,

Many thanks for your tests.
Now the MT overhead is much smaller. I'll commit the last
modification with TLS buffering and this will be the
final version. We can probably still improve the speed
a little, but nothing very noticeable.
Now on my Linux box, with assembler macros for the counters,
native TLS support and TLS preloading, MT programs are ~11%
slower than ST ones in speedtst.prg.
Please check the difference for MSVC.

BTW, can we change memtst to write to stdout and
not require key presses, and have both tests
include their build-time switches in their output?
That would make testing easier.

Use the //build switch and, if necessary, redirect stderr to
stdout with //stderr:1.

best regards,
Przemek
_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour
