On Thu, 13 Nov 2008, Szakáts Viktor wrote:

Hi Viktor,

> I've just peeked into the implementation and did some tests.
> New dynamic FM STAT when turned off is about 8% slower
> than fully turned off FM STAT. Quite huge overhead :(

Can you show me your tests? I cannot imagine such a big overhead
in this case. I also made some tests; I'm attaching two of them below.

The first test is done in pure C to maximize the difference. The
allocated block is immediately freed and its size is very small
(4 bytes) so that the smallest chunk is allocated. The overhead
maximized this way is 7.54%. It is not a realistic test but one where
even a single if is noticeable, because the program only does:

   while( --ulLoop )
      hb_xfree( hb_xgrab( 4 ) );

It means that in .prg code it has to be less than 1%.

In the second test I cannot see any difference that I can measure.
The difference is so small that code linked with FM stat which is not
activated is sometimes faster than the code without FM stat.

If you see a bigger difference then I suggest seriously checking why.
Some of the speed results you have sent in the past are really strange.
I would be very careful about using MSVC for such measurements. Windows
is also not a good OS for this because the same tests are hardly
repeatable.

> Looks like a few branches can make such big difference
> (similar to HB_NO_DEBUG).

Yes, branches can make a difference, but not as huge as in your test
if it was .prg code. During PCODE evaluation we already have many
conditional jumps, and a few new ones cannot make such a big difference.

> It means this isn't a costless feature, and thus cannot
> replace our current solutions.

No, it isn't. It will always cost something, but for code which also
does other things than hb_xfree( hb_xgrab( 4 ) ) in a loop the
difference has to be smaller.

> Do you see any chance to somehow lessen this overhead?
> (maybe using indirect pointers instead of branching?)

A few conditional instructions can be eliminated by inlining some
macros, but first I would like to see your results from the attached
tests and your test code so I can try it myself.

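To make the "branch vs. indirect pointer" question concrete, here is a minimal sketch in plain C. It is only an illustration under assumptions: all names (grab_branch, s_pGrab, stat_enable, run_loop) are hypothetical, malloc()/free() stand in for the real allocator, and it is not the code used in Harbour's FM module.

```c
#include <stdlib.h>

/* Hypothetical names throughout; this is NOT Harbour's FM code. */
static int s_fStatActive = 0;            /* run-time switch */
static unsigned long s_ulAllocs = 0;     /* counter updated only when active */

/* Variant 1: test a flag on every call (one extra conditional branch). */
static void * grab_branch( size_t nSize )
{
   if( s_fStatActive )
      ++s_ulAllocs;
   return malloc( nSize );
}

/* Variant 2: call through a pointer which is switched only once. */
static void * grab_plain( size_t nSize )
{
   return malloc( nSize );
}

static void * grab_counted( size_t nSize )
{
   ++s_ulAllocs;
   return malloc( nSize );
}

static void * ( * s_pGrab )( size_t ) = grab_plain;

/* Turn statistics on or off for both variants. */
static void stat_enable( int fOn )
{
   s_fStatActive = fOn;
   s_pGrab = fOn ? grab_counted : grab_plain;
}

/* Same shape as the loop in ALLOCTEST; returns the counter afterwards. */
static unsigned long run_loop( unsigned long ulLoop, int fIndirect )
{
   ++ulLoop;
   while( --ulLoop )
      free( fIndirect ? s_pGrab( 4 ) : grab_branch( 4 ) );
   return s_ulAllocs;
}
```

The second variant removes the test of the flag but replaces it with an indirect call which the compiler usually cannot inline, so it is not obvious that it is cheaper. It has to be measured in the same way as the branch version.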
Here are my results:

tst01
=====
NOFMSTAT:
   Startup loop to increase CPU clock...
   testing...
   memory hb_xfree( hb_xgrab( ... ) ): 6.76 sec.

FMSTAT (NOT ACTIVE):
   Startup loop to increase CPU clock...
   testing...
   memory hb_xfree( hb_xgrab( ... ) ): 7.27 sec.

FMSTAT (ACTIVE):
   Startup loop to increase CPU clock...
   testing...
   memory hb_xfree( hb_xgrab( ... ) ): 22.26 sec.

tst02
=====
NOFMSTAT:
   Startup loop to increase CPU clock...
   empty loop overhead 0.83 sec.
   memory hb_xfree( hb_xgrab( ... ) ): 1.64 sec.

FMSTAT (NOT ACTIVE):
   Startup loop to increase CPU clock...
   empty loop overhead 0.82 sec.
   memory hb_xfree( hb_xgrab( ... ) ): 1.65 sec.

FMSTAT (ACTIVE):
   Startup loop to increase CPU clock...
   empty loop overhead 0.83 sec.
   memory hb_xfree( hb_xgrab( ... ) ): 3.34 sec.

best regards,
Przemek

/***** tst01.prg *****/
#define N_LOOP 100000000
proc main()
   local t
   ? "Startup loop to increase CPU clock..."
   t := seconds() + 5; while seconds() < t; enddo
   ? "testing..."
   t := secondsCPU()
   AllocTest( N_LOOP )
   t := secondsCPU() - t
   ? "memory hb_xfree( hb_xgrab( ... ) ):", t, "sec."
return

request HB_GT_STD_DEFAULT

#pragma begindump
#include "hbapi.h"
HB_FUNC( ALLOCTEST )
{
   /* + 1 because the counter is decremented before it is tested */
   ULONG ulLoop = hb_parnl( 1 ) + 1;
   while( --ulLoop )
      hb_xfree( hb_xgrab( 4 ) );
}
#pragma enddump

/***** tst02.prg *****/
#define N_LOOP 10000000
proc main()
   local i, t, tn
   ? "Startup loop to increase CPU clock..."
   t := seconds() + 5; while seconds() < t; enddo
   /* time of the empty loop, subtracted from the result below */
   tn := secondsCPU()
   for i:=1 to N_LOOP
   next
   tn := secondsCPU() - tn
   ? "empty loop overhead", tn, "sec."
   t := secondsCPU()
   for i:=1 to N_LOOP
      AllocTest( 1 )
   next
   t := secondsCPU() - t
   ? "memory hb_xfree( hb_xgrab( ... ) ):", t - tn, "sec."
return

request HB_GT_STD_DEFAULT

#pragma begindump
#include "hbapi.h"
HB_FUNC( ALLOCTEST )
{
   /* + 1 because the counter is decremented before it is tested */
   ULONG ulLoop = hb_parnl( 1 ) + 1;
   while( --ulLoop )
      hb_xfree( hb_xgrab( 4 ) );
}
#pragma enddump
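
For reference, the percentages follow directly from the timings listed in the results. The helper below is only a sketch of that arithmetic; the name overhead_pct is hypothetical and is not part of the test programs.

```c
/* Relative overhead in percent of a measured time against a base time.
   Hypothetical helper, used only to redo the arithmetic from the results. */
static double overhead_pct( double dBase, double dMeasured )
{
   return ( dMeasured - dBase ) / dBase * 100.0;
}
```

With the first set of timings, overhead_pct( 6.76, 7.27 ) gives about 7.54, the figure quoted for FM stat linked but not active. With the timings of the run that also reports the empty loop overhead, overhead_pct( 1.64, 1.65 ) gives about 0.61, which is below what can be measured reliably with timings printed to two decimal places.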