On Fri, 27 Mar 2009, Angel Pais wrote:

Hi,

> On second machine (dual core) it doesn't gpf'd but results look very
> strange to me.

I do not find anything strange in the results.

> Pentium 4 3GZ 1GB RAM Dual Core
> 03/27/2009 16:55:40 Windows XP 05.02 Build 03790
> Xbase++ (R) Version 1.90 (MT)+
> THREADS: 2
> N_LOOPS: 1000000
>                                                         1 th.  2 th.  factor
> ============================================================================
> [ T001: x := L_C ]____________________________________   0.17   0.25 ->  0.68

Rather poor scalability. The factor is even smaller than 1.0.

> [ T002: x := L_N ]____________________________________   0.29   0.15 ->  1.93
> [ T003: x := L_D ]____________________________________   0.21   0.11 ->  1.91

Quite good. Nearly 2.0, so it was nicely executed on two CPUs simultaneously.

> [ T004: x := S_C ]____________________________________   0.34   0.24 ->  1.42
> [ T005: x := S_N ]____________________________________   0.17   0.19 ->  0.89
> [ T006: x := S_D ]____________________________________   0.29   0.27 ->  1.07
> [ T007: x := M->M_C ]_________________________________   0.66   0.46 ->  1.43
> [ T008: x := M->M_N ]_________________________________   0.44   0.61 ->  0.72
> [ T009: x := M->M_D ]_________________________________   0.50   0.36 ->  1.39
> [ T010: x := M->P_C ]_________________________________   0.61   0.89 ->  0.69
> [ T011: x := M->P_N ]_________________________________   0.42   0.39 ->  1.08
> [ T012: x := M->P_D ]_________________________________   0.83   0.92 ->  0.90
> [ T013: x := F_C ]____________________________________   0.92   0.85 ->  1.08
> [ T014: x := F_N ]____________________________________   0.89   0.61 ->  1.46
> [ T015: x := F_D ]____________________________________   0.89   0.72 ->  1.24
> [ T016: x := o:Args ]_________________________________   0.45   0.70 ->  0.64
> [ T017: x := o[2] ]___________________________________   0.27   0.22 ->  1.23
> [ T018: round( i / 1000, 2 ) ]________________________   4.88   5.51 ->  0.89
> [ T019: str( i / 1000 ) ]_____________________________  33.63  34.22 ->  0.98
> [ T020: val( s ) ]____________________________________   0.97   1.36 ->  0.71
> [ T021: val( a [ i % 16 + 1 ] ) ]_____________________   3.14   3.42 ->  0.92
> [ T022: dtos( d - i % 10000 ) ]_______________________   3.83   3.81 ->  1.01
> [ T023: eval( { || i % 16 } ) ]_______________________   3.86   4.56 ->  0.85
> [ T024: eval( bc := { || i % 16 } ) ]_________________   2.19   2.24 ->  0.98
> [ T025: eval( { |x| x % 16 }, i ) ]___________________   1.69   1.80 ->  0.94
> [ T026: eval( bc := { |x| x % 16 }, i ) ]_____________   1.61   1.57 ->  1.03
> [ T027: eval( { |x| f1( x ) }, i ) ]__________________   2.24   2.25 ->  1.00
> [ T028: eval( bc := { |x| f1( x ) }, i ) ]____________   2.00   1.94 ->  1.03
> [ T029: eval( bc := &("{ |x| f1( x ) }"), i ) ]_______   2.87   2.78 ->  1.03
> [ T030: x := &( 'f1(' + str(i) + ')' ) ]______________  82.99  81.23 ->  1.02
> [ T031: bc := &( '{|x|f1(x)}' ), eval( bc, i ) ]______  78.28  76.13 ->  1.03
> [ T032: x := valtype( x ) + valtype( i ) ]___________   1.81   1.69 ->  1.07
> [ T033: x := strzero( i % 100, 2 ) $ a[ i % 16 + 1 ] ]  36.16  36.09 ->  1.00
> [ T034: x := a[ i % 16 + 1 ] == s ]___________________   1.28   1.22 ->  1.05
> [ T035: x := a[ i % 16 + 1 ] = s ]____________________   2.00   1.88 ->  1.06
> [ T036: x := a[ i % 16 + 1 ] >= s ]___________________   1.89   1.92 ->  0.98
> [ T037: x := a[ i % 16 + 1 ] <= s ]___________________   2.17   1.94 ->  1.12
> [ T038: x := a[ i % 16 + 1 ] < s ]____________________   1.97   1.90 ->  1.04
> [ T039: x := a[ i % 16 + 1 ] > s ]____________________   2.00   1.91 ->  1.05
> [ T040: ascan( a, i % 16 ) ]__________________________   3.59   3.13 ->  1.15
> [ T041: ascan( a, { |x| x == i % 16 } ) ]_____________  22.44  64.61 ->  0.35
> [ T042: iif( i%1000==0, a:={}, ), aadd(a,{i,1,.t.,s, ]  27.62  26.45 ->  1.04
> [ T043: x := a ]______________________________________   0.21   0.25 ->  0.84
> [ T044: x := {} ]_____________________________________   2.26   2.08 ->  1.09
> [ T045: f0() ]________________________________________   0.72   0.44 ->  1.64
> [ T046: f1( i ) ]_____________________________________   1.01   0.64 ->  1.58
> [ T047: f2( c[1...8] ) ]______________________________   0.63   0.95 ->  0.66
> [ T048: f2( c[1...40000] ) ]__________________________   0.67   0.78 ->  0.86
> [ T049: f2( @c[1...40000] ) ]_________________________   0.68   0.47 ->  1.45
> [ T050: f2( @c[1...40000] ), c2 := c ]________________   1.09   0.83 ->  1.31
> [ T051: f3( a, a2, s, i, s2, bc, i, n, x ) ]__________   2.12   1.71 ->  1.24
> [ T052: f2( a ) ]_____________________________________   0.57   1.16 ->  0.49
> [ T053: x := f4() ]___________________________________   4.12   3.90 ->  1.06
> [ T054: x := f5() ]___________________________________   2.07   1.86 ->  1.11
> [ T055: x := space(16) ]______________________________   4.78   1.49 ->  3.21

This one is really strange, but such things can happen and they are
usually the result of some other event, e.g. something external to the
above program was suddenly executed during the first part of the test
and increased its time, or the internal GC was activated. Repeating the
test will probably change the results and this 3.21 factor.

> [ T056: f_prv( c ) ]__________________________________   2.27   3.03 ->  0.75
> ============================================================================
> [ TOTAL ]_________________________________________     358.66 393.09 ->  0.91
> ============================================================================
> [ total application time: ]...................................751.90
> [ total real time: ]..........................................751.90

The average factor is 0.91. The ideal value is 2.00 on a two-CPU machine
when 2 or more threads are used. This is the real cost of serialization:
it reduces scalability. Please note that the results depend on what is
executed. Some things scale quite well and others do not.
For sure the memory allocator used by xbase++ looks much better than
DLMALLOC, which in real MT mode (real simultaneous execution) usually
gives fatal results, and it's much more efficient to compile Windows
Harbour builds with -DHB_FM_WIN_ALLOC.

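As a quick cross-check of how to read the factor column: judging from the numbers above, it appears to be the 1-thread time divided by the 2-thread time, rounded to two places. The sketch below recomputes it for a few rows copied from the quoted dual-core table. The function names `factor` and `classify` and the wording of the labels are hypothetical and are not part of the speed test itself.

```python
# Rows copied from the quoted dual-core results:
# (test id, 1-thread seconds, 2-thread seconds).
ROWS = [
    ("T001", 0.17, 0.25),
    ("T002", 0.29, 0.15),
    ("T041", 22.44, 64.61),
    ("T055", 4.78, 1.49),
    ("TOTAL", 358.66, 393.09),
]


def factor(t_single, t_multi):
    """Assumed definition: 1-thread time / N-thread time, rounded to 2 places."""
    return round(t_single / t_multi, 2)


def classify(f, threads=2):
    """Hypothetical labels for reading a factor on a machine with `threads` CPUs."""
    if f > threads:
        return "above ideal, probably measurement noise"
    if f < 1.0:
        return "slower with threads"
    return "some gain from threads"


if __name__ == "__main__":
    for name, t1, t2 in ROWS:
        f = factor(t1, t2)
        print(f"{name:>5}: {t1:7.2f} {t2:7.2f} -> {f:4.2f}  ({classify(f)})")
```

Read this way, a factor above the number of CPUs (T055) cannot come from threading alone, which is why repeating the test is the sensible next step.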
> CPU CELEREON 560 2.13 GB 1GB RAM
> 03/27/2009 15:49:36 Windows XP 05.01 Build 02600 Service Pack 2
> Xbase++ (R) Version 1.90 (MT)+
> THREADS: 2
> N_LOOPS: 1000000
>                                                         1 th.  2 th.  factor
> ============================================================================
> [ T001: x := L_C ]____________________________________   0.08   0.08 ->  1.00
> [ T002: x := L_N ]____________________________________   0.06   0.05 ->  1.20
> [ T003: x := L_D ]____________________________________   0.04   0.05 ->  0.80
> [ T004: x := S_C ]____________________________________   0.16   0.15 ->  1.07
> [ T005: x := S_N ]____________________________________   0.11   0.14 ->  0.79
> [ T006: x := S_D ]____________________________________   0.13   0.12 ->  1.08
> [ T007: x := M->M_C ]_________________________________   0.32   0.29 ->  1.10
> [ T008: x := M->M_N ]_________________________________   0.25   0.27 ->  0.93
> [ T009: x := M->M_D ]_________________________________   0.23   0.25 ->  0.92
> [ T010: x := M->P_C ]_________________________________   0.33   0.34 ->  0.97
> [ T011: x := M->P_N ]_________________________________   0.30   0.30 ->  1.00
> [ T012: x := M->P_D ]_________________________________   0.31   0.33 ->  0.94
> [ T013: x := F_C ]____________________________________   0.48   0.47 ->  1.02
> [ T014: x := F_N ]____________________________________   0.42   0.41 ->  1.02
> [ T015: x := F_D ]____________________________________   0.41   0.39 ->  1.05
> [ T016: x := o:Args ]_________________________________   0.29   0.29 ->  1.00
> [ T017: x := o[2] ]___________________________________   0.10   0.10 ->  1.00
> [ T018: round( i / 1000, 2 ) ]________________________   0.87   0.86 ->  1.01
> [ T019: str( i / 1000 ) ]_____________________________   4.10   4.14 ->  0.99
> [ T020: val( s ) ]____________________________________   0.58   0.59 ->  0.98
> [ T021: val( a [ i % 16 + 1 ] ) ]_____________________   0.77   0.75 ->  1.03
> [ T022: dtos( d - i % 10000 ) ]_______________________   1.03   1.03 ->  1.00
> [ T023: eval( { || i % 16 } ) ]_______________________   1.86   1.84 ->  1.01
> [ T024: eval( bc := { || i % 16 } ) ]_________________   1.19   1.05 ->  1.13
> [ T025: eval( { |x| x % 16 }, i ) ]___________________   0.85   0.87 ->  0.98
> [ T026: eval( bc := { |x| x % 16 }, i ) ]_____________   0.70   0.71 ->  0.99
> [ T027: eval( { |x| f1( x ) }, i ) ]__________________   1.07   1.05 ->  1.02
> [ T028: eval( bc := { |x| f1( x ) }, i ) ]____________   0.93   0.91 ->  1.02
> [ T029: eval( bc := &("{ |x| f1( x ) }"), i ) ]_______   1.42   1.36 ->  1.04
> [ T030: x := &( 'f1(' + str(i) + ')' ) ]______________  18.22  18.38 ->  0.99
> [ T031: bc := &( '{|x|f1(x)}' ), eval( bc, i ) ]______  27.46  27.57 ->  1.00
> [ T032: x := valtype( x ) + valtype( i ) ]___________   0.61   0.57 ->  1.07
> [ T033: x := strzero( i % 100, 2 ) $ a[ i % 16 + 1 ] ]   3.71   3.69 ->  1.01
> [ T034: x := a[ i % 16 + 1 ] == s ]___________________   0.49   0.46 ->  1.07
> [ T035: x := a[ i % 16 + 1 ] = s ]____________________   0.66   0.67 ->  0.99
> [ T036: x := a[ i % 16 + 1 ] >= s ]___________________   0.67   0.69 ->  0.97
> [ T037: x := a[ i % 16 + 1 ] <= s ]___________________   0.68   0.69 ->  0.99
> [ T038: x := a[ i % 16 + 1 ] < s ]____________________   0.69   0.69 ->  1.00
> [ T039: x := a[ i % 16 + 1 ] > s ]____________________   0.70   0.69 ->  1.01
> [ T040: ascan( a, i % 16 ) ]__________________________   1.34   1.28 ->  1.05
> [ T041: ascan( a, { |x| x == i % 16 } ) ]_____________   8.60   8.53 ->  1.01
> [ T042: iif( i%1000==0, a:={}, ), aadd(a,{i,1,.t.,s, ]  13.12  13.38 ->  0.98
> [ T043: x := a ]______________________________________   0.07   0.06 ->  1.17

You can try passing --exclude=044 as an additional parameter to finish
the test.
Here the results are close to 1.0, which is expected for a single-CPU
machine. But I find one thing very interesting: the times are much
better than in the first test, though the computer seems to be slower.
Am I right?
If so, it would suggest that xbase++ does not enable some internal MT
logic in all cases and tries to use a faster VM module/synchronization
mechanism when it detects a single-CPU machine and/or the process does
not create new threads. It would be interesting to find out exactly when
such faster internal logic is used. It's possible that the GPF problem
in T044 can be triggered only when this faster but not really MT-safe
module is enabled. It looks like some type of runtime/startup VM
switching.

Thank you very much for your tests.
What are the Harbour results on the same computers?

best regards,
Przemek
_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour