Pianoman wrote:
> Hello everyone, I have got interesting results from my test of the SSE2
> switch.
> I tested a small code fragment with and without the SSE2 switch and I saw no
> difference.
> Why is this?
> I used FPC 2.0 on Win32 to compile it with the best optimization.
> Code fragment follows:
> {$fputype sse2}
> program divide;
> uses windows;
> var time1,time2:dword;
> a,b,c:double;
> i,j:dword;
> begin
> a:=123456789.1234567;
> b:=1.987654587;
> for i:=1 to 3 do
> begin
> time1:=gettickcount;
> for j:=1 to 100000000 do c:=a/b;

Division is an expensive operation, and the x87 FPU and SSE share the
division execution unit, so why should it be faster? The code does use
SSE2, of course:

# [13] for j:=1 to 100000000 do c:=a/b;
	movl	$1,%esi
	decl	%esi
	.balign 4
.L17:
	incl	%esi
	movsd	U_P$DIVIDE_A,%xmm0
	divsd	U_P$DIVIDE_B,%xmm0
	movsd	%xmm0,U_P$DIVIDE_C
	cmpl	$100000000,%esi
	jb	.L17
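
One way to check this yourself is to time a cheaper operation next to the division. The program below is only a rough sketch and not the original test program: the program name, the Iterations constant and the multiply loop are my own additions for illustration. It assumes Win32 and GetTickCount from the windows unit, as in the quoted fragment. Build it once with the {$fputype sse2} line and once without it, then compare the numbers.

```pascal
{$fputype sse2}   { remove this line for the build without SSE2 }
program divtime;  { hypothetical name, not the original test program }

uses
  windows;

const
  Iterations = 100000000;  { same loop count as in the quoted fragment }

var
  a, b, c: double;
  j: dword;
  t0, t1: dword;

begin
  a := 123456789.1234567;
  b := 1.987654587;

  { Division: one divide per iteration, as in the loop shown above. }
  t0 := GetTickCount;
  for j := 1 to Iterations do
    c := a / b;
  t1 := GetTickCount;
  writeln('divide:   ', t1 - t0, ' ms');

  { Multiplication: same loop shape with a cheaper operation, for comparison. }
  t0 := GetTickCount;
  for j := 1 to Iterations do
    c := a * b;
  t1 := GetTickCount;
  writeln('multiply: ', t1 - t0, ' ms');
end.
```

If the divide line comes out about the same in both builds while being clearly slower than the multiply line, the loop is limited by the divide itself and not by which instruction set issues it. GetTickCount is coarse, so only treat large differences as meaningful.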