Pianoman wrote:

>     Hello everyone, I have got interesting results from my test of the SSE2
> switch.
> I tested a small code fragment with and without the SSE2 switch and I saw no
> difference.
> Why is this?
> I used FPC 2.0 on Win32 to compile it with the best optimization.
> The code fragment follows:
> {$fputype sse2}
> program divide;
> uses windows;
> var time1,time2:dword;
> a,b,c:double;
> i,j:dword;
> begin
> a:=123456789.1234567;
> b:=1.987654587;
> for i:=1 to 3 do
> begin
> time1:=gettickcount;
> for j:=1 to 100000000 do c:=a/b;

Division is an expensive operation, and the x87 FPU and the SSE unit share
the same division execution unit, so why should it be faster?

The generated code does use SSE2, of course:

# [13] for j:=1 to 100000000 do c:=a/b;
        movl    $1,%esi
        decl    %esi
        .balign 4
.L17:
        incl    %esi
        movsd   U_P$DIVIDE_A,%xmm0
        divsd   U_P$DIVIDE_B,%xmm0
        movsd   %xmm0,U_P$DIVIDE_C
        cmpl    $100000000,%esi
        jb      .L17
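
To check whether the divide itself dominates the loop, one option is to time
the same loop with a multiplication next to it. The following is only a
sketch, not a careful benchmark: the program name, the Iterations constant
and the use of Now/MilliSecondsBetween for timing are my own choices, not
part of the original test. Build it once with and once without the
{$fputype sse2} directive and compare the numbers.

```pascal
{$mode objfpc}
program divbench;  // hypothetical name, not from the original post

uses
  SysUtils, DateUtils;

const
  Iterations = 100000000;  // same iteration count as the original loop

var
  a, b, c: double;
  j: longword;
  t0: TDateTime;
begin
  a := 123456789.1234567;
  b := 1.987654587;
  c := 0;

  // Division loop, as in the original test.
  t0 := Now;
  for j := 1 to Iterations do
    c := a / b;
  WriteLn('div: ', MilliSecondsBetween(Now, t0), ' ms (c = ', c:0:6, ')');

  // Same loop with a multiplication instead, for comparison.
  t0 := Now;
  for j := 1 to Iterations do
    c := a * b;
  WriteLn('mul: ', MilliSecondsBetween(Now, t0), ' ms (c = ', c:0:6, ')');
end.
```

If the division loop takes about the same time in both builds while the
multiplication loop is clearly quicker than the division loop, that would
fit the explanation above. Note that Now has a coarse resolution, so only
large differences are meaningful.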



_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
