On 24 Jun 2011, at 19:16, Peter wrote:

> # [7] For X := 0  to 10000000 do
>    movl    $0,%eax
>    decl    %eax
>    .balign 4,0x90
> .Lj7:
>    incl    %eax
> # [9] A := A + X;
>    cvtsi2sdl    %eax,%xmm2
>    addsd    %xmm0,%xmm2
>    movsd    %xmm2,%xmm0
> # [10] A := A * B;
>    movsd    %xmm0,%xmm2
>    mulsd    %xmm1,%xmm2
>    movsd    %xmm2,%xmm0
>    cmpl    $10000000,%eax
>    jl    .Lj7
> # [14] end;
>    movsd    %xmm0,%xmm0
>    addq    $24,%rsp
>    ret
> 
> 
> I am wondering what is the point of all the xmm2 stuff

Variable A is a register variable. While evaluating "A + X" and "A * B", the 
code generator does not know that the final result will be stored back into A 
(nor that "A" won't be used again before the final result is written back), so 
it must make sure that A is not destroyed while performing these calculations.

For integer code on i386 (and to some extent on PowerPC), such inefficiencies 
are usually removed by the peephole optimizer. There's no peephole optimizer 
for x86-64 though (and none for SSE code, not even on i386). However, most of 
those register transfers are pretty much free (processors rename registers 
internally all the time, even if you don't use explicit register moves in 
your code), except that they increase the icache pressure somewhat.
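
To illustrate the kind of rewrite a peephole pass could perform on the loop 
body quoted above, here is a minimal sketch in Python. It is not FPC's actual 
optimizer: the function name fold_temp_moves and the tuple representation are 
made up for this example. It only handles the "mov a,t / op b,t / mov t,a" 
shape, and it simply assumes that the temporary register is dead afterwards, 
which a real pass would have to verify.

```python
from typing import List, Tuple

# Instructions are (opcode, source, destination) in AT&T operand order.
Instr = Tuple[str, str, str]

def fold_temp_moves(code: List[Instr]) -> List[Instr]:
    """Rewrite 'movsd a,t; op b,t; movsd t,a' into 'op b,a'.

    Assumption (not checked here): t is not read again afterwards.
    """
    out: List[Instr] = []
    i = 0
    while i < len(code):
        window = code[i:i + 3]
        if len(window) == 3:
            (op1, a, t), (op2, b, t2), (op3, t3, a2) = window
            if (op1 == "movsd" and op3 == "movsd"
                    and op2 in ("addsd", "mulsd")
                    and t == t2 == t3 and a == a2
                    and a != t and b != t):
                # The copy into t and the copy back are both unnecessary.
                out.append((op2, b, a))
                i += 3
                continue
        out.append(code[i])
        i += 1
    return out

# The "A := A * B" sequence from the listing above.
mul_seq = [
    ("movsd", "%xmm0", "%xmm2"),
    ("mulsd", "%xmm1", "%xmm2"),
    ("movsd", "%xmm2", "%xmm0"),
]

# The "A := A + X" sequence; it starts with a conversion, not a copy.
add_seq = [
    ("cvtsi2sdl", "%eax", "%xmm2"),
    ("addsd", "%xmm0", "%xmm2"),
    ("movsd", "%xmm2", "%xmm0"),
]

for op, src, dst in fold_temp_moves(mul_seq):
    print(f"{op}\t{src},{dst}")
```

For the multiplication this leaves a single "mulsd %xmm1,%xmm0". The addition 
sequence is returned unchanged, because it begins with cvtsi2sdl rather than 
a plain copy; it would need a separate pattern, which this sketch leaves out.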

> Also puzzled by the final
> movsd %xmm0,%xmm0
> What does this do?


It probably means that the register size of one xmm0 is not the same as that of 
the other inside the compiler (e.g., one may specifically represent a 64-bit 
double while the other may represent "the entire xmm register"), and the 
compiler will only remove transfers between registers of exactly the same size 
(since otherwise some conversion may be going on; this optimization is 
performed by generic code that has no clue about the specific meaning of 
"movsd"). It means that the size of either xmm0 register should be specified 
more precisely somewhere in the compiler.
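
A small sketch of that guess, in Python rather than compiler code: if the 
generic pass compares registers as (name, size) pairs, a move between two 
differently sized views of xmm0 is not recognised as a no-op. The Reg type, 
the sizes and is_redundant_move are hypothetical and only illustrate the 
idea; they are not FPC internals.

```python
from typing import NamedTuple

class Reg(NamedTuple):
    name: str       # e.g. "xmm0"
    size_bits: int  # size the compiler associates with this operand

def is_redundant_move(src: Reg, dst: Reg) -> bool:
    # Generic check: only a move between identical registers of identical
    # size is known to do nothing; a size mismatch might be a conversion.
    return src == dst

as_double = Reg("xmm0", 64)   # "a 64-bit double"
as_whole = Reg("xmm0", 128)   # "the entire xmm register"

print(is_redundant_move(as_double, as_double))
print(is_redundant_move(as_double, as_whole))
```

With matching sizes the move is dropped; with mismatched sizes it survives, 
which would explain the leftover "movsd %xmm0,%xmm0".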

> I would really like to be able to generate optimal (ie minimal) xmm code from 
> Pascal without dropping into assembler. Are there any other compiler switches 
> that would help?


No.


Jonas
_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
