Hi,

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/rk/gcc/libexec/gcc/x86_64-unknown-linux-gnu/4.5.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure --prefix=/home/rk/gcc --enable-languages=c,c++
Thread model: posix
gcc version 4.5.0 20100112 (experimental) (GCC)

$ cat test.c
_Decimal64 foo(_Decimal64 a,_Decimal64 b)
{
  return a+b;
}

$ gcc -c -O3 -save-temps test.c
$ cat test.s
...
foo:
        subq    $24, %rsp
        movq    %xmm0, 8(%rsp)
        call    __bid_adddd3
        movq    %xmm0, 8(%rsp)
        addq    $24, %rsp
        ret
...

$ gcc -c -O3 -funsafe-math-optimizations -save-temps test.c
$ cat test.s
...
foo:
        subq    $24, %rsp
        movq    %xmm0, 8(%rsp)
        movq    %xmm1, 8(%rsp)
        movq    8(%rsp), %rcx
        movdqa  %xmm0, %xmm1
        movq    %rcx, 8(%rsp)
        movq    8(%rsp), %xmm0
        call    __bid_adddd3
        movq    %xmm0, 8(%rsp)
        addq    $24, %rsp
        ret
...

Is there a good reason to store anything on the stack here? And why does -funsafe-math-optimizations (which is part of -ffast-math) make things even worse? All the extra code does is swap the two arguments to __bid_adddd3().

Thanks :)