The testcase:

int test (float a)
{
  return (a*a);
}

produces non-optimal asm code, when compiled with '-O2 -msse -mfpmath=387':

test:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $4, %esp
        flds    8(%ebp)
        fmul    %st(0), %st
        fstps   -4(%ebp)                (*)
        movss   -4(%ebp), %xmm0         (*)
        cvttss2si       %xmm0, %eax
        leave
        ret
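
A small check of the testcase's result may be useful when comparing code generation before and after any fix. The following is only a sketch: the helper `run_checks` and the chosen inputs are hypothetical and not part of the original report. The inputs are picked so that their squares are exactly representable, so the result should not depend on whether the product is computed on the x87 or the SSE unit. The assembly in the comment is the sequence one would hope for, not the output of any actual compiler run.

```c
#include <stdio.h>

/* Testcase from the report: the float product is converted to int,
   which in C truncates toward zero. */
int test (float a)
{
  return (a*a);
}

/* Hoped-for code (an assumption, not compiler output): cvttss2si reads
   the spilled value straight from the stack slot, with no movss:

        fstps   -4(%ebp)
        cvttss2si       -4(%ebp), %eax
*/

/* Hypothetical helper: returns the number of mismatches. */
int run_checks (void)
{
  static const struct { float in; int want; } cases[] = {
    { 2.5f, 6 },        /* 6.25 truncates to 6 */
    { -1.5f, 2 },       /* 2.25 truncates to 2 */
    { 0.5f, 0 },        /* 0.25 truncates to 0 */
    { 100.0f, 10000 },  /* exact integer result */
  };
  int failures = 0;
  unsigned i;

  for (i = 0; i < sizeof cases / sizeof cases[0]; i++)
    {
      int got = test (cases[i].in);
      if (got != cases[i].want)
        {
          printf ("test(%g) = %d, expected %d\n",
                  cases[i].in, got, cases[i].want);
          failures++;
        }
    }
  return failures;
}
```

Calling `run_checks ()` from a `main` and comparing its return value against 0, with the file built using '-O2 -msse -mfpmath=387', should give the same answer whether or not the extra movss is emitted.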

The cvttss2si insn can take a memory input operand, so the movss to %xmm0 is
not needed. The following RTL sequence is generated in ._.23.greg:

(insn 31 11 32 0 (set (mem:SF (plus:SI (reg/f:SI 6 bp)
                (const_int -4 [0xfffffffc])) [0 S4 A8])
        (reg/v:SF 8 st [orig:59 a ] [59])) 61 {*movsf_1} (nil)
    (nil))

(insn 32 31 12 0 (set (reg:SF 21 xmm0)
        (mem:SF (plus:SI (reg/f:SI 6 bp)
                (const_int -4 [0xfffffffc])) [0 S4 A8])) 61 {*movsf_1} (nil)
    (nil))

(insn:HI 12 32 16 0 (set (reg:SI 0 ax [60])
        (fix:SI (reg:SF 21 xmm0))) 112 {fix_truncsfsi_sse} (insn_list:REG_DEP_TRUE 11 (nil))
    (nil))

It looks like gcc does not take into account the fact that cvttss2si can use
a memory input operand.

--
           Summary: Non-optimal code with cvttss2si
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uros at kss-loka dot si
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-linux-gnu
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu