https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79593
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uros at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That said, the reason why there is fld1 followed by fld %st(0) is that 1.0 is
used multiple times:
(insn 41 64 42 8 (set (reg:SF 114)
        (mem/u/c:SF (symbol_ref/u:SI ("*.LC1") [flags 0x2]) [4 S4 A32])) "pr79593.c":17 125 {*movsf_internal}
     (expr_list:REG_EQUAL (const_double:SF 1.0e+0 [0x0.8p+1])
        (nil)))
(insn 42 41 43 8 (set (reg:XF 118 [ delta ])
        (float_extend:XF (reg:SF 114))) "pr79593.c":17 153 {*extendsfxf2_i387}
     (expr_list:REG_EQUAL (const_double:XF 1.0e+0 [0x0.8p+1])
        (nil)))
...
(insn 69 65 47 9 (set (reg:XF 110 [ delta ])
        (float_extend:XF (reg:SF 114))) "pr79593.c":17 153 {*extendsfxf2_i387}
     (expr_list:REG_DEAD (reg:SF 114)
        (expr_list:REG_EQUAL (const_double:XF 1.0e+0 [0x0.8p+1])
            (nil))))
in multiple basic blocks with conditional jump in between, so the combiner
doesn't combine it into (set (reg:XF ...)) (const_double:XF 1.0e+0).
Still in *.peephole2 we have:
(insn 82 64 42 8 (set (reg:SF 10 st(2) [114])
        (const_double:SF 1.0e+0 [0x0.8p+1])) "pr79593.c":17 125 {*movsf_internal}
     (expr_list:REG_EQUAL (const_double:SF 1.0e+0 [0x0.8p+1])
        (nil)))
(insn 42 82 83 8 (set (reg:XF 9 st(1) [orig:118 delta ] [118])
        (float_extend:XF (reg:SF 10 st(2) [114]))) "pr79593.c":17 153 {*extendsfxf2_i387}
     (expr_list:REG_EQUIV (const_double:XF 1.0e+0 [0x0.8p+1])
        (nil)))
...
(insn 69 65 47 9 (set (reg:XF 8 st [orig:110 delta ] [110])
        (float_extend:XF (reg:SF 10 st(2) [114]))) "pr79593.c":17 153 {*extendsfxf2_i387}
     (expr_list:REG_DEAD (reg:SF 10 st(2) [114])
        (expr_list:REG_EQUAL (const_double:XF 1.0e+0 [0x0.8p+1])
            (nil))))
It is only the regstack pass that optimizes those 2 into 1, but that isn't able
to peephole or otherwise combine:
(insn:TI 82 64 42 7 (set (reg:SF 8 st)
        (const_double:SF 1.0e+0 [0x0.8p+1])) "pr79593.c":17 125 {*movsf_internal}
     (expr_list:REG_EQUAL (const_double:SF 1.0e+0 [0x0.8p+1])
        (nil)))
(insn:TI 42 82 83 7 (set (reg:XF 8 st)
        (float_extend:XF (reg:SF 8 st))) "pr79593.c":17 153 {*extendsfxf2_i387}
     (expr_list:REG_EQUIV (const_double:XF 1.0e+0 [0x0.8p+1])
        (nil)))
and there is no peephole2 pass afterwards, so either regstack itself would need
to do this, or the machine reorg pass.

Still no idea why this is considered a regression, I get with gcc 5.4.1
20160721
        subl    $12, %esp
        fldz
        movl    16(%esp), %edx
        movl    20(%esp), %eax
        cmpl    %eax, (%edx)
        jbe     .L2
        flds    global_data
        flds    global_data+4
        fxch    %st(2)
        fcomp   %st(1)
        fnstsw  %ax
        sahf
        ja      .L13
        fxch    %st(1)
        fsubrs  4(%edx)
.L5:
        fdivp   %st, %st(1)
        ftst
        fnstsw  %ax
        sahf
        jnb     .L6
        fstp    %st(0)
        fldz
.L6:
        fld1
        fld     %st(0)
        fcomp   %st(2)
        fnstsw  %ax
        sahf
        jnb     .L14
        fstp    %st(1)
        jmp     .L7
        .p2align 4,,10
        .p2align 3
.L14:
        fstp    %st(0)
.L7:
.L2:
        addl    $12, %esp
        ret