https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105453

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-05-03
      Known to work|                            |12.0
           Keywords|                            |missed-optimization,
                   |                            |needs-bisection
             Status|UNCONFIRMED                 |NEW

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
With gcc 12 we get

_Z6func_1v:
.LFB0:
        .cfi_startproc
        movl    g_6(%rip), %eax
        movl    %eax, g_10(%rip)
        cmpl    $0, g_6+4(%rip)
        movl    $1, %edx
        cmove   %edx, %eax
        ret

the difference is that we somehow un-CSE the g_6 load at RTL expansion with GCC
11 and then things go downhill later:

;; Generating RTL for gimple basic block 2

;; _1 = g_6[0];

(insn 6 5 7 (set (reg/f:DI 84)
        (symbol_ref:DI ("g_6") [flags 0x2]  <var_decl 0x7ffff7ff3bd0 g_6>)) "t.c":7:20 -1
     (nil))

(insn 7 6 0 (set (reg:SI 83 [ <retval> ])
        (mem/c:SI (reg/f:DI 84) [1 g_6[0]+0 S4 A64])) "t.c":7:20 -1
     (nil))

;; g_10 = _1;

(insn 8 7 0 (set (mem/c:SI (symbol_ref:DI ("g_10") [flags 0x2]  <var_decl 0x7ffff7ff3c60 g_10>) [1 g_10+0 S4 A32])
        (reg:SI 83 [ <retval> ])) "t.c":7:12 -1
     (nil))

;; if (_2 != 0)

(insn 9 8 10 (set (reg/f:DI 85)
        (symbol_ref:DI ("g_6") [flags 0x2]  <var_decl 0x7ffff7ff3bd0 g_6>)) "t.c":8:14 -1
     (nil))

(insn 10 9 11 (set (reg:CCZ 17 flags)
        (compare:CCZ (mem/c:SI (plus:DI (reg/f:DI 85)
                    (const_int 4 [0x4])) [1 g_6[1]+0 S4 A32])
            (const_int 0 [0]))) "t.c":8:5 -1
     (nil))


the same happens with GCC 12.  CSE cleans this up, so it's maybe not important,
but in GCC 11 forwprop then does

     4: NOTE_INSN_BASIC_BLOCK 2
     2: NOTE_INSN_FUNCTION_BEG
-    6: r84:DI=`g_6'
-    7: r83:SI=[r84:DI]
-      REG_DEAD r84:DI
+    7: r83:SI=[`g_6']
     8: [`g_10']=r83:SI
-    9: r85:DI=`g_6'
-   10: flags:CCZ=cmp([r84:DI+0x4],0)
-      REG_DEAD r85:DI
+   10: flags:CCZ=cmp([const(`g_6'+0x4)],0)

which seems to confuse CE enough to emit the extra load:

     4: NOTE_INSN_BASIC_BLOCK 2
     2: NOTE_INSN_FUNCTION_BEG
     7: r83:SI=[`g_6']
     8: [`g_10']=r83:SI
-   10: flags:CCZ=cmp([const(`g_6'+0x4)],0)
-   11: pc={(flags:CCZ==0)?L24:pc}
-      REG_DEAD flags:CCZ
-      REG_BR_PROB 708669604
-      ; pc falls through to BB 4
-   24: L24:
-   23: NOTE_INSN_BASIC_BLOCK 3
-    3: r83:SI=0x1
-   17: L17:
-   20: NOTE_INSN_BASIC_BLOCK 4
+   25: r88:SI=[`g_6']
+   26: r87:SI=0x1
+   27: flags:CCZ=cmp([const(`g_6'+0x4)],0)
+   28: r83:SI={(flags:CCZ==0)?r87:SI:r88:SI}
    18: ax:SI=r83:SI

disabling fwprop1 produces

+    6: r84:DI=`g_6'
+    7: r83:SI=[r84:DI]
+      REG_DEAD r84:DI
+    8: [`g_10']=r83:SI
+   26: r87:SI=0x1
+   25: flags:CCZ=cmp([r84:DI+0x4],0)
+   27: r83:SI={(flags:CCZ!=0)?r83:SI:r87:SI}
    18: ax:SI=r83:SI

again.  Likely the new SSA based forwprop "fixed" this, but maybe only
accidentally.  On trunk fwprop1 does

-    6: r84:DI=`g_6'
-    7: r83:SI=[r84:DI]
-      REG_DEAD r84:DI
+    7: r83:SI=[`g_6']
     8: [`g_10']=r83:SI
-    9: r85:DI=`g_6'
-   10: flags:CCZ=cmp([r84:DI+0x4],0)
-      REG_DEAD r85:DI
+   10: flags:CCZ=cmp([const(`g_6'+0x4)],0)
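
For readers following the dumps without the testcase at hand, here is a minimal
C sketch of a function with the same shape.  It is reconstructed only from the
GIMPLE comments in the expand dump (_1 = g_6[0]; g_10 = _1; if (_2 != 0)) and
from the GCC 12 assembly; it is not the testcase attached to the PR.  The
element types, the array length and the name of the local are assumptions.

```c
/* Hypothetical reconstruction, not the PR's testcase.
   Types and the array length are assumed; g_6 needs at least two
   elements because the compare reads g_6+4.  */
int g_6[2];
int g_10;

int func_1(void)
{
    int ret = g_6[0];   /* _1 = g_6[0];                        */
    g_10 = ret;         /* g_10 = _1;                          */
    if (g_6[1] == 0)    /* the branch on _2, assumed g_6[1]    */
        ret = 1;        /* insn 3: r83 = 1 on the zero path    */
    return ret;         /* otherwise the already loaded g_6[0] */
}
```

The ideal code loads g_6[0] once and reuses it for both the store to g_10 and
the conditional move, which is what the GCC 12 assembly does.  Whether this
sketch reproduces the extra load with GCC 11 has not been checked.  Dumps of
the kind quoted here can be requested with -fdump-rtl-expand,
-fdump-rtl-fwprop1 and -fdump-rtl-ce1, and -fdisable-rtl-fwprop1 turns off the
fwprop1 pass for comparison.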
