Hi!

The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
We have from earlier combinations
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
        (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
     (nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
        (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
     (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
        (nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (parallel [
            (set (reg:CCZ 17 flags)
                (compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
                    (const_int 0 [0])))
            (set (reg/v:SI 116 [ e ])
                (neg:SI (reg:SI 104 [ _7 ])))
        ]) "pr119291.c":26:13 977 {*negsi_2}
     (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
        (nil)))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
        (ne:DI (reg:CCZ 17 flags)
            (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (nil)))
and try_combine is called on i3 25 and i2 22 (second time)
and reach the hunk being patched with simplified i3
(insn 25 24 26 4 (parallel [
            (set (pc)
                (pc))
            (set (reg/v:SI 116 [ e ])
                (const_int 0 [0]))
        ]) "pr119291.c":28:13 977 {*negsi_2}
     (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
        (nil)))
and
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
        (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
     (nil))
Now, the try_combine code there attempts to split two independent
sets in newpat by moving one of them to i2.
And among other tests it checks
!modified_between_p (SET_DEST (set1), i2, i3)
which is certainly needed, if there would be say
(set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
in between i2 and i3, we couldn't do that, as that set would overwrite
the value set by set1 we want to move to the i2 position.
But in this case pseudo 116 isn't set in between i2 and i3, but used
(and additionally there is a REG_DEAD note for it).

This is equally bad for the move, because while the i3 insn
and later will see the pseudo value that we set, the insn in between
which uses the value will see a different value from the one that
it should see.

As we don't check for that, in the end try_combine succeeds and
changes the IL to:
(insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
        (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
     (nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
        (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
     (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
        (nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (set (pc)
        (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
     (nil))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
        (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
     (nil))
(note, the i3 got turned into a nop and try_combine also modified insn 27).

The following patch replaces the modified_between_p
tests with reg_used_between_p, my understanding is that
modified_between_p is a subset of reg_used_between_p, so one
doesn't need both.

Bootstrapped/regtested on x86_64-linux, i686-linux, aarch64-linux,
powerpc64le-linux and s390x-linux, ok for trunk?

Looking at this some more today, I think we should special case
set_noop_p because that can be put into i2 (except for the JUMP_P
violations), currently both modified_between_p (pc_rtx, i2, i3)
and reg_used_between_p (pc_rtx, i2, i3) returns false.
I'll post a patch incrementally for that (but that feels like
new optimization, so probably not something that should be backported).

2025-03-28  Jakub Jelinek  <ja...@redhat.com>

        PR rtl-optimization/119291
        * combine.cc (try_combine): For splitting of PARALLEL with
        2 independent SETs into i2 and i3 sets check reg_used_between_p
        of the SET_DESTs rather than just modified_between_p.

        * gcc.c-torture/execute/pr119291.c: New test.

--- gcc/combine.cc.jj   2025-03-25 09:34:33.469102343 +0100
+++ gcc/combine.cc      2025-03-27 09:50:15.768567383 +0100
@@ -4012,18 +4012,18 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
       rtx set1 = XVECEXP (newpat, 0, 1);
 
       /* Normally, it doesn't matter which of the two is done first, but
-        one which uses any regs/memory set in between i2 and i3 can't
-        be first.  The PARALLEL might also have been pre-existing in i3,
-        so we need to make sure that we won't wrongly hoist a SET to i2
-        that would conflict with a death note present in there, or would
-        have its dest modified between i2 and i3.  */
+        one which uses any regs/memory set or used in between i2 and i3
+        can't be first.  The PARALLEL might also have been pre-existing
+        in i3, so we need to make sure that we won't wrongly hoist a SET
+        to i2 that would conflict with a death note present in there, or
+        would have its dest modified or used between i2 and i3.  */
       if (!modified_between_p (SET_SRC (set1), i2, i3)
          && !(REG_P (SET_DEST (set1))
               && find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
          && !(GET_CODE (SET_DEST (set1)) == SUBREG
               && find_reg_note (i2, REG_DEAD,
                                 SUBREG_REG (SET_DEST (set1))))
-         && !modified_between_p (SET_DEST (set1), i2, i3)
+         && !reg_used_between_p (SET_DEST (set1), i2, i3)
          /* If I3 is a jump, ensure that set0 is a jump so that
             we do not create invalid RTL.  */
          && (!JUMP_P (i3) || SET_DEST (set0) == pc_rtx)
@@ -4038,7 +4038,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2,
               && !(GET_CODE (SET_DEST (set0)) == SUBREG
                    && find_reg_note (i2, REG_DEAD,
                                      SUBREG_REG (SET_DEST (set0))))
-              && !modified_between_p (SET_DEST (set0), i2, i3)
+              && !reg_used_between_p (SET_DEST (set0), i2, i3)
               /* If I3 is a jump, ensure that set1 is a jump so that
                  we do not create invalid RTL.  */
               && (!JUMP_P (i3) || SET_DEST (set1) == pc_rtx)
--- gcc/testsuite/gcc.c-torture/execute/pr119291.c.jj   2025-03-27 
09:48:01.917407084 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr119291.c      2025-03-27 
09:47:48.020598094 +0100
@@ -0,0 +1,33 @@
+/* PR rtl-optimization/119291 */
+
+int a;
+long c;
+
+__attribute__((noipa)) void
+foo (int x)
+{
+  if (x != 0)
+    __builtin_abort ();
+  a = 42;
+}
+
+int
+main ()
+{
+  int e = 1;
+lab:
+  if (a < 2)
+    {
+      int b = e;
+      _Bool d = a != 0;
+      _Bool f = b != 0;
+      unsigned long g = -(d & f);
+      unsigned long h = c & g;
+      unsigned long i = ~c;
+      e = -(i & h);
+      c = e != 0;
+      a = ~e + b;
+      foo (e);
+      goto lab;
+    }
+}

        Jakub

Reply via email to