TL;DR

In GCC 9.3, I believe modified_between_p should return 1 if the memory
reference is volatile. My layman's understanding of volatile memory is that
it could change at any time, and thus could be modified between any two
instructions.

Possible patch:
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 01af063a222..a395df0627b 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1308,6 +1308,8 @@ modified_between_p (const_rtx x, const rtx_insn *start, const rtx_insn *end)
       return 1;

     case MEM:
+      if (MEM_VOLATILE_P (x))
+        return 1;
       if (modified_between_p (XEXP (x, 0), start, end))
 	return 1;
       if (MEM_READONLY_P (x))

Details
I'm writing a splitter to optimize bit-field conditionals (e.g. testing a
bit of memory-mapped I/O). The non-optimized sequence is a SET, AND,
LSHIFTRT + CMP, JMP pattern, which I know can be simplified to AND + JMP,
as the AND operation sets the necessary CC bits. To achieve this
optimization in the combine pass, a custom general_operand predicate has
been defined which allows volatile memory.
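For reference, such a predicate could take roughly this shape in the
machine description (names and structure here are illustrative, not the
actual backend's; it accepts whatever general_operand does, plus volatile
MEMs with a valid address, which general_operand rejects unless
volatile_ok is set):

```
;; Illustrative sketch only.
(define_predicate "volatile_ok_general_operand"
  (ior (match_operand 0 "general_operand")
       (and (match_code "mem")
            (match_test "MEM_VOLATILE_P (op)")
            (match_test "memory_address_p (mode, XEXP (op, 0))"))))
```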

However, in certain situations I've found the combine pass generates
invalid patterns when it merges SET and IF_THEN_ELSE (CMP + JMP)
patterns. One such situation is when GCC decides to reuse the result of the
comparison for the return value.

This issue seems intertwined with the combine + volatile_ok issue that has
been discussed a number of times in the past. The backend in question is a
deeply embedded target, and significant performance gains are made by
enabling volatile references during combine.

Here is the optimized tree of the test program. In this program the
bit-field value is saved in _1 to be reused as the return value; why GCC
makes that choice is beyond me.
  <unnamed-unsigned:1> _1;
  _Bool _3;

  <bb 3> [local count: 1044213930]:
  _1 ={v} MEM[(volatile struct register_t *)5B].done;
  if (_1 != 0)
    goto <bb 4>; [11.00%]
  else
    goto <bb 3>; [89.00%]

  <bb 4> [local count: 1073741824]:
  // Additional code trimmed for brevity...

  <bb 6> [local count: 114863532]:
  # _3 = PHI <0(4), _1(5)>
  return _3;

This produces the following pre-combine RTL.
(insn 9 6 10 3 (parallel [
            (set (reg:QI 17 [ <retval> ])
                (lshiftrt:QI (mem/v:QI (const_int 5 [0x5]) [1 MEM[(volatile struct register_t *)5B]+0 S1 A16])
                    (const_int 3 [0x3])))
            (clobber (reg:CC 7 FLAG))
        ]) "fail.c":22:26 19 {*lshrqi3_im}
     (expr_list:REG_DEAD (reg/f:QI 18)
        (expr_list:REG_UNUSED (reg:CC 7 FLAG)
            (nil))))
(insn 10 9 12 3 (parallel [
            (set (reg:QI 17 [ <retval> ])
                (and:QI (reg:QI 17 [ <retval> ])
                    (const_int 1 [0x1])))
            (clobber (reg:CC 7 FLAG))
        ]) "fail.c":22:26 10 {andqi3}
     (expr_list:REG_UNUSED (reg:CC 7 FLAG)
        (nil)))
(jump_insn 12 10 20 3 (parallel [
            (set (pc)
                (if_then_else (eq (reg:QI 17 [ <retval> ])
                        (const_int 0 [0]))
                    (label_ref:QI 11)
                    (pc)))
            (clobber (scratch:QI))
        ]) "fail.c":22:9 41 {*cbranchqi4_im}
     (int_list:REG_BR_PROB 955630228 (nil))
 -> 11)

When combined, these generate patterns that will produce invalid code: GCC
has assumed the memory value does not change between insns 10 and 12.
(insn 10 9 12 3 (parallel [
            (set (reg:QI 17 [ <retval> ])
                (and:QI (lshiftrt:QI (mem/v:QI (const_int 5 [0x5]) [1 MEM[(volatile struct register_t *)5B]+0 S1 A16])
                        (const_int 3 [0x3]))
                    (const_int 1 [0x1])))
            (clobber (reg:CC 7 FLAG))
        ]) "fail.c":22:18 10 {andqi3}
     (expr_list:REG_UNUSED (reg:CC 7 FLAG)
        (nil)))
(jump_insn 12 10 20 3 (parallel [
            (set (pc)
                (if_then_else (eq (zero_extract:QI (mem/v:QI (const_int 5 [0x5]) [1 MEM[(volatile struct register_t *)5B]+0 S1 A16])
                            (const_int 1 [0x1])
                            (const_int 3 [0x3]))
                        (const_int 0 [0]))
                    (label_ref:QI 11)
                    (pc)))
            (clobber (scratch:QI))
        ]) "fail.c":22:9 43 {*cbranchqi4_zero_extract}
     (int_list:REG_BR_PROB 955630228 (nil))
 -> 11)

With the above patch, the combination of insns 10 and 12 is rejected and
the if_then_else switches on retval instead.
(insn 10 9 12 3 (parallel [
            (set (reg:QI 17 [ <retval> ])
                (and:QI (lshiftrt:QI (mem/v:QI (const_int 5 [0x5]) [1 MEM[(volatile struct register_t *)5B]+0 S1 A16])
                        (const_int 3 [0x3]))
                    (const_int 1 [0x1])))
            (clobber (reg:CC 7 FLAG))
        ]) "fail.c":22:18 10 {andqi3}
     (expr_list:REG_UNUSED (reg:CC 7 FLAG)
        (nil)))
(jump_insn 12 10 20 3 (parallel [
            (set (pc)
                (if_then_else (eq (reg:QI 17 [ <retval> ])
                        (const_int 0 [0]))
                    (label_ref:QI 11)
                    (pc)))
            (clobber (scratch:QI))
        ]) "fail.c":22:9 41 {*cbranchqi4_im}
     (int_list:REG_BR_PROB 955630228 (nil))
 -> 11)

Issues like this are probably one reason volatile memory is disallowed
during combine.

I am open to suggestions on how else to resolve the issue.

Thanks,
Tucker
