Hi Richard, Thanks for getting back to me and your insight. I've implemented a TARGET_CANNOT_COPY_INSN_P that rejects volatile memory and it appears to be doing the trick!
It will take some time to determine if there are any other side effects but I can move forward with this. Much appreciated! -Tucker On Tue, Oct 13, 2020 at 3:00 AM Richard Sandiford <richard.sandif...@arm.com> wrote: > Tucker Kern via Gcc <gcc@gcc.gnu.org> writes: > > TL;DR > > > > In GCC 9.3, I believe modified_between_p should return 1 if the memory > > reference is volatile. My layman's understanding of volatile memory is > that > > it could change at any time, and thus could be modified between any two > > instructions. > > That's true, but in RTL, volatile accesses provide no timing guarantee, > even in an abstract sense. E.g. suppose we traced the execution of a > function call based on the instructions that exist immediately after > the expand pass. If a volatile access occurs in the Nth instruction > of this trace, there's no guarantee that the access will still occur > in the Nth instruction of traces for later passes. Unrelated > instructions can be added and removed at any time. > > For example, a volatile access should not stop us from swapping: > > (set (reg:SI R0) (const_int 0)) > > and: > > (set (mem/v:SI (reg:SI R1)) (const_int 0)) > > So… > > > Possible patch: > > diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c > > index 01af063a222..a395df0627b 100644 > > --- a/gcc/rtlanal.c > > +++ b/gcc/rtlanal.c > > @@ -1308,6 +1308,8 @@ modified_between_p (const_rtx x, const rtx_insn > > *start, const rtx_insn *end) > > return 1; > > > > case MEM: > > + if (MEM_VOLATILE_P(x)) > > + return 1; > > if (modified_between_p (XEXP (x, 0), start, end)) > > return 1; > > if (MEM_READONLY_P (x)) > > …I think the current code handles volatile correctly. The issue > is whether any instruction in (START, END) might modify the same > memory as X. Most of the real work is done by the alias machinery, > which understands (or is supposed to understand) the special rules > around volatile memory and how it interacts with other memory accesses: > > for (insn = NEXT_INSN (start); insn != end; insn = NEXT_INSN (insn)) > if (memory_modified_in_insn_p (x, insn)) > return 1; > > > Details > > I'm writing a splitter to optimize bit field conditionals (e.g. testing a > > bit of memory mapped I/O). The non-optimized pattern consists of a SET, > > AND, LSHIFTRT + CMP, JMP pattern which I know can be simplified to an > > AND + JMP as the AND operation will set the necessary CC bits. To achieve > > this optimization in the combine pass a custom general_operand has been > > defined which allows volatile memory. > > > > However in certain situations I've found the combine pass is generating > > invalid patterns when it merges a SET and IF_THEN_ELSE (CMP + JMP) > > patterns. One such situation is when GCC decides to reuse the result of > the > > comparison for the return value. > > > > This issue seems intertwined with the combine + volatile_ok issue that > has > > been discussed a number of times in the past. The backend is question is > > highly embedded and significant performance gains are made by enabling > > volatile references during combine. > > > > Optimized tree of the test program. In this program the bit field value > is > > saved in _1 to be reused in the return value. Why it makes this choice is > > beyond me. > > <unnamed-unsigned:1> _1; > > _Bool _3; > > > > <bb 3> [local count: 1044213930]: > > _1 ={v} MEM[(volatile struct register_t *)5B].done; > > if (_1 != 0) > > goto <bb 4>; [11.00%] > > else > > goto <bb 3>; [89.00%] > > > > <bb 4> [local count: 1073741824]: > > // Additional code trimmed for brevity... > > > > <bb 6> [local count: 114863532]: > > # _3 = PHI <0(4), _1(5)> > > return _3; > > > > This produces the following pre-combine RTL. > > (insn 9 6 10 3 (parallel [ > > (set (reg:QI 17 [ <retval> ]) > > (lshiftrt:QI (mem/v:QI (const_int 5 [0x5]) [1 > MEM[(volatile > > struct register_t *)5B]+0 S1 A16]) > > (const_int 3 [0x3]))) > > (clobber (reg:CC 7 FLAG)) > > ]) "fail.c":22:26 19 {*lshrqi3_im} > > (expr_list:REG_DEAD (reg/f:QI 18) > > (expr_list:REG_UNUSED (reg:CC 7 FLAG) > > (nil)))) > > (insn 10 9 12 3 (parallel [ > > (set (reg:QI 17 [ <retval> ]) > > (and:QI (reg:QI 17 [ <retval> ]) > > (const_int 1 [0x1]))) > > (clobber (reg:CC 7 FLAG)) > > ]) "fail.c":22:26 10 {andqi3} > > (expr_list:REG_UNUSED (reg:CC 7 FLAG) > > (nil))) > > (jump_insn 12 10 20 3 (parallel [ > > (set (pc) > > (if_then_else (eq (reg:QI 17 [ <retval> ]) > > (const_int 0 [0])) > > (label_ref:QI 11) > > (pc))) > > (clobber (scratch:QI)) > > ]) "fail.c":22:9 41 {*cbranchqi4_im} > > (int_list:REG_BR_PROB 955630228 (nil)) > > -> 11) > > > > Which when combined generates patterns that will produce invalid code. > GCC > > has assumed the memory value will not change between insn 10 & 12. > > (insn 10 9 12 3 (parallel [ > > (set (reg:QI 17 [ <retval> ]) > > (and:QI (lshiftrt:QI (mem/v:QI (const_int 5 [0x5]) [1 > > MEM[(volatile struct register_t *)5B]+0 S1 A16]) > > (const_int 3 [0x3])) > > (const_int 1 [0x1]))) > > (clobber (reg:CC 7 FLAG)) > > ]) "fail.c":22:18 10 {andqi3} > > (expr_list:REG_UNUSED (reg:CC 7 FLAG) > > (nil))) > > (jump_insn 12 10 20 3 (parallel [ > > (set (pc) > > (if_then_else (eq (zero_extract:QI (mem/v:QI (const_int 5 > > [0x5]) [1 MEM[(volatile struct register_t *)5B]+0 S1 A16]) > > (const_int 1 [0x1]) > > (const_int 3 [0x3])) > > (const_int 0 [0])) > > (label_ref:QI 11) > > (pc))) > > (clobber (scratch:QI)) > > ]) "fail.c":22:9 43 {*cbranchqi4_zero_extract} > > (int_list:REG_BR_PROB 955630228 (nil)) > > -> 11) > > > > With the above patch, the combination of insn 10 & 12 is rejected and the > > if_then_else switches on retval. > > (insn 10 9 12 3 (parallel [ > > (set (reg:QI 17 [ <retval> ]) > > (and:QI (lshiftrt:QI (mem/v:QI (const_int 5 [0x5]) [1 > > MEM[(volatile struct register_t *)5B]+0 S1 A16]) > > (const_int 3 [0x3])) > > (const_int 1 [0x1]))) > > (clobber (reg:CC 7 FLAG)) > > ]) "fail.c":22:18 10 {andqi3} > > (expr_list:REG_UNUSED (reg:CC 7 FLAG) > > (nil))) > > (jump_insn 12 10 20 3 (parallel [ > > (set (pc) > > (if_then_else (eq (reg:QI 17 [ <retval> ]) > > (const_int 0 [0])) > > (label_ref:QI 11) > > (pc))) > > (clobber (scratch:QI)) > > ]) "fail.c":22:9 41 {*cbranchqi4_im} > > (int_list:REG_BR_PROB 955630228 (nil)) > > -> 11) > > > > Issues like this are probably a reason volatile memory is disallowed > during > > combine. > > Yeah. > > I think the issue in the example above is duplication rather than > modified_between_p. There is: > > /* We are about to copy insns for the case where they need to be kept > around. Check that they can be copied in the merged instruction. */ > > if (targetm.cannot_copy_insn_p > && ((added_sets_2 && targetm.cannot_copy_insn_p (i2)) > || (i1 && added_sets_1 && targetm.cannot_copy_insn_p (i1)) > || (i0 && added_sets_0 && targetm.cannot_copy_insn_p (i0)))) > { > undo_all (); > return 0; > } > > to stop us copying things that shouldn't be copied. Perhaps volatile > accesses should be handled the same way. > > But like you imply, I'm far from confident that this is the only > problem with ignoring volatile_ok. > > Thanks, > Richard >