On 11/27/23 13:52, Jakub Jelinek wrote:
On Mon, Oct 16, 2023 at 01:11:01PM -0600, Jeff Law wrote:
gcc/ChangeLog:

        * Makefile.in: Add fold-mem-offsets.o.
        * passes.def: Schedule a new pass.
        * tree-pass.h (make_pass_fold_mem_offsets): Declare.
        * common.opt: New options.
        * doc/invoke.texi: Document new option.
        * fold-mem-offsets.cc: New file.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/fold-mem-offsets-1.c: New test.
        * gcc.target/riscv/fold-mem-offsets-2.c: New test.
        * gcc.target/riscv/fold-mem-offsets-3.c: New test.
Thanks, I've pushed this to the trunk.

This breaks profiledbootstrap on powerpc64le-linux.
 From what I can see, the pass works one basic block at a time and
will punt on any non-DEBUG_INSN uses outside of the current block
Correct. If a non-debug use is seen outside the block, the entire chain should become unoptimizable as we don't know how to fixup uses outside the block.

(I believe because of the
           /* This use affects instructions outside of CAN_FOLD_INSNS.  */
           if (!bitmap_bit_p (&can_fold_insns, INSN_UID (use)))
             return 0;
test and can_fold_insns only set in do_analysis (when processing insns in
current bb, cleared at the end) or results of get_single_def_in_bb
(which are checked to be in the same bb).
But, while get_single_def_in_bb checks for
   if (DF_INSN_LUID (def) > DF_INSN_LUID (insn))
     return NULL;
(OT, why not DF_INSN_INFO_LUID (DF_REF_INSN_INFO (ref_chain->ref))
instead of DF_INSN_LUID (def), then it doesn't need to look up
DF_INSN_INFO_GET (def)?), nothing when walking all uses of def does such
luid check.
On the OT question, I don't think there's any reason why we couldn't use the other form. We're just looking at the LUIDs to validate order within a block.


The basic block in the PR in question has:
...
(insn 212 210 215 25 (set (mem/f:DI (reg/v/f:DI 10 10 [orig:152 last_viable ] 
[152]) [2 *last_viable_336+0 S8 A64])
         (reg/f:DI 9 9 [orig:155 _342 ] [155])) "pr111601.ii":50:17 683 
{*movdi_internal64}
      (expr_list:REG_DEAD (reg/v/f:DI 10 10 [orig:152 last_viable ] [152])
         (nil)))
(insn 215 212 484 25 (set (reg:DI 5 5 [226])
         (const_int 0 [0])) "pr111601.ii":52:12 683 {*movdi_internal64}
      (expr_list:REG_EQUIV (const_int 0 [0])
         (nil)))
(insn 484 215 218 25 (set (reg/v/f:DI 10 10 [orig:152 last_viable ] [152])
         (reg/f:DI 9 9 [orig:155 _342 ] [155])) "pr111601.ii":52:12 683 
{*movdi_internal64}
      (nil))
...
(insn 564 214 216 25 (set (reg/v/f:DI 10 10 [orig:152 last_viable ] [152])
         (plus:DI (reg/v/f:DI 10 10 [orig:152 last_viable ] [152])
             (const_int 96 [0x60]))) "pr111601.ii":52:12 66 {*adddi3}
      (nil))
(insn 216 564 219 25 (set (mem/f:DI (reg/v/f:DI 10 10 [orig:152 last_viable ] 
[152]) [2 _343->next+0 S8 A64])
         (reg:DI 5 5 [226])) "pr111601.ii":52:12 683 {*movdi_internal64}
      (expr_list:REG_DEAD (reg:DI 5 5 [226])
         (nil)))
...
and when asking for all uses of %r10 from def 564, it will see uses
in 216 and 212; the former is after the += 96 addition and gets changed
to load from %r10+96 with the addition being dropped, but there is
the other store which is a use across the backedge and when reached
from other edges certainly doesn't have the + 96 addition anywhere,
so the pass doesn't actually change that location.


Haven't bootstrapped/regtested this yet, will start momentarily,
posting here just in case I'm missing something important.

2023-11-27  Jakub Jelinek  <ja...@redhat.com>

        PR bootstrap/111601
        * fold-mem-offsets.cc (fold_offsets): Punt if use appears before
        def in the basic block.

        * g++.dg/opt/pr111601.C: New test.
It looks sensible to me. When the use is before the set we're either dealing with undefined scenarios or a backedge. Either way I think punting is going to be appropriate.

As you noted downthread that didn't actually work due to the artificial uses problem. But conceptually I think you're on the right track and we just need to sort out the implementation details.

jeff

Reply via email to