Re: [PATCH 1/2] Implementation of new RISCV optimizations pass: fold-mem-offsets.

Manolis Tsamis Thu, 15 Jun 2023 10:35:37 -0700

The new target-independant implementation of fold-mem-offsets pass can
be found in the list (link is
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621920.html)


Aside from now being target independent, I have fixed a number of new
bugs that emerged when running this on other targets and a minor
memory leak.
I have also improved the propagation logic in fold_offsets to work
with more patterns found in other targets (e.g. LEA instructions in
x86).
Finally I improved the naming of things (e.g. replaced uses of
'delete'/'remove' with 'fold', made bitmap names more meaningful) and
reduced unnecessary verbosity in some comments.

Thanks,
Manolis

On Tue, Jun 13, 2023 at 12:58 AM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 6/12/23 01:32, Manolis Tsamis wrote:
>
> >>
> >>> +  for (ref_link = ref_chain; ref_link; ref_link = ref_link->next)
> >>> +    {
> >>> +      /* Problem getting some definition for this instruction.  */
> >>> +      if (ref_link->ref == NULL)
> >>> +     return NULL;
> >>> +      if (DF_REF_INSN_INFO (ref_link->ref) == NULL)
> >>> +     return NULL;
> >>> +      if (global_regs[REGNO (reg)]
> >>> +       && !set_of (reg, DF_REF_INSN (ref_link->ref)))
> >>> +     return NULL;
> >>> +    }
> >> That last condition feels a bit odd.  It would seem that you wanted an
> >> OR boolean rather than AND.
> >>
> >
> > Most of this function I didn't write by myself, I used existing code
> > to get definitions taken from REE's get_defs.
> > In the original there's a comment about this line this comment that 
> > explains it:
> >
> >    As global regs are assumed to be defined at each function call
> >    dataflow can report a call_insn as being a definition of REG.
> >    But we can't do anything with that in this pass so proceed only
> >    if the instruction really sets REG in a way that can be deduced
> >    from the RTL structure.
> >
> > This function is the only one I copied without changing much (because
> > I didn't quite understand it), so I don't know if that condition is
> > any useful for f-m-o.
> > Also the code duplication here is a bit unfortunate, maybe it would be
> > preferred to create a generic version that can be used in both?
> Ah.  So I think the code is meant to filter out things that DF will say
> are set vs those which are actually exposed explicitly in the RTL (and
> which REE might be able to modify).  So we're probably good.
>
> Those routines are are pretty close to each other in implementation.  I
> bet we could take everything up to the loop over the ref links and
> factor that into a common function.   Both your function and get_defs
> would be able to use that and then do bit of processing afterwards.
>
>
> >
> >>
> >>> +
> >>> +  unsigned int dest_regno = REGNO (dest);
> >>> +
> >>> +  /* We don't want to fold offsets from instructions that change some
> >>> +     particular registers with potentially global side effects.  */
> >>> +  if (!GP_REG_P (dest_regno)
> >>> +      || dest_regno == STACK_POINTER_REGNUM
> >>> +      || (frame_pointer_needed && dest_regno == 
> >>> HARD_FRAME_POINTER_REGNUM)
> >>> +      || dest_regno == GP_REGNUM
> >>> +      || dest_regno == THREAD_POINTER_REGNUM
> >>> +      || dest_regno == RETURN_ADDR_REGNUM)
> >>> +    return 0;
> >> I'd think most of this would be captured by testing fixed_registers
> >> rather than trying to list each register individually.  In fact, if we
> >> need to generalize this to work on other targets we almost certainly
> >> want a more general test.
> >>
> >
> > Thanks, I knew there would be some proper way to test this but wasn't
> > aware which is the correct one.
> > Should this look like below? Or is the GP_REG_P redundant and just
> > fixed_regs will do?
> If you want to verify it's a general register, then you have to ask if
> the regno is in the GENERAL_REGS class.  Something like:
>
> TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], dest_regno)
>
> GP_REG_P is a risc-v specific macro, so we can't use it here.
>
> So something like
>    if (fixed_regs[dest_regno]
>        || !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], dest_regno))
>
>
>
> >> The debugging message is a bit misleading.  Yea, we always delete
> >> something here, but in one case we end up emitting a copy.
> >>
> >
> > Indeed. Maybe "Instruction reduced to move: ..."?
> Works for me.
>
> >
> >>
> >>
> >>> +
> >>> +       /* Temporarily change the offset in MEM to test whether
> >>> +          it results in a valid instruction.  */
> >>> +       machine_mode mode = GET_MODE (mem_addr);
> >>> +       XEXP (mem, 0) = gen_rtx_PLUS (mode, reg, GEN_INT (offset));
> >>> +
> >>> +       bool valid_change = recog (PATTERN (insn), insn, 0) >= 0;
> >>> +
> >>> +       /* Restore the instruction.  */
> >>> +       XEXP (mem, 0) = mem_addr;
> >> You need to reset the INSN_CODE after restoring the instruction.  That's
> >> generally a bad thing to do, but I've seen it done enough (and been
> >> guilty myself in the past) that we should just assume some ports are
> >> broken in this regard.
> >>
> >
> > Ok thanks, I didn't knew that one.
> It's pretty obscure and I probably would have missed it if I hadn't
> debugged a problem related to this just a few months back.  It shouldn't
> be necessary due to rules about how the movXX patterns are supposed to
> work.  But I've seen it mucked up enough that it's the right thing to do.
>
> Essentially when you call into recog, if the pattern is recognized, then
> a cookie is cached so that we know what pattern was recognized within
> the backend.
>
> As far as generalizing to a target independent pass.  You'll need to
> declare the new pass in tree-pass.h, add it to passes.def, wire up the
> option in common.opt, document it in invoke.texi and turn it on for -O2
> and above.  WRT the interaction with shorten-memrefs, I think that can
> be handled in override-options.  I think we want to have shorten-memrefs
> override.  So if shorten-memrefs is on, then we turn off f-o-m in the RV
> backend.  This should probably be documented in invoke.texi as well.
>
> It sounds like a lot, but I think each of those is a relatively simple
> change.  It'd be best if you could tackle those changes.  I think with
> that done and bootstrap/regression test round we'd be ready to integrate
> your work.
>
> Thanks!
>
> jeff

Re: [PATCH 1/2] Implementation of new RISCV optimizations pass: fold-mem-offsets.

Reply via email to