On Tue, Jul 25, 2023 at 1:31 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch is the third in series of fixes for PR rtl-optimization/110587,
> a compile-time regression with -O0, that attempts to address the underlying
> cause.  As noted previously, the pathological test case pr28071.c contains
> a large number of useless register-to-register moves that can produce
> quadratic behaviour (in LRA).  These move are generated during RTL
> expansion in emit_group_load_1, where the middle-end attempts to simplify
> the source before calling extract_bit_field.  This is reasonable if the
> source is a complex expression (from before the tree-ssa optimizers), or
> a SUBREG, or a hard register, but it's not particularly useful to copy
> a pseudo register into a new pseudo register.  This patch eliminates that
> redundancy.
>
> The -fdump-tree-expand for pr28071.c compiled with -O0 currently contains
> 777K lines, with this patch it contains 717K lines, i.e. saving about 60K
> lines (admittedly of debugging text output, but it makes the point).
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
> As always, I'm happy to revert this change quickly if there's a problem,
> and investigate why this additional copy might (still) be needed on other
> non-x86 targets.

@@ -2622,6 +2622,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx
orig_src, tree type,
         be loaded directly into the destination.  */
       src = orig_src;
       if (!MEM_P (orig_src)
+         && (!REG_P (orig_src) || HARD_REGISTER_P (orig_src))
          && (!CONSTANT_P (orig_src)
              || (GET_MODE (orig_src) != mode
                  && GET_MODE (orig_src) != VOIDmode)))

so that means the code guarded by the conditional could instead
be transformed to

   src = force_reg (mode, orig_src);

?  Btw, the || (GET_MODE (orig_src) != mode && GET_MODE (orig_src) != VOIDmode)
case looks odd as in that case we'd use GET_MODE (orig_src) for the
move ... that
might also mean we have to use force_reg (GET_MODE (orig_src) ==
VOIDmode ? mode : GET_MODE (orig_src), orig_src))

Otherwise I think this is OK, as said, using
force_reg somehow would improve readability here I think.

I also wonder how the

      else if (GET_CODE (src) == CONCAT)

case will ever trigger with the current code.

Richard.

>
> 2023-07-25  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR middle-end/28071
>         PR rtl-optimization/110587
>         * expr.cc (emit_group_load_1): Avoid copying a pseudo register into
>         a new pseudo register, i.e. only copy hard regs into a new pseudo.
>
>
> Thanks in advance,
> Roger
> --
>

Reply via email to