On Tue, Jul 25, 2023 at 1:31 PM Roger Sayle <ro...@nextmovesoftware.com> wrote: > > > This patch is the third in series of fixes for PR rtl-optimization/110587, > a compile-time regression with -O0, that attempts to address the underlying > cause. As noted previously, the pathological test case pr28071.c contains > a large number of useless register-to-register moves that can produce > quadratic behaviour (in LRA). These move are generated during RTL > expansion in emit_group_load_1, where the middle-end attempts to simplify > the source before calling extract_bit_field. This is reasonable if the > source is a complex expression (from before the tree-ssa optimizers), or > a SUBREG, or a hard register, but it's not particularly useful to copy > a pseudo register into a new pseudo register. This patch eliminates that > redundancy. > > The -fdump-tree-expand for pr28071.c compiled with -O0 currently contains > 777K lines, with this patch it contains 717K lines, i.e. saving about 60K > lines (admittedly of debugging text output, but it makes the point). > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32} > with no new failures. Ok for mainline? > > As always, I'm happy to revert this change quickly if there's a problem, > and investigate why this additional copy might (still) be needed on other > non-x86 targets.
@@ -2622,6 +2622,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type, be loaded directly into the destination. */ src = orig_src; if (!MEM_P (orig_src) + && (!REG_P (orig_src) || HARD_REGISTER_P (orig_src)) && (!CONSTANT_P (orig_src) || (GET_MODE (orig_src) != mode && GET_MODE (orig_src) != VOIDmode))) so that means the code guarded by the conditional could instead be transformed to src = force_reg (mode, orig_src); ? Btw, the || (GET_MODE (orig_src) != mode && GET_MODE (orig_src) != VOIDmode) case looks odd as in that case we'd use GET_MODE (orig_src) for the move ... that might also mean we have to use force_reg (GET_MODE (orig_src) == VOIDmode ? mode : GET_MODE (orig_src), orig_src)) Otherwise I think this is OK, as said, using force_reg somehow would improve readability here I think. I also wonder how the else if (GET_CODE (src) == CONCAT) case will ever trigger with the current code. Richard. > > 2023-07-25 Roger Sayle <ro...@nextmovesoftware.com> > > gcc/ChangeLog > PR middle-end/28071 > PR rtl-optimization/110587 > * expr.cc (emit_group_load_1): Avoid copying a pseudo register into > a new pseudo register, i.e. only copy hard regs into a new pseudo. > > > Thanks in advance, > Roger > -- >