Hi!

As discussed in the two PRs, it is a bad idea to rely on the reload pass
keeping the IL insufficiently optimized (i.e. with noop moves left in);
doing so regresses important benchmarks on PowerPC.
The actual reason we generated better code on the PR90178 testcase is
that during cfg cleanup after the vzeroupper pass we had an almost empty
block (containing just (use (reg rax))) with 3 predecessors and a
fallthru to exit.  The mode switching pass, which the vzeroupper pass
uses under the hood, adds an extra empty basic block on the edge from
that bb to exit; as we consider USE insns useless after reload, both
basic blocks were FORWARDER_BLOCK_P and we succeeded in threading two of
the edges into that almost empty block through to the artificial block
after it (effectively removing the USE on those paths), while on the
third edge the USE was kept, and that precluded thread_jump from merging
the two paths later on.
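
To make the CFG shape concrete, here is a hypothetical C sketch (not
the actual PR90178 testcase; names and shape made up) of the kind of
function where the nearly empty block containing just the return value
USE ends up with multiple predecessors:

  /* Hypothetical sketch only.  Each switch arm computes the return
     value on its own path, and all three paths join in an almost
     empty block that merely uses the return value register before
     falling through to exit.  */
  extern int bar0 (void), bar1 (void), bar2 (void);

  int
  foo (int x)
  {
    int r;
    switch (x)
      {
      case 0: r = bar0 (); break;
      case 1: r = bar1 (); break;
      default: r = bar2 (); break;
      }
    return r;  /* After reload this is roughly (use (reg rax)).  */
  }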

The following patch adds the reversion^N, as has already been applied on
the gcc-9-branch (if somebody is willing to test it, a later possibility
is to replace that by a fast DCE at the end of LRA, as richi has posted
in the PR), and considers USEs of the function return value as
flow_active_insn_p even after reload.  This predicate is rarely used,
mostly for forwarder block testing.

Bootstrapped/regtested on x86_64-linux and i686-linux; additionally
verified that it only affects a very small number of stage3 gcc/*.o
files and either doesn't change the generated .text size at all, or
makes it a few bytes shorter on a single *.o file.

Preapproved by Richard on IRC, committed to trunk.

2019-04-29  Jakub Jelinek  <ja...@redhat.com>

        PR rtl-optimization/90257
        * cfgrtl.c (flow_active_insn_p): Return true for USE of a function
        return value.

        Revert the revert:
        2019-04-21  H.J. Lu  <hongjiu...@intel.com>

        PR target/90178
        Revert:
        2018-11-21  Uros Bizjak  <ubiz...@gmail.com>

        Revert the revert:
        2013-10-26  Vladimir Makarov  <vmaka...@redhat.com>

        Revert:
        2013-10-25  Vladimir Makarov  <vmaka...@redhat.com>

        * lra-spills.c (lra_final_code_change): Remove useless move insns.

--- gcc/lra-spills.c.jj 2019-04-27 23:48:49.973528266 +0200
+++ gcc/lra-spills.c    2019-04-29 16:07:58.095983476 +0200
@@ -740,6 +740,7 @@ lra_final_code_change (void)
   int i, hard_regno;
   basic_block bb;
   rtx_insn *insn, *curr;
+  rtx set;
   int max_regno = max_reg_num ();
 
   for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
@@ -818,5 +819,19 @@ lra_final_code_change (void)
              }
          if (insn_change_p)
            lra_update_operator_dups (id);
+
+         if ((set = single_set (insn)) != NULL
+             && REG_P (SET_SRC (set)) && REG_P (SET_DEST (set))
+             && REGNO (SET_SRC (set)) == REGNO (SET_DEST (set)))
+           {
+             /* Remove a useless move insn.  IRA can generate move
+                insns involving pseudos.  It is better to remove them
+                earlier to speed up the compiler a bit.  It is also
+                better to do it here, as they might not pass the final
+                RTL check in LRA (e.g. an insn moving a control
+                register into itself).  */
+             lra_invalidate_insn_data (insn);
+             delete_insn (insn);
+           }
        }
 }
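
For illustration, the insn the new hunk deletes is a hard register
self-move, i.e. a single_set whose source and destination have the same
REGNO; in an RTL dump it would look roughly like this (insn uids, mode
and location made up for the sketch):

  (insn 42 41 43 2 (set (reg:DI 0 ax) (reg:DI 0 ax)) "test.c":5 -1 (nil))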
--- gcc/cfgrtl.c.jj     2019-04-27 23:48:50.675517128 +0200
+++ gcc/cfgrtl.c        2019-04-29 16:10:34.364519413 +0200
@@ -543,7 +543,7 @@ update_bb_for_insn (basic_block bb)
 }
 
 
-/* Like active_insn_p, except keep the return value clobber around
+/* Like active_insn_p, except keep the return value use or clobber around
    even after reload.  */
 
 static bool
@@ -556,8 +556,12 @@ flow_active_insn_p (const rtx_insn *insn
      programs that fail to return a value.  Its effect is to
      keep the return value from being live across the entire
      function.  If we allow it to be skipped, we introduce the
-     possibility for register lifetime confusion.  */
-  if (GET_CODE (PATTERN (insn)) == CLOBBER
+     possibility for register lifetime confusion.
+     Similarly, keep a USE of the function return value; otherwise
+     the USE is dropped and we could fail to thread a jump if the
+     USE appears on some paths and not on others, see PR90257.  */
+  if ((GET_CODE (PATTERN (insn)) == CLOBBER
+       || GET_CODE (PATTERN (insn)) == USE)
       && REG_P (XEXP (PATTERN (insn), 0))
       && REG_FUNCTION_VALUE_P (XEXP (PATTERN (insn), 0)))
     return true;
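
The USE insn that flow_active_insn_p now keeps considering active even
after reload has roughly this shape in an RTL dump (the /i flag is how
REG_FUNCTION_VALUE_P registers are printed; uids and location made up
for the sketch):

  (insn 23 22 24 2 (use (reg/i:SI 0 ax)) "test.c":7 -1 (nil))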

        Jakub
