On 8/23/24 6:16 AM, Georg-Johann Lay wrote:
This patch overhauls the avr-ifelse mini-pass that optimizes
two cbranch insns to one comparison and two branches.

More optimization opportunities are realized, and the code
has been refactored.

No new regressions.  Ok for trunk?

There is currently no avr maintainer, so some global reviewer
might please have a look at this.

And one question I have:  avr_optimize_2ifelse() is rewiring
basic blocks using redirect_edge_and_branch().  Does this
require extra pass flags or actions?  Currently the RTL_PASS
data reads:

static const pass_data avr_pass_data_ifelse =
{
   RTL_PASS,      // type
   "",            // name (will be patched)
   OPTGROUP_NONE, // optinfo_flags
   TV_DF_SCAN,    // tv_id
   0,             // properties_required
   0,             // properties_provided
   0,             // properties_destroyed
   0,             // todo_flags_start
   TODO_df_finish | TODO_df_verify // todo_flags_finish
};


Johann

p.s. The additional notes on compare-elim / PR115830 can be found
here (pending review):

https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660743.html

--

AVR: Overhaul the avr-ifelse RTL optimization pass.

Mini-pass avr-ifelse realizes optimizations that replace two cbranch
insns with one comparison and two branches.  This patch adds the
following improvements:

- The right operand of the comparisons may also be REGs.
   Formerly only CONST_INT was handled.

- The code of the first comparison in no more restricted
   to (effectively) EQ.

- When the second cbranch is located in the fallthrough path
   of the first cbranch, then difficult (expensive) comparisons
   can always be avoided.  This may require to swap the branch
   targets.  (When the second cbranch if located after the target
   label of the first one, then getting rid of difficult branches
   would require to reorder blocks.)

- The code has been cleaned up:  avr_rest_of_handle_ifelse() now
   just scans the insn stream for optimization candidates.  The code
   that actually performs the transformation has been outsourced to
   the new function avr_optimize_2ifelse().

- The code to find a better representation for reg-const_int comparisons
   has been split into two parts:  First try to find codes such that the
   right-hand sides of the comparisons are the same (avr_2comparisons_rhs).
   When this succeeds then one comparison can serve two branches, and
   avr_redundant_compare() tries to get rid of difficult branches that
   may have been introduced by avr_2comparisons_rhs().  This is always
   possible when the second cbranch is located in the fallthrough path
   of the first one, or when the first code is EQ.

Some final notes on why we don't use compare-elim:  1) The two cbranch
insns may come with different scratch operands depending on the chosen
constraint alternatives.  There are cases where the outgoing comparison
requires a scratch but only one incoming cbranch has one.  2) Avoiding
difficult branches can be achieved by rewiring basic blocks.
compare-elim doesn't do that; it doesn't even know the costs of the
branch codes.  3)  avr_2comparisons_rhs() may de-canonicalize a
comparison to achieve its goal.  compare-elim doesn't know how to do
that.  4) There are more reasons, see for example the commit
message and discussion for PR115830.

gcc/
     * config/avr/avr.cc (cfganal.h): Include it.
     (avr_2comparisons_rhs, avr_redundant_compare_regs)
     (avr_strict_signed_p, avr_strict_unsigned_p): New static functions.
     (avr_redundant_compare): Overhaul: Allow more cases.
     (avr_optimize_2ifelse): New static function, outsourced from...
     (avr_rest_of_handle_ifelse): ...this method.
gcc/testsuite/
     * gcc.target/avr/torture/ifelse-c.h: New file.
     * gcc.target/avr/torture/ifelse-d.h: New file.
     * gcc.target/avr/torture/ifelse-q.h: New file.
     * gcc.target/avr/torture/ifelse-r.h: New file.
     * gcc.target/avr/torture/ifelse-c-i8.c: New test.
     * gcc.target/avr/torture/ifelse-d-i8.c: New test.
     * gcc.target/avr/torture/ifelse-q-i8.c: New test.
     * gcc.target/avr/torture/ifelse-r-i8.c: New test.
     * gcc.target/avr/torture/ifelse-c-i16.c: New test.
     * gcc.target/avr/torture/ifelse-d-i16.c: New test.
     * gcc.target/avr/torture/ifelse-q-i16.c: New test.
     * gcc.target/avr/torture/ifelse-r-i16.c: New test.
     * gcc.target/avr/torture/ifelse-c-u16.c: New test.
     * gcc.target/avr/torture/ifelse-d-u16.c: New test.
     * gcc.target/avr/torture/ifelse-q-u16.c: New test.
     * gcc.target/avr/torture/ifelse-r-u16.c: New test.

ifelse-tweak.diff

diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index c520b98a178..90606b73114 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc


+
+static rtx
+avr_2comparisons_rhs (rtx_code &cond1, rtx xval1,
+                     rtx_code &cond2, rtx xval2, machine_mode mode)
+{
+  HOST_WIDE_INT val1 = INTVAL (xval1);
+  HOST_WIDE_INT val2 = INTVAL (xval2);
+
+  if (val1 == val2)
+    return xval1;
+
+  if (! IN_RANGE (val1 - val2, -2, 2))
+    return NULL_RTX;
+
+  // First, ten exceptional cases that occur near the unsigned boundaries.
+  // All outgoing codes will have at least one EQ or NE.
+  // Similar cases will occur near the signed boundaries, but they are
+  // less common (and even more tedious).
+
+  if (cond1 == EQ && cond2 == EQ)
+    {
+      if (val1 == 1 && val2 == 0)
+       {
+         cond2 = LTU;
+         return xval1;
+       }
+      else if (val1 == 0 && val2 == 1)
+       {
+         cond1 = LTU;
+         return xval2;
+       }
+      else if (val1 == -2 && val2 == -1)
+       {
+         cond2 = GEU;
+         return xval1;
+       }
+      else if (val1 == -1 && val2 == -2)
+       {
+         cond1 = GTU;
+         return xval2;
+       }
+    } // EQ, EQ
[ ... ]
If I'm understanding this code correctly it looks like you're combining those cases into a single branch. None of the code really looks AVR specific, so is there a good reason why we're not doing this in one of the target independent passes?


In fact, I'm a little surprised we're not already handling this stuff.

So while I'm not going to object to the AVR code, we may ultimately be better off taking those testscases and seeing where we should be improving the target independent optimizers.


Jeff





Reply via email to