On Thu, Mar 12, 2020 at 2:38 AM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Mar 12, 2020 at 4:06 AM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Wed, 2020-03-11 at 13:04 +0000, Nidal Faour via Gcc-patches wrote:
> > > This patch is code-density oriented and attempts to remove redundant
> > > sign/zero extensions from assignment statements.
> > > The approach taken is to use VRP data while expanding the assignment to
> > > RTL to determine whether a sign/zero extension is necessary.
> > > Though the motivation of the patch is code density, it is also good for
> > > speed.
> > >
> > > for example:
> > > extern unsigned int func ();
> > >
> > >   unsigned char
> > >   foo (unsigned int arg)
> > >   {
> > >     if (arg == 2)
> > >       return 0;
> > >
> > >     return (func() == arg || arg == 7);
> > >   }
> > >

When we reach combine, we have
  if func result == arg
     r73 = 1
  else
     r73 = arg-7 == 0
  r75 = zero_extend subreg:QI r73

On this platform a scc operation always produces 0 or 1, so the
zero_extend is redundant.  We would need something pushing the zero
extend back into the if_then_else and then noticing that it is
redundant on both branches, or something propagating the value ranges
forward.
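
To make the first option concrete, here is a rough C model of the
pseudo-RTL above.  It is only a sketch for illustration, not GCC code:
the function names (r75_original, r75_pushed, r75_simplified,
check_arms_agree) are made up, and it assumes a 32-bit SImode and an
8-bit QImode.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the insns as combine sees them: r73 is set on both arms of
   the if, and the zero_extend of its low byte happens afterwards.  */
static uint32_t
r75_original (uint32_t func_result, uint32_t arg)
{
  uint32_t r73;
  if (func_result == arg)
    r73 = 1;
  else
    r73 = (arg - 7 == 0);
  return (uint32_t) (uint8_t) r73;  /* zero_extend (subreg:QI r73) */
}

/* The same computation with the zero_extend pushed into each arm.  */
static uint32_t
r75_pushed (uint32_t func_result, uint32_t arg)
{
  if (func_result == arg)
    return (uint32_t) (uint8_t) 1;
  else
    return (uint32_t) (uint8_t) (arg - 7 == 0);
}

/* Each arm already produces 0 or 1, so the extension can be dropped.  */
static uint32_t
r75_simplified (uint32_t func_result, uint32_t arg)
{
  if (func_result == arg)
    return 1;
  else
    return arg - 7 == 0;
}

/* Hypothetical self-check: all three forms must agree.  */
static void
check_arms_agree (uint32_t func_result, uint32_t arg)
{
  uint32_t expected = r75_original (func_result, arg);
  assert (r75_pushed (func_result, arg) == expected);
  assert (r75_simplified (func_result, arg) == expected);
}
```

Calling check_arms_agree with a few (func_result, arg) pairs, such as
(7, 7), (0, 7) and (3, 5), exercises both arms and both scc results.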

We almost have the latter bit in combine.  Right before we start
combining instructions, when we set nonzero_sign_valid to 1, we have

(gdb) print reg_stat[73]
$1 = (reg_stat_type &) @0x28ad908: {last_death = 0x7ffff5de0500,
  last_set = 0x7ffff5de02c0, last_set_value = 0x7ffff5ddc490,
  last_set_table_tick = 6, last_set_label = 5, last_set_nonzero_bits = 1,
  last_set_sign_bit_copies = 31 '\037', last_set_mode = E_SImode,
  last_set_invalid = 0 '\000', sign_bit_copies = 31 '\037',
  nonzero_bits = 1, truncation_label = 0, truncated_to_mode = E_VOIDmode}
(gdb)

so we know that r73 has 31 valid sign bits and only one bit is
nonzero.  This info is still valid when we reach the zero extend insn.
Unfortunately, we don't use this info to do anything useful.  The
zero_extend is the only insn in its basic block (not counting the
unconditional branch that ends the block), so it doesn't get combined
with anything, and hence doesn't get simplified.  If it did get
combined with something, then expand_compound_operation would
eliminate the zero_extend and subreg.  That makes me wonder if there
is any value in trying to handle single-instruction combinations, just
to get the simplifications we can get from the nonzero_bits info, but
it isn't obvious how we would detect when a single insn combine is
better than the original insn.  Maybe rtx cost info can be used for
that.
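
The reasoning from the nonzero_bits value can be sketched in a few
lines of C.  This is not how combine or expand_compound_operation is
written; it is a simplified model with made-up helper names
(zero_extend_is_redundant, zero_extend_low_byte), assuming a 32-bit
register and a mask in which a set bit means "this bit may be nonzero".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC function.  Returns true when
   zero-extending the low NARROW_BITS bits of a 32-bit value cannot
   change it, given the mask of bits that may be nonzero.  */
static bool
zero_extend_is_redundant (uint32_t nonzero_bits, unsigned narrow_bits)
{
  assert (narrow_bits > 0 && narrow_bits < 32);
  uint32_t narrow_mask = (UINT32_C (1) << narrow_bits) - 1;
  /* Redundant when no possibly-nonzero bit lies above the narrow mode.  */
  return (nonzero_bits & ~narrow_mask) == 0;
}

/* Model of (zero_extend:SI (subreg:QI x)) on a 32-bit value.  */
static uint32_t
zero_extend_low_byte (uint32_t x)
{
  return (uint32_t) (uint8_t) x;
}
```

With nonzero_bits = 1 as in the reg_stat dump for r73, the check for
an 8-bit narrow mode succeeds, which matches the observation that the
zero_extend leaves the value unchanged.  A mask such as 0x100 would
fail the check, because bit 8 would be cleared by the extension.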

I looked at combine because I'm familiar with that pass, but the ree
pass might be the right place to handle this.  I don't know if it has
any support for handling if statements.  If not, maybe it could be
extended to handle cases like this.

Jim
