On Thu, Mar 12, 2020 at 2:38 AM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Mar 12, 2020 at 4:06 AM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Wed, 2020-03-11 at 13:04 +0000, Nidal Faour via Gcc-patches wrote:
> > > This patch is a code density oriented and attempt to remove redundant
> > > sign/zero extension from assignment statement.
> > > The approach taken is to use VRP data while expanding the assignment to
> > > RTL to determine whether a sign/zero extension is necessary.
> > > Thought the motivation of the patch is code density but it also good for
> > > speed.
> > >
> > > for example:
> > > extern unsigned int func ();
> > >
> > > unsigned char
> > > foo (unsigned int arg)
> > > {
> > >   if (arg == 2)
> > >     return 0;
> > >
> > >   return (func() == arg || arg == 7);
> > > }
> > >

When we reach combine, we have

  if func result == arg
    r73 = 1
  else
    r73 = arg-7 == 0
  r75 = zero_extend subreg:QI r73

On this platform a scc operation always produces 0 or 1, so the
zero_extend is redundant.  We would need something pushing the zero
extend back into the if_then_else and then noticing that it is
redundant on both branches, or something propagating the value ranges
forward.

We almost have the latter bit in combine.  Right before we start
combining instructions, when we set nonzero_sign_valid to 1, we have

(gdb) print reg_stat[73]
$1 = (reg_stat_type &) @0x28ad908: {last_death = 0x7ffff5de0500,
  last_set = 0x7ffff5de02c0, last_set_value = 0x7ffff5ddc490,
  last_set_table_tick = 6, last_set_label = 5, last_set_nonzero_bits = 1,
  last_set_sign_bit_copies = 31 '\037', last_set_mode = E_SImode,
  last_set_invalid = 0 '\000', sign_bit_copies = 31 '\037',
  nonzero_bits = 1, truncation_label = 0,
  truncated_to_mode = E_VOIDmode}
(gdb)

so we know that r73 has 31 valid sign bits and only one bit is
nonzero.  This info is still valid when we reach the zero extend insn.
Unfortunately, we don't use this info to do anything useful.  The
zero_extend is the only insn in its basic block (not counting the
unconditional branch that ends the block), so it doesn't get combined
with anything, and hence doesn't get simplified.  If it did get
combined with something, then expand_compound_operation would
eliminate the zero_extend and subreg.

That makes me wonder if there is any value in trying to handle
single-instruction combinations, just to get the simplifications we
can get from the nonzero_bits info, but it isn't obvious how we would
detect when a single insn combine is better than the original insn.
Maybe rtx cost info can be used for that.

I looked at combine because I'm familiar with that pass, but the ree
pass might be the right place to handle this.  I don't know if it has
any support for handling if statements.  If not, maybe it could be
extended to handle cases like this.

Jim
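
As a rough illustration of the nonzero_bits reasoning above, here is a
standalone C sketch.  It is not GCC code: zext_qi, zext_qi_is_redundant
and check_zext_examples are names made up for this example, and the
mask argument only models what combine records in nonzero_bits.  The
idea is simply that if no bit above bit 7 can be set in a 32-bit value,
then truncating it to QImode and zero-extending it back cannot change it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of r75 = zero_extend (subreg:QI r73): keep the low
   8 bits of X and zero-extend them back to 32 bits.  */
static uint32_t
zext_qi (uint32_t x)
{
  return (uint32_t) (uint8_t) x;
}

/* Hypothetical helper, not a GCC function.  NONZERO_BITS is a mask of
   the bits that may be nonzero in some 32-bit value.  Return 1 if the
   truncate-then-zero-extend above is a no-op for every such value.  */
static int
zext_qi_is_redundant (uint32_t nonzero_bits)
{
  /* Only bits outside the low byte can be cleared by the extension.  */
  return (nonzero_bits & ~(uint32_t) 0xff) == 0;
}

/* Small self-check using the values from the reg_stat[73] dump.  */
static void
check_zext_examples (void)
{
  /* nonzero_bits = 1: the register holds 0 or 1, so the extend is
     redundant and both possible values pass through unchanged.  */
  assert (zext_qi_is_redundant (1));
  assert (zext_qi (0) == 0);
  assert (zext_qi (1) == 1);

  /* A value that may have bit 8 set is changed by the extension.  */
  assert (!zext_qi_is_redundant (0x100));
  assert (zext_qi (0x100) != 0x100);
}
```

Calling check_zext_examples () from a test program exercises both
cases.  Whether a single-insn simplification based on this check would
be profitable is a separate question, as discussed above.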