On Sun, Apr 23, 2023 at 10:14 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch fixes PR rtl-optimization/109476, which is a code quality
> regression affecting AVR.  The cause is that the lower-subreg pass is
> sometimes overly aggressive, lowering the LSHIFTRT below:
>
> (insn 7 4 8 2 (set (reg:HI 51)
>         (lshiftrt:HI (reg/v:HI 49 [ b ])
>             (const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
>      (nil))
>
> into a pair of QImode SUBREG assignments:
>
> (insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
>         (reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
>      (nil))
> (insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
>         (const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
>      (nil))
>
> but this idiom, SETs of SUBREGs, interferes with combine's ability
> to associate/fuse instructions.  The solution, on targets that
> have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
> wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
> is to split/lower LSHIFTRT to a ZERO_EXTEND.
>
> To answer Richard's question in comment #10 of the bugzilla PR,
> the function resolve_shift_zext is called with one of four RTX
> codes, ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT, but only with
> LSHIFTRT can the setting of low_part and high_part SUBREGs be
> replaced by a ZERO_EXTEND.  For ASHIFTRT, we require a sign
> extension, so don't set the high_part to zero; if we're splitting
> a ZERO_EXTEND then it doesn't make sense to replace it with a
> ZERO_EXTEND, and for ASHIFT we've played games to swap the
> high_part and low_part SUBREGs, so that we assign the low_part
> to zero (for double-word shifts by at least the word size in
> bits, such as the HImode shift by 8 above).
>
> This patch has been tested on x86_64-pc-linux-gnu with a make
> bootstrap and make -k check, both 64-bit and 32-bit, with no
> new regressions.  Many thanks to Jeff Law for testing this patch
> on his build farm, which spotted an issue on xstormy16, which
> should now be fixed by (either of) my recent xstormy16 patches.
> Ok for mainline?

OK.

Thanks,
Richard.

>
> 2023-04-23  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR rtl-optimization/109476
>         * lower-subreg.cc: Include explow.h for force_reg.
>         (find_decomposable_shift_zext): Pass an additional SPEED_P argument.
>         If decomposing a suitable LSHIFTRT and we're not splitting
>         ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
>         instead of setting a high part SUBREG to zero, which helps combine.
>         (decompose_multiword_subregs): Update call to resolve_shift_zext.
>
> gcc/testsuite/ChangeLog
>         PR rtl-optimization/109476
>         * gcc.target/avr/mmcu/pr109476.c: New test case.
>
>
> Thanks in advance,
> Roger
> --
>
