RE: [PATCH] aarch64: Don't generate invalid zero/sign-extend syntax

Alex Coplan Wed, 19 Aug 2020 01:13:00 -0700

Hi Richard,

> -----Original Message-----
> From: Richard Sandiford <richard.sandif...@arm.com>
> Sent: 18 August 2020 09:35
> To: Alex Coplan <alex.cop...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw <richard.earns...@arm.com>;
> Marcus Shawcroft <marcus.shawcr...@arm.com>; Kyrylo Tkachov
> <kyrylo.tkac...@arm.com>
> Subject: Re: [PATCH] aarch64: Don't generate invalid zero/sign-extend
> syntax
> 
> Alex Coplan <alex.cop...@arm.com> writes:
> > Note that an obvious omission here is that this patch does not touch
> the
> > mult patterns such as *add_<optab><ALLX:mode>_mult_<GPI:mode>. I found
> > that I couldn't hit these patterns with C code since multiplications by
> > powers of two always get turned into shifts by earlier RTL passes. If
> > there's a way to reliably hit these patterns, then perhaps these should
> > be updated as well.
> 
> Hmm.  Feels like we should either update them or delete them.  E.g.:
> 
>   *adds_<optab><mode>_multp2
>   *subs_<optab><mode>_multp2
> 
> were added alongside the adds3.c and subs3.c tests that you're updating,
> so if the tests don't/no longer need the multp2 patterns to pass,
> there's a good chance that the patterns are redundant.
> 
> For reasons I never understood, the canonical representation is to use
> (mult …) for powers of 2 inside a (mem …) but shifts outside of (mem …)s.
> So perhaps the patterns were originally for address calculations that had
> been moved outside of a (mem …) and not updated to shifts instead of
> mults.


Thanks for the review, and for clarifying this. I tried removing these
together with the *_mul_imm_ patterns (e.g. *adds_mul_imm_<mode>) and
the only failure was gcc/testsuite/gcc.dg/torture/pr34330.c when
compiled with -Os -ftree-vectorize which appears to depend on the
*add_mul_imm_di pattern. Without this pattern, we ICE in LRA on this
input.

In this case, GCC appears to have done exactly what you described: we
have the (plus (mult ...) ...) nodes inside (mem)s prior to register
allocation, and then we end up pulling these out without converting them
to shifts.

Seeing this behaviour (and in particular seeing the ICE) makes me
hesitant to just go ahead and remove the other patterns. That said, I
have a patch to remove the following patterns:

 *adds_<optab><mode>_multp2
 *subs_<optab><mode>_multp2
 *add_<optab><ALLX:mode>_mult_<GPI:mode>
 *add_<optab><SHORT:mode>_mult_si_uxtw
 *add_<optab><mode>_multp2
 *add_<optab>si_multp2_uxtw
 *add_uxt<mode>_multp2
 *add_uxtsi_multp2_uxtw
 *sub_<optab><mode>_multp2
 *sub_<optab>si_multp2_uxtw
 *sub_uxt<mode>_multp2
 *sub_uxtsi_multp2_uxtw
 *sub_uxtsi_multp2_uxtw

(together with the predicate aarch64_pwr_imm3 which is only used in
these patterns) and this bootstraps/regtests just fine.

So, I have a couple of questions:

(1) Should it be considered a bug if we pull (plus (mult (power of 2)
   ...) ...) out of a (mem) RTX without re-writing the (mult) as a
   shift?
(2) If not, can we otherwise justify the removal of the patterns here?

I'm happy to go ahead with this if either (1) or (2) are true.

Thanks,
Alex

RE: [PATCH] aarch64: Don't generate invalid zero/sign-extend syntax

Reply via email to