Re: [PATCH 1/8]middle-end: Recognize scalar reductions from bitfields and array_refs

Jeff Law via Gcc-patches Tue, 22 Nov 2022 06:33:25 -0800


On 11/22/22 04:08, Richard Biener via Gcc-patches wrote:

On Tue, 22 Nov 2022, Richard Sandiford wrote:

Tamar Christina <tamar.christ...@arm.com> writes:

-----Original Message-----
From: Richard Biener <rguent...@suse.de>
Sent: Tuesday, November 22, 2022 10:59 AM
To: Richard Sandiford <richard.sandif...@arm.com>
Cc: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>; Tamar
Christina <tamar.christ...@arm.com>; Richard Biener
<richard.guent...@gmail.com>; nd <n...@arm.com>
Subject: Re: [PATCH 1/8]middle-end: Recognize scalar reductions from
bitfields and array_refs

On Tue, 22 Nov 2022, Richard Sandiford wrote:

Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

So it's not easily possible within the current infrastructure.  But
it does look like ARM might eventually benefit from something like STV
on x86?

I'm not sure.  The problem with trying to do this in RTL is that
you'd have to be able to decide from two pseudos whether they come
from extracts that are sequential.  When coming in from a hard
register that's easy, yes.  When coming in from a load, or any other
operation that produces pseudos, that becomes harder.

Yeah.

Just in case anyone reading the above is tempted to implement STV for
AArch64: I think it would set a bad precedent if we had a
paste-&-adjust version of the x86 pass.  AFAIK, the target
capabilities and constraints are mostly modelled correctly using
existing mechanisms, so I don't think there's anything particularly
target-specific about the process of forcing things to be on the general or
SIMD/FP side.

So if we did have an STV-ish thing for AArch64, I think it should be a
target-independent pass that uses hooks and recog, even if the pass is
initially enabled for AArch64 only.

Agreed - maybe some of the x86 code can be leveraged, but of course the
cost modeling is the most difficult to get right - IIRC the x86 backend resorts
to backend specific tuning flags rather than trying to get rtx_cost or insn_cost
"correct" here.

(FWIW, on the patch itself, I tend to agree that this is really an SLP
optimisation.  If the vectoriser fails to see the benefit, or if it
fails to handle more complex cases, then it would be good to try to
fix that.)

Also agreed - but costing is hard ;)

I guess, I still disagree here but I've clearly been out-Richard.  The problem
is still that this is just basic codegen.  I still don't think it requires -O2
to be usable.

So I guess the only correct implementation is to use an STV-like patch.  But
given that this is already the second attempt (the first, RTL one was rejected
by Richard, the second, GIMPLE one was rejected by Richi), I'd like to get an
agreement on this STV thing before I waste months more.

I don't think this in itself is a good motivation for STV.  My comment
above was more about the idea of STV for AArch64 in general (since it
had been raised).

Personally I still think the reduction should be generated in gimple.

I agree, and the proper place to generate the reduction is in SLP.

Sorry to have sent things astray with my earlier ACK.  It looked reasonable to me.


jeff
