Re: [1/3 PATCH]AArch64: add support for partial modes to last extractions [PR118464]

Richard Sandiford Wed, 05 Mar 2025 03:29:25 -0800

Tamar Christina <tamar.christ...@arm.com> writes:
>> > diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
>> > index e975286a01904bec0b283b7ba4afde6f0fd60bf1..6c0be3c1a51449274720175b5e6e7d7535928de6 100644
>> > --- a/gcc/config/aarch64/aarch64-sve.md
>> > +++ b/gcc/config/aarch64/aarch64-sve.md
>> > @@ -3107,7 +3107,7 @@ (define_insn "@extract_<last_op>_<mode>"
>> >    [(set (match_operand:<VEL> 0 "register_operand")
>> >    (unspec:<VEL>
>> >      [(match_operand:<VPRED> 1 "register_operand")
>> > -     (match_operand:SVE_FULL 2 "register_operand")]
>> > +     (match_operand:SVE_ALL 2 "register_operand")]
>> >      LAST))]
>> >    "TARGET_SVE"
>> >    {@ [ cons: =0 , 1   , 2  ]
>> 
>> It looks like this will use (say):
>> 
>>   lasta b<n>, pg, z<m>.b
>> 
>> for VNx4QI, is that right?  I don't think that's safe, since the .b form
>> treats all bits of the pg input as significant, whereas only one in every
>> four bits of pg is defined for VNx4BI (the predicate associated with VNx4QI).
>> 
>> I think converting these patterns to partial vectors means operating
>> on containers rather than elements.  E.g. the VNx4QI case should use
>> .s rather than .b.  That should just be a case of changing vwcore to
>> vccore and Vetype to Vctype, but I haven't looked too deeply.
>
> Ah I see, so for partial types, the values are not expected to be packed in
> the lower part of the vector, but instead are "padded"?

Right.
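
To make the padded layout concrete, here is a minimal Python sketch (a toy
model, not GCC code).  It assumes a 128-bit vector and that each element of
a partial mode sits in the low byte of its container, with the remaining
container bytes undefined.  The helper name padded_layout is made up for
illustration.

```python
VL_BYTES = 16  # assumption: minimum SVE vector length of 128 bits

def padded_layout(elems, container_bytes, pad=None):
    """Place each byte element in the low byte of its own container."""
    out = []
    for e in elems:
        out.append(e)                              # element in the low byte
        out.extend([pad] * (container_bytes - 1))  # undefined padding bytes
    return out

# VNx4QI at 128 bits: four byte elements, one per 32-bit container.
vnx4qi = padded_layout([1, 2, 3, 4], container_bytes=4)
assert len(vnx4qi) == VL_BYTES
# Element i lives at byte 4*i, not at byte i as it would if packed.
assert vnx4qi[0::4] == [1, 2, 3, 4]
```

In this model an instruction that steps through the register at .b
granularity mostly sees padding, while one that steps at .s granularity
moves from container to container.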

> That explains some of the other patterns
> I was confused about.
>
> Any ideas how to test these? It's hard to control what modes the
> vectorizer picks.

Yeah, agreed.  I think it'd be difficult to trigger it reliably from the
vectoriser given its current limited use of the ifns.

A gimple frontend test might work though, with a predicate/mask
generated from (say) 16-bit elements, then bitcast to a predicate/mask
for 32-bit elements and used as an input to an explicit ifn on 32-bit
elements.  If the 16-bit predicate contains a (0, 1) pair at some
even-aligned position after the last even-aligned (1, 0) pair, then the
code would likely pick the wrong element.
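
The failure mode such a test would look for can be modelled in a few lines
of Python.  This is only a sketch under stated assumptions: the predicate is
represented as one flag per 16-bit halfword, step=1 stands for an
instruction reading it at .h granularity and step=2 for one reading it at
.s granularity, and a VNx4HI-style layout (16-bit elements in 32-bit
containers) is used as the example.  The function name lastb_model and the
PAD value are hypothetical; the "no active element" fallback follows the
architectural LASTB behaviour of returning the final element.

```python
PAD = 0xDEAD  # stand-in for the undefined upper half of each container

def lastb_model(flags, halfwords, step):
    """Return the last active lane, reading one predicate flag per `step` halfwords."""
    lanes = range(0, len(halfwords), step)
    active = [i for i in lanes if flags[i]]
    idx = active[-1] if active else lanes[-1]  # no active lane: final element
    return halfwords[idx]

# Predicate built for 16-bit elements, as aligned pairs:
# (1, 0), (1, 0), (0, 1), (0, 0).  The (0, 1) pair follows the last (1, 0).
flags = [1, 0, 1, 0, 0, 1, 0, 0]

# Four 16-bit elements, each in the low half of a 32-bit container.
data = [10, PAD, 20, PAD, 30, PAD, 40, PAD]

# Container view (.s): only the even flags count, so container 1 is last.
assert lastb_model(flags, data, step=2) == 20

# Element view (.h): the stray flag at index 5 wins and selects padding.
assert lastb_model(flags, data, step=1) == PAD
```

The two calls differ only in how many predicate flags they treat as
significant, which is the distinction between operating on containers and
operating on elements.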

Thanks,
Richard

>
> Thanks,
> Tamar
>
>> 
>> Thanks,
>> Richard
>> 
>> > @@ -8899,7 +8899,7 @@ (define_insn "@fold_extract_<last_op>_<mode>"
>> >    (unspec:<VEL>
>> >      [(match_operand:<VEL> 1 "register_operand")
>> >       (match_operand:<VPRED> 2 "register_operand")
>> > -     (match_operand:SVE_FULL 3 "register_operand")]
>> > +     (match_operand:SVE_ALL 3 "register_operand")]
>> >      CLAST))]
>> >    "TARGET_SVE"
>> >    {@ [ cons: =0 , 1 , 2   , 3  ]
>> > @@ -8909,11 +8909,11 @@ (define_insn "@fold_extract_<last_op>_<mode>"
>> >  )
>> >
>> >  (define_insn "@aarch64_fold_extract_vector_<last_op>_<mode>"
>> > -  [(set (match_operand:SVE_FULL 0 "register_operand")
>> > -  (unspec:SVE_FULL
>> > -    [(match_operand:SVE_FULL 1 "register_operand")
>> > +  [(set (match_operand:SVE_ALL 0 "register_operand")
>> > +  (unspec:SVE_ALL
>> > +    [(match_operand:SVE_ALL 1 "register_operand")
>> >       (match_operand:<VPRED> 2 "register_operand")
>> > -     (match_operand:SVE_FULL 3 "register_operand")]
>> > +     (match_operand:SVE_ALL 3 "register_operand")]
>> >      CLAST))]
>> >    "TARGET_SVE"
>> >    {@ [ cons: =0 , 1 , 2   , 3  ]
