Re: [SVE] PR96463 - Optimise svld1rq from vectors

Richard Sandiford via Gcc-patches Tue, 07 Dec 2021 05:39:18 -0800

Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
>> > Hi Richard,
>> > I have attached a WIP untested patch for PR96463.
>> > IIUC, the PR suggests transforming
>> > lhs = svld1rq ({-1, -1, ...}, &v[0])
>> > into:
>> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
>> > if v is a vector of 4 elements, and each element is 32 bits, on a
>> > little-endian target?
>> >
>> > I am sorry if this sounds like a silly question, but I am not sure how
>> > to convert a vector of type int32x4_t into svint32_t. In the patch, I
>> > simply used NOP_EXPR (which I expected to fail), and it gave a type
>> > error during gimple verification:
>>
>> It should be possible in principle to have a VEC_PERM_EXPR in which
>> the operands are Advanced SIMD vectors and the result is an SVE vector.
>>
>> E.g., the dup in the PR would be something like this:
>>
>> foo (int32x4_t x)
>> {
>>   svint32_t _2;
>>
>>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
>>   return _2;
>> }
>>
>> where the final operand can be built using:
>>
>>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
>>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
>>   for (int i = 0; i < source_nelts; ++i)
>>     sel.quick_push (i);
>>
>> I'm not sure how well-tested that combination is though.  It might need
>> changes to target-independent code.
> Hi Richard,
> Thanks for the suggestions.
> I tried the above approach in the attached patch, but it still results in
> an ICE due to a type mismatch:
>
> pr96463.c: In function ‘foo’:
> pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
>     8 | }
>       | ^
> svint32_t
> int32x4_t
> int32x4_t
> svint32_t
> _3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
> during GIMPLE pass: ccp
> dump file: pr96463.c.032t.ccp1
> pr96463.c:8:1: internal compiler error: verify_gimple failed
>
> Should we perhaps add another tree code that "extends" a fixed-width
> vector into its VLA equivalent?


No, I think this is just an extreme example of the combination not being
well-tested. :-)  Obviously it's worse than I thought.

I think accepting this kind of VEC_PERM_EXPR is still the way to go.
Richi, WDYT?

Thanks,
Richard
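
For readers following the selector discussion quoted above, here is a
minimal standalone sketch of what a selector built from source_nelts
patterns with one element per pattern means. It is only a model, not GCC
code: RepeatingSelector, make_dup_selector and apply_permute are
hypothetical names, and the variable-length result is stood in for by an
explicit result_nelts argument. The assumption illustrated is that, with
one element per pattern, element i of the full selector is encoded
element i % npatterns, which gives { 0, 1, 2, 3, 0, 1, 2, 3, ... } for a
4-element source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a selector encoded as interleaved patterns
// with one element per pattern.  This is NOT GCC's vec_perm_builder.
struct RepeatingSelector {
  std::vector<int> encoded;  // one entry per pattern

  // With one element per pattern, each pattern repeats its single value,
  // so index i of the full selector is encoded[i % npatterns].
  int operator[](std::size_t i) const {
    assert(!encoded.empty());
    return encoded[i % encoded.size()];
  }
};

// Build the selector { 0, 1, ..., source_nelts - 1, 0, 1, ... }.
RepeatingSelector make_dup_selector(int source_nelts) {
  RepeatingSelector sel;
  for (int i = 0; i < source_nelts; ++i)
    sel.encoded.push_back(i);
  return sel;
}

// Model a permute whose two inputs are the same vector:
// result[i] = input[sel[i]].  result_nelts stands in for the runtime
// length of the variable-length result.
std::vector<int> apply_permute(const std::vector<int>& input,
                               const RepeatingSelector& sel,
                               std::size_t result_nelts) {
  std::vector<int> result;
  result.reserve(result_nelts);
  for (std::size_t i = 0; i < result_nelts; ++i)
    result.push_back(input[sel[i]]);
  return result;
}
```

Calling apply_permute on a 4-element input with make_dup_selector (4)
and a result_nelts of 8 repeats the input twice, which is the
replicate-the-quadword behaviour the PR wants from svld1rq.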
