Re: [SVE] PR96463 - Optimise svld1rq from vectors

Prathamesh Kulkarni via Gcc-patches Tue, 14 Dec 2021 00:35:07 -0800

On Tue, 7 Dec 2021 at 19:08, Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> > On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> > <richard.sandif...@arm.com> wrote:
> >>
> >> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> >> > Hi Richard,
> >> > I have attached a WIP untested patch for PR96463.
> >> > IIUC, the PR suggests transforming
> >> > lhs = svld1rq ({-1, -1, ...}, &v[0])
> >> > into:
> >> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
> >> > if v is a vector of 4 elements, and each element is 32 bits, on a
> >> > little-endian target?
> >> >
> >> > I am sorry if this sounds like a silly question, but I am not sure how
> >> > to convert a vector of type int32x4_t into svint32_t. In the patch, I
> >> > simply used NOP_EXPR (which I expected to fail), and it gave a type
> >> > error during gimple verification:
> >>
> >> It should be possible in principle to have a VEC_PERM_EXPR in which
> >> the operands are Advanced SIMD vectors and the result is an SVE vector.
> >>
> >> E.g., the dup in the PR would be something like this:
> >>
> >> foo (int32x4_t x)
> >> {
> >>   svint32_t _2;
> >>
> >>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
> >>   return _2;
> >> }
> >>
> >> where the final operand can be built using:
> >>
> >>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
> >>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
> >>   for (int i = 0; i < source_nelts; ++i)
> >>     sel.quick_push (i);
> >>
> >> I'm not sure how well-tested that combination is though.  It might need
> >> changes to target-independent code.
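
To make the intended semantics of that selector concrete, here is a minimal
stand-alone sketch. It is not GCC code: expand_selector, vec_perm and
dup_quad are hypothetical helpers, and std::vector<int> stands in for both
the fixed-width input and the variable-length result. It assumes the
encoding described above (source_nelts patterns, one element per pattern),
so each encoded index simply repeats for the whole length of the result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a selector encoded as `encoded.size()` interleaved
// patterns with one element per pattern: element i of the full selector is
// the encoded element of pattern (i % npatterns).
std::vector<int> expand_selector(const std::vector<int> &encoded,
                                 std::size_t full_nelts)
{
  assert(!encoded.empty());
  std::vector<int> sel;
  sel.reserve(full_nelts);
  for (std::size_t i = 0; i < full_nelts; ++i)
    sel.push_back(encoded[i % encoded.size()]);
  return sel;
}

// Two-input permute: indices [0, n) select from `a` and [n, 2n) from `b`,
// where n is the element count of each input. The result takes its length
// from the selector, not from the inputs.
std::vector<int> vec_perm(const std::vector<int> &a,
                          const std::vector<int> &b,
                          const std::vector<int> &sel)
{
  assert(!a.empty() && a.size() == b.size());
  const std::size_t n = a.size();
  std::vector<int> out;
  out.reserve(sel.size());
  for (int idx : sel) {
    assert(idx >= 0);
    const std::size_t i = static_cast<std::size_t>(idx) % (2 * n);
    out.push_back(i < n ? a[i] : b[i - n]);
  }
  return out;
}

// Replicate a short input across a result of `full_nelts` elements, using
// the selector { 0, 1, ..., n-1, 0, 1, ..., n-1, ... }.
std::vector<int> dup_quad(const std::vector<int> &x, std::size_t full_nelts)
{
  std::vector<int> encoded;
  for (std::size_t i = 0; i < x.size(); ++i)
    encoded.push_back(static_cast<int>(i));
  return vec_perm(x, x, expand_selector(encoded, full_nelts));
}
```

With a 4-element input and an 8-element result, dup_quad returns the input
twice in a row, which is the replication svld1rq performs in the PR. The
length of the result comes only from the selector, which is the
mixed-length property the VEC_PERM_EXPR above depends on.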
> > Hi Richard,
> > Thanks for the suggestions.
> > I tried the above approach in attached patch, but it still results in
> > ICE due to type mismatch:
> >
> > pr96463.c: In function ‘foo’:
> > pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
> >     8 | }
> >       | ^
> > svint32_t
> > int32x4_t
> > int32x4_t
> > svint32_t
> > _3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
> > during GIMPLE pass: ccp
> > dump file: pr96463.c.032t.ccp1
> > pr96463.c:8:1: internal compiler error: verify_gimple failed
> >
> > Should we perhaps add another tree code that "extends" a fixed-width
> > vector into its VLA equivalent?
>
> No, I think this is just an extreme example of the combination not being
> well-tested. :-)  Obviously it's worse than I thought.
>
> I think accepting this kind of VEC_PERM_EXPR is still the way to go.
> Richi, WDYT?
Hi Richi, ping ?


Thanks,
Prathamesh
>
> Thanks,
> Richard
