On Tue, 7 Dec 2021 at 19:08, Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> > On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> > <richard.sandif...@arm.com> wrote:
> >>
> >> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> >> > Hi Richard,
> >> > I have attached a WIP untested patch for PR96463.
> >> > IIUC, the PR suggests to transform
> >> > lhs = svld1rq ({-1, -1, ...}, &v[0])
> >> > into:
> >> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
> >> > if v is vector of 4 elements, and each element is 32 bits on little
> >> > endian target ?
> >> >
> >> > I am sorry if this sounds like a silly question, but I am not sure how
> >> > to convert a vector of type int32x4_t into svint32_t ? In the patch, I
> >> > simply used NOP_EXPR (which I expected to fail), and gave type error
> >> > during gimple verification:
> >>
> >> It should be possible in principle to have a VEC_PERM_EXPR in which
> >> the operands are Advanced SIMD vectors and the result is an SVE vector.
> >>
> >> E.g., the dup in the PR would be something like this:
> >>
> >> foo (int32x4_t a)
> >> {
> >>   svint32_t _2;
> >>
> >>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
> >>   return _2;
> >> }
> >>
> >> where the final operand can be built using:
> >>
> >>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
> >>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts,
> >> 1);
> >>   for (int i = 0; i < source_nelts; ++i)
> >>     sel.quick_push (i);
> >>
> >> I'm not sure how well-tested that combination is though.  It might need
> >> changes to target-independent code.
> > Hi Richard,
> > Thanks for the suggestions.
> > I tried the above approach in attached patch, but it still results in
> > ICE due to type mismatch:
> >
> > pr96463.c: In function ‘foo’:
> > pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
> >     8 | }
> >       | ^
> > svint32_t
> > int32x4_t
> > int32x4_t
> > svint32_t
> > _3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
> > during GIMPLE pass: ccp
> > dump file: pr96463.c.032t.ccp1
> > pr96463.c:8:1: internal compiler error: verify_gimple failed
> >
> > Should we perhaps add another tree code, that "extends" a fixed-width
> > vector into it's VLA equivalent ?
>
> No, I think this is just an extreme example of the combination not being
> well-tested. :-)  Obviously it's worse than I thought.
>
> I think accepting this kind of VEC_PERM_EXPR is still the way to go.
> Richi, WDYT?
Hi Richi, ping ?

Thanks,
Prathamesh
>
> Thanks,
> Richard
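
For reference, a rough standalone model of the selector discussed in the quoted text, i.e. a series { 0, 1, 2, 3, 0, 1, 2, 3, ... } encoded as source_nelts patterns of one element each. This is only a sketch and does not use GCC's vec_perm_builder or any GCC internals. The names build_encoded_selector, expand_selector, permute and replicate_quad are hypothetical, and the runtime SVE element count is assumed to be passed in as a plain integer:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not GCC code): the encoded selector {0, 1, ...,
// source_nelts - 1}, analogous to the quick_push loop in the quoted text.
std::vector<std::size_t> build_encoded_selector(std::size_t source_nelts) {
  std::vector<std::size_t> encoded;
  encoded.reserve(source_nelts);
  for (std::size_t i = 0; i < source_nelts; ++i)
    encoded.push_back(i);
  return encoded;
}

// Hypothetical helper: expand an encoding with encoded.size() interleaved
// patterns of one element each. Every pattern is a repeated constant, so
// element i of the full selector is encoded[i % npatterns].
std::vector<std::size_t> expand_selector(
    const std::vector<std::size_t>& encoded, std::size_t result_nelts) {
  std::vector<std::size_t> full(result_nelts);
  for (std::size_t i = 0; i < result_nelts; ++i)
    full[i] = encoded[i % encoded.size()];
  return full;
}

// Model of a two-input permute: index i selects a[i] when i < a.size(),
// otherwise b[i - a.size()]. The result has sel.size() elements, which is
// allowed to differ from the input length in this model.
std::vector<int> permute(const std::vector<int>& a, const std::vector<int>& b,
                         const std::vector<std::size_t>& sel) {
  std::vector<int> out;
  out.reserve(sel.size());
  for (std::size_t idx : sel)
    out.push_back(idx < a.size() ? a[idx] : b[idx - a.size()]);
  return out;
}

// Replicate a fixed-width vector x across a longer result of result_nelts
// elements, which is the effect the dup in the PR is after.
std::vector<int> replicate_quad(const std::vector<int>& x,
                                std::size_t result_nelts) {
  std::vector<std::size_t> sel =
      expand_selector(build_encoded_selector(x.size()), result_nelts);
  return permute(x, x, sel);
}
```

In this model, calling replicate_quad with a 4-element input and result_nelts = 8 (standing in for a 256-bit SVE vector of 32-bit elements) repeats the four input elements twice. Whether the GIMPLE verifier accepts the corresponding VEC_PERM_EXPR is exactly the open question in this thread.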