Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
>> > Hi Richard,
>> > I have attached a WIP untested patch for PR96463.
>> > IIUC, the PR suggests to transform
>> > lhs = svld1rq ({-1, -1, ...}, &v[0])
>> > into:
>> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
>> > if v is vector of 4 elements, and each element is 32 bits on little
>> > endian target ?
>> >
>> > I am sorry if this sounds like a silly question, but I am not sure how
>> > to convert a vector of type int32x4_t into svint32_t ? In the patch, I
>> > simply used NOP_EXPR (which I expected to fail), and gave type error
>> > during gimple verification:
>>
>> It should be possible in principle to have a VEC_PERM_EXPR in which
>> the operands are Advanced SIMD vectors and the result is an SVE vector.
>>
>> E.g., the dup in the PR would be something like this:
>>
>> foo (int32x4_t a)
>> {
>>   svint32_t _2;
>>
>>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
>>   return _2;
>> }
>>
>> where the final operand can be built using:
>>
>>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
>>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
>>   for (int i = 0; i < source_nelts; ++i)
>>     sel.quick_push (i);
>>
>> I'm not sure how well-tested that combination is though.  It might need
>> changes to target-independent code.
> Hi Richard,
> Thanks for the suggestions.
> I tried the above approach in attached patch, but it still results in
> ICE due to type mismatch:
>
> pr96463.c: In function ‘foo’:
> pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
>     8 | }
>       | ^
> svint32_t
> int32x4_t
> int32x4_t
> svint32_t
> _3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
> during GIMPLE pass: ccp
> dump file: pr96463.c.032t.ccp1
> pr96463.c:8:1: internal compiler error: verify_gimple failed
>
> Should we perhaps add another tree code, that "extends" a fixed-width
> vector into it's VLA equivalent ?

No, I think this is just an extreme example of the combination not being
well-tested. :-)  Obviously it's worse than I thought.  I think accepting
this kind of VEC_PERM_EXPR is still the way to go.

Richi, WDYT?

Thanks,
Richard
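
For reference, the selector encoding quoted above can be modelled outside
GCC.  The sketch below is a standalone illustration, not GCC code: the
names expand_selector and vec_perm_model are made up for this example,
and it assumes a fixed element count only so that the result can be
printed or checked.  It shows that an encoding with source_nelts
patterns and one element per pattern repeats { 0, 1, 2, 3 } for as many
elements as the result vector has, and that applying such a selector to
two copies of a 4-element input replicates that input across the result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standalone model (hypothetical helper, not a GCC API): expand an
// encoding that has encoded.size() interleaved patterns and one encoded
// element per pattern.  With a single element per pattern, each pattern
// repeats its value, so element i of the full vector is
// encoded[i % npatterns].
std::vector<int> expand_selector(const std::vector<int> &encoded,
                                 std::size_t full_nelts)
{
  assert(!encoded.empty());
  std::vector<int> full;
  full.reserve(full_nelts);
  for (std::size_t i = 0; i < full_nelts; ++i)
    full.push_back(encoded[i % encoded.size()]);
  return full;
}

// Standalone model (hypothetical helper) of a permute whose inputs are
// shorter than its result: each selector index picks an element from
// the concatenation of a and b, wrapping modulo the combined length.
std::vector<int> vec_perm_model(const std::vector<int> &a,
                                const std::vector<int> &b,
                                const std::vector<int> &sel)
{
  assert(a.size() == b.size() && !a.empty());
  const std::size_t n = a.size();
  std::vector<int> result;
  result.reserve(sel.size());
  for (int index : sel)
    {
      std::size_t k = static_cast<std::size_t>(index) % (2 * n);
      result.push_back(k < n ? a[k] : b[k - n]);
    }
  return result;
}
```

Calling expand_selector ({0, 1, 2, 3}, 8) models a 256-bit result with
32-bit elements; passing the outcome to vec_perm_model together with two
copies of a 4-element input gives that input repeated twice.  In the
real VLA case the element count is not a compile-time constant, which is
why the selector is kept in its encoded form rather than expanded.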