On 26.02.19 19:36, Richard Henderson wrote:
> On 2/26/19 3:38 AM, David Hildenbrand wrote:
>> We sometimes want to work on a temporary vector register instead of the
>> actual destination, because source and destination might overlap. An
>> alternative would be loading the vector into two i64 variables, but then
>> separate handling for accessing the vector elements would be needed.
>> This is easier. Add one for now as that seems to be enough.
>
> Hmm, I'll reserve judgment until I see how this is used.
>
> For ARM SVE, I would allocate this temporary on the stack within the helper,
> and move one of the operands out of the way. E.g.

Yes, I do the same for helpers. This, however, is for TCG translated code :)

E.g. see

[PATCH v1 08/33] s390x/tcg: Implement VECTOR LOAD
[PATCH v1 19/33] s390x/tcg: Implement VECTOR MERGE (HIGH|LOW)
[PATCH v1 33/33] s390x/tcg: Implement VECTOR UNPACK *

>
> void helper(foo)(void *vd, void *vx, void *vy)
> {
>     VectorReg tmp;
>     TYPE *d, *x, *y;
>
>     if (vx == vd || vy == vd) {
>         tmp = *(VectorReg *)vd;
>         if (vx == vd) {
>             vx = &tmp;
>         }
>         if (vy == vd) {
>             vy = &tmp;
>         }
>     }
>     d = vd;
>     x = vx;
>     y = vy;
>
>     /* process d, x, y as normal. */
> }
>
> This minimized the amount of code inline. However, SVE vectors are quite a bit
> larger, at 256 bytes, so the copy itself was out of line most of the time
> anyway.
>
> Provisionally,
> Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
>
>
> r~
>

--

Thanks,

David / dhildenb
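
For illustration only, here is a minimal standalone sketch of the helper-side
pattern quoted above: copy the destination to a stack temporary when it
aliases a source, then redirect the aliased source pointer(s). The VectorReg
layout (four 32-bit elements) and the merge_high32() operation are
hypothetical stand-ins chosen to make the overlap hazard visible; they are
not taken from the QEMU sources or from this patch series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit vector register viewed as four 32-bit elements. */
typedef struct {
    uint32_t e[4];
} VectorReg;

/*
 * Hypothetical "merge high" style operation:
 *   d = { x[0], y[0], x[1], y[1] }
 * Without the temporary, writing d[1] would clobber x[1] when vx == vd,
 * and writing d[0] would clobber y[0] when vy == vd.
 */
void merge_high32(void *vd, const void *vx, const void *vy)
{
    VectorReg tmp;
    VectorReg *d;
    const VectorReg *x, *y;

    assert(vd && vx && vy);

    if (vx == vd || vy == vd) {
        /* Snapshot the destination before any element is overwritten. */
        tmp = *(const VectorReg *)vd;
        if (vx == vd) {
            vx = &tmp;
        }
        if (vy == vd) {
            vy = &tmp;
        }
    }

    /* Derive the typed pointers only after the sources were redirected. */
    d = vd;
    x = vx;
    y = vy;

    d->e[0] = x->e[0];
    d->e[1] = y->e[0];
    d->e[2] = x->e[1];
    d->e[3] = y->e[1];
}
```

The typed pointers are assigned after the overlap check on purpose; taking
them beforehand would leave x or y pointing at the destination that is about
to be overwritten.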