On 4/22/20 9:55 AM, Stephen Long wrote:
> +    intptr_t opr_sz = simd_oprsz(desc) / (sizeof(TYPE) >> 2); \
> + \
> +    for (s = 0; s < opr_sz; ++s) { \
> +        TYPE *n = vn + s * (sizeof(TYPE) >> 2); \
> +        TYPE *m = vm + s * (sizeof(TYPE) >> 2); \
> +        TYPE *a = va + s * (sizeof(TYPE) >> 2); \
> +        TYPE *d = vd + s * (sizeof(TYPE) >> 2); \

Shifting the wrong way. Need to multiply by 4 not divide.

I've fixed this up, and also expanded the macro to two functions; I think it's
clearer that way in this case.

Applied to my SVE2 branch.

Thanks,

r~
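
To illustrate the arithmetic only: a minimal sketch, assuming each segment spans four elements of TYPE (one reading of "multiply by 4"). The helper names below are hypothetical and appear neither in the quoted patch nor in the applied fix. With the right shift, the per-segment byte count is 0 for 1- and 2-byte types, so the division in the quoted opr_sz expression would divide by zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes per segment as written in the quoted
 * patch, i.e. sizeof(TYPE) >> 2 (a division by 4). */
static size_t seg_bytes_quoted(size_t elt_size)
{
    return elt_size >> 2;
}

/* Hypothetical helper: bytes per segment when the element size is
 * multiplied by 4 (assumption: four elements per segment). */
static size_t seg_bytes_times4(size_t elt_size)
{
    return elt_size * 4;
}

/* Hypothetical helper: number of segments in an operand of oprsz
 * bytes. Returns 0 where the division would be by zero. */
static size_t seg_count(size_t oprsz, size_t seg_bytes)
{
    return seg_bytes ? oprsz / seg_bytes : 0;
}
```

For an assumed 16-byte operand and a 2-byte TYPE, seg_bytes_quoted() gives 0 (no valid segment count), while seg_bytes_times4() gives 8 bytes per segment and therefore 2 segments.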