On 4/25/22 23:33, Richard Henderson wrote:
> I do not think it worthwhile to unroll these loops by hand.
Totally agree, as it would also remove most of the uses of XMM_ONLY/YMM_ONLY.
I also saw GCC -Warray-bounds complain about

    if (SHIFT >= 1) {
        d->elem[8] = s->elem[8];
    }

though this should probably be treated as a GCC bug.

Paolo
> If we're that keen on it, it should be written
>
>     #pragma GCC unroll 4 << SHIFT
>     for (i = 0; i < 4 << SHIFT; ++i) {
>         something
>     }
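
For illustration only, the loop form suggested above might look like the following self-contained sketch. SHIFT, NELEM, Reg and add_elems are hypothetical stand-ins, not the definitions from the patch under review, and the unroll factor is written as a literal because the sketch does not rely on a macro being expanded inside the pragma.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real definitions. */
#define SHIFT 1                 /* assumed: 1 selects the wider register */
#define NELEM (4 << SHIFT)      /* 8 elements when SHIFT == 1 */

typedef struct {
    uint32_t elem[NELEM];
} Reg;

/* Element-wise add written as a plain loop; unrolling is left to the compiler. */
static void add_elems(Reg *d, const Reg *s)
{
    int i;

    /* GCC-specific hint; the literal 8 matches NELEM when SHIFT == 1. */
#pragma GCC unroll 8
    for (i = 0; i < NELEM; ++i) {
        d->elem[i] += s->elem[i];
    }
}
```

Because every access goes through the loop index bounded by NELEM, there is no hand-written constant index such as elem[8] for the compiler to check against the array size.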