https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123149
Bug ID: 123149
Summary: Missed optimization: GCC doesn't generate AVX512VL
rotate instructions for SSE2/AVX2 or/shift/shift
Product: gcc
Version: 15.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: lloyd at randombit dot net
Target Milestone: ---
SSE2 and AVX2 have no bitwise rotate instructions, so a rotate has to be written
as an OR (or XOR) of two shifts, for example:
__m128i rot5(__m128i v) {
    return _mm_xor_si128(_mm_slli_epi32(v, 5), _mm_srli_epi32(v, 27));
}
__m256i rot5(__m256i v) {
    return _mm256_xor_si256(_mm256_slli_epi32(v, 5), _mm256_srli_epi32(v, 27));
}
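The XOR form above and the OR form named in the summary compute the same value, because the two shifted halves never share a set bit. A minimal scalar sketch of one 32-bit lane illustrates this; the names `rotl5_or` and `rotl5_xor` are hypothetical and are not part of the report:

```cpp
#include <cstdint>

// Rotate a 32-bit value left by 5, combining the two halves with OR.
constexpr uint32_t rotl5_or(uint32_t x) { return (x << 5) | (x >> 27); }

// Same rotate, combining the two halves with XOR as in the report's rot5.
constexpr uint32_t rotl5_xor(uint32_t x) { return (x << 5) ^ (x >> 27); }

// (x << 5) always has its low 5 bits clear, and (x >> 27) can only have
// its low 5 bits set, so the operands are disjoint and OR equals XOR.
static_assert(rotl5_or(0x80000001u) == 0x00000030u, "rotate by 5");
static_assert(rotl5_or(0x80000001u) == rotl5_xor(0x80000001u), "OR == XOR");
```

A pattern matcher for the rotate idiom therefore presumably has to accept either combining operation.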
However, when compiling with AVX-512VL support, each of the above functions can
instead be implemented with a single vprold. Clang performs this optimization;
GCC does not.
Godbolt link: https://godbolt.org/z/hTveKrv7e
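For comparison, the rotate can be requested explicitly instead of relying on the compiler to recognize the idiom. The following is only a sketch of that source-level workaround, assuming an x86-64 target and the `_mm_rol_epi32` intrinsic from `<immintrin.h>`, which requires AVX-512F and AVX-512VL. The names `rot5_explicit` and `rotl32_by5` are hypothetical, and the fallback branch simply repeats the report's SSE2 idiom so the code still builds without `-mavx512vl`:

```cpp
#include <immintrin.h>
#include <cstdint>

// Hypothetical helper: rotate each 32-bit lane left by 5.
static inline __m128i rot5_explicit(__m128i v) {
#if defined(__AVX512F__) && defined(__AVX512VL__)
    // AVX-512VL enabled at compile time: ask for the rotate directly.
    return _mm_rol_epi32(v, 5);
#else
    // Baseline SSE2: the shift/shift/xor idiom from the report.
    return _mm_xor_si128(_mm_slli_epi32(v, 5), _mm_srli_epi32(v, 27));
#endif
}

// Scalar reference for a single lane, used only to check the result.
static inline uint32_t rotl32_by5(uint32_t x) {
    return (x << 5) | (x >> 27);
}
```

The 256-bit analogue would use `_mm256_rol_epi32` in the same way. This does not address the missed optimization itself, which concerns code already written with the shift idiom.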