https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123149

            Bug ID: 123149
           Summary: Missed optimization: GCC doesn't generate AVX512VL
                    rotate instructions for SSE2/AVX2 or/shift/shift
           Product: gcc
           Version: 15.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lloyd at randombit dot net
  Target Milestone: ---

SSE2 and AVX2 have no vector rotate instructions, so a rotate has to be written
as an OR (or XOR) of two shifts, for example:

__m128i rot5(__m128i v) {
    return _mm_xor_si128(_mm_slli_epi32(v, 5), _mm_srli_epi32(v, 27));
}

__m256i rot5(__m256i v) {
    return _mm256_xor_si256(_mm256_slli_epi32(v, 5), _mm256_srli_epi32(v, 27));
}

However, when compiling with AVX-512VL enabled, each of the above functions can
instead be implemented with a single vprold. Clang performs this optimization;
GCC does not.
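
For reference, here is a minimal scalar sketch of why the shift/shift/XOR idiom is a rotate at all. With shift counts 5 and 27 (summing to 32) the two shifted values occupy disjoint bit positions, so XOR and OR give the same result, and that result is a rotate-left by 5 of each 32-bit lane. The helper names rot5_xor and rot5_or are made up for this illustration; they model a single lane and are not part of the testcase.

```cpp
#include <cstdint>

// One 32-bit lane of the testcase: shift left 5, shift right 27, combine with XOR.
constexpr std::uint32_t rot5_xor(std::uint32_t x) {
    return (x << 5) ^ (x >> 27);
}

// Same thing combined with OR, the more common spelling of a rotate.
constexpr std::uint32_t rot5_or(std::uint32_t x) {
    return (x << 5) | (x >> 27);
}

// The low 5 bits of (x << 5) are zero and (x >> 27) only has its low 5 bits
// set, so the operands never overlap and XOR behaves exactly like OR.
static_assert(rot5_xor(0x80000001u) == 0x00000030u, "bit 31 wraps to bit 4, bit 0 moves to bit 5");
static_assert(rot5_xor(0xDEADBEEFu) == rot5_or(0xDEADBEEFu), "XOR and OR forms agree");
```

With AVX-512VL the same per-lane operation is available directly through the _mm_rol_epi32 and _mm256_rol_epi32 intrinsics, which is the form a compiler would presumably be expected to match the pattern to.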

Godbolt link: https://godbolt.org/z/hTveKrv7e
