https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79830

--- Comment #4 from Petr <kobalicek.petr at gmail dot com> ---
I think the test-case can be simplified to the following code. It still suffers
from the same issues as mentioned above.

#include <stddef.h>  // size_t
#include <stdint.h>  // intptr_t

#if defined(_MSC_VER)
# include <intrin.h>
#else
# include <x86intrin.h>
#endif

void transform(double* dst, const double* src, const double* matrix, size_t length) {
  intptr_t i = static_cast<intptr_t>(length);
  while ((i -= 2) >= 0) {
    __m256d s0 = _mm256_loadu_pd(src);
    _mm256_storeu_pd(dst, _mm256_add_pd(s0, s0));

    dst += 4;
    src += 4;
  }

  if (i & 1) {
    __m128d s0 = _mm_loadu_pd(src);
    _mm_storeu_pd(dst, _mm_add_pd(s0, s0));
  }
}
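
For readers without an AVX target, the loop structure above can be sketched in scalar form. This is only an illustration, not part of the original report: the name transform_scalar is hypothetical, the unused matrix parameter is dropped, and it assumes (as the pointer increments suggest) that length counts pairs of doubles. The main loop handles two pairs (4 doubles) per iteration, and the tail handles one remaining pair when length is odd.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference for the simplified test case.
// Assumption: `length` counts pairs of doubles, so the buffers
// hold 2 * length doubles.
void transform_scalar(double* dst, const double* src, std::size_t length) {
  std::intptr_t i = static_cast<std::intptr_t>(length);

  // Main loop: two pairs (4 doubles) per iteration, mirroring the
  // 256-bit load/add/store in the original.
  while ((i -= 2) >= 0) {
    for (int k = 0; k < 4; k++)
      dst[k] = src[k] + src[k];
    dst += 4;
    src += 4;
  }

  // Here i is -1 (odd length) or -2 (even length); the low bit
  // selects the one-pair tail, mirroring the 128-bit path.
  if (i & 1) {
    for (int k = 0; k < 2; k++)
      dst[k] = src[k] + src[k];
  }
}
```

With length == 3, for example, the loop body runs once (4 doubles) and the tail runs once (2 doubles), so exactly 6 doubles are written.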
