https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69908

--- Comment #6 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Yuri Gribov from comment #5)
> Well, as we all know there are a lot of missing optimizations in GCC :) I
> think the real question is whether it's ever going to be fixed if there's no
> standard API for this code pattern which we can recognize as builtin.
> 
> I believe the answer is "No". ATM GCC does not vectorize even the simplest
> memcpy equivalent code:
>   // gcc tmp.c -O3 -mtune=native -ftree-vectorize -o- -S
>   void memcpy_(char * __restrict a, char * __restrict b, unsigned n) {
>     unsigned i;
>     for (i = 0; i < n; ++i)
>       a[i] = b[i];
>   }

Please look again. The loop distribution pass (ldist) turns this loop into a
call to memcpy. And if you disable ldist, the loop does get vectorized.
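
For anyone who wants to check this, here is a rough sketch of the two
compilations to compare. It assumes that -fno-tree-loop-distribute-patterns is
the right switch for turning off the memcpy pattern recognition in ldist, and
that -fopt-info-vec reports the vectorized loop; exact output depends on the
GCC version and target. check_copy is a hypothetical helper added only for
illustration, to confirm the loop behaves like memcpy.

```c
/* Sketch only; flags as I understand them:
 *   gcc -O3 -S copy.c
 *       -> ldist should replace the loop with a call to memcpy
 *   gcc -O3 -fno-tree-loop-distribute-patterns -fopt-info-vec -S copy.c
 *       -> the loop should stay and be reported as vectorized
 */
#include <assert.h>
#include <string.h>

void memcpy_(char *__restrict a, char *__restrict b, unsigned n)
{
    unsigned i;
    for (i = 0; i < n; ++i)
        a[i] = b[i];
}

/* Hypothetical helper (not from the bug report): check that the loop
 * copies exactly the bytes that memcpy would, and nothing beyond n. */
int check_copy(void)
{
    char src[64], got[64], want[64];
    unsigned i;

    for (i = 0; i < sizeof src; ++i)
        src[i] = (char)(i * 7 + 1);
    memset(got, 0, sizeof got);
    memset(want, 0, sizeof want);

    memcpy_(got, src, 50);
    memcpy(want, src, 50);

    assert(memcmp(got, want, sizeof got) == 0);
    return 1;
}
```

Comparing the two assembly outputs should show a memcpy call in the first case
and vector loads/stores in the second.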
