On Fri, Jan 08, 2016 at 01:27:08PM -0800, H.J. Lu wrote:
> > That is just wrong and will severely pessimize correct code.
> > Please don't waste time on that.
> >
>
> Do you have an example to show it will pessimize correct code?

Anything where the compiler can't figure out alignment info and you use
the aligned functions, starting from trivial tests like:

void foo (float *p, __m256 q)
{
  _mm256_store_ps (p, q);
}

etc.  The SSE*/AVX* docs say clearly that if the pointer argument is not
properly aligned, a #GP is generated, and you need to use the
*storeu*/*loadu* intrinsics instead.

Generally, the GCC middle-end does not infer alignment info from the mere
existence of pointers, but from memory accesses - and the
_mm*_{load,store}* intrinsics count as memory accesses, but they are
represented as builtins that take a pointer argument, and only at the RTL
level is the memory load or store visible in the IL.  That is the reason
for the align_mem stuff: there are no MEM_REFs at the GIMPLE level, the
MEMs are created when expanding those intrinsics, and for intrinsics where
the user asserts a certain alignment, the align_mem stuff is exactly what
lets the optimizers know about the user's choice.

Otherwise there would really be no difference between using
_mm256_store_ps and _mm256_storeu_ps.  The only difference between them is
the assertion that, in correct programs, the memory is properly aligned in
the non-u versions.

	Jakub
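
A minimal sketch of the contract described above, assuming an x86-64
target. It uses the 128-bit SSE counterparts _mm_store_ps and
_mm_storeu_ps rather than the AVX ones so that it builds without extra
flags. The helper names (is_aligned16, store_aligned, store_any,
store_checked) are made up for illustration and do not come from GCC.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_store_ps, _mm_storeu_ps */

/* Hypothetical helper: nonzero if p is 16-byte aligned.  */
static int
is_aligned16 (const void *p)
{
  return ((uintptr_t) p & 15u) == 0;
}

/* Aligned store: the caller asserts that p is 16-byte aligned.
   The assert only checks that promise in debug builds; with a
   misaligned p the store itself may raise #GP.  */
static void
store_aligned (float *p, __m128 q)
{
  assert (is_aligned16 (p));
  _mm_store_ps (p, q);
}

/* Unaligned store: no alignment promise, valid for any p.  */
static void
store_any (float *p, __m128 q)
{
  _mm_storeu_ps (p, q);
}

/* Hypothetical wrapper for callers that cannot make the promise
   statically: check at run time and pick the matching intrinsic.  */
static void
store_checked (float *p, __m128 q)
{
  if (is_aligned16 (p))
    store_aligned (p, q);
  else
    store_any (p, q);
}
```

With a 16-byte-aligned buffer buf, store_aligned (buf, q) and
store_any (buf + 1, q) are both fine, while store_aligned (buf + 1, q)
breaks the promise made by choosing the non-u intrinsic.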