https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609
Bug ID: 67609
Summary: [Regression] Generates wrong code for SSE2 _mm_load_pd
Product: gcc
Version: 5.2.1
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: regression
Assignee: unassigned at gcc dot gnu.org
Reporter: bisqwit at iki dot fi
Target Milestone: ---

For this program (needs -msse2 to compile):

    #include <emmintrin.h>

    __m128d reg;

    void set_lower(double b)
    {
        double v[2];
        _mm_store_pd(v, reg);
        v[0] = b;
        reg = _mm_load_pd(v);
    }

At optimization levels -O1 and up, GCC 5.2 incorrectly generates code that
destroys the upper half of reg:

    movapd  %xmm0, %xmm1
    movaps  %xmm1, reg(%rip)

At -O0, the bug does not occur. If the index expression is changed into an
expression whose value is not known at compile time, the code works properly.

GCC 4.9 does this correctly (if with a bit too much labor):

    movdqa  reg(%rip), %xmm1
    movaps  %xmm1, -24(%rsp)
    movsd   %xmm0, -24(%rsp)
    movapd  -24(%rsp), %xmm2
    movaps  %xmm2, reg(%rip)

For comparison, Clang 3.4 and 3.5:

    movlpd  %xmm0, reg(%rip)

For comparison, Clang 3.6:

    movaps  reg(%rip), %xmm1
    movsd   %xmm0, %xmm1
    movaps  %xmm1, reg(%rip)
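
As a possible way to sidestep the store/modify/load round trip through v, the lower lane can be replaced directly in the register. The sketch below is not part of the original report: the names set_lower_move_sd and upper_half_preserved are hypothetical, and whether this form avoids the miscompilation on an affected GCC 5.2 build has not been verified here. It only illustrates the intended semantics (replace lane 0, keep lane 1), which corresponds to the movsd sequence Clang 3.6 emits.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; needs -msse2 on 32-bit x86 */

__m128d reg;

/* Hypothetical alternative to set_lower() from the report.
 * _mm_set_sd(b) builds {low = b, high = 0.0}; _mm_move_sd() takes the
 * low lane from its second argument and the high lane from its first,
 * so only the lower half of reg is replaced. */
void set_lower_move_sd(double b)
{
    reg = _mm_move_sd(reg, _mm_set_sd(b));
}

/* Hypothetical self-check: returns 1 if the low lane was updated and
 * the high lane survived, 0 otherwise. */
int upper_half_preserved(void)
{
    double out[2];

    reg = _mm_set_pd(2.0, 1.0);   /* high lane = 2.0, low lane = 1.0 */
    set_lower_move_sd(5.0);
    _mm_storeu_pd(out, reg);      /* unaligned store: out[0] = low, out[1] = high */

    return out[0] == 5.0 && out[1] == 2.0;
}
```

Calling upper_half_preserved() at each optimization level (-O0 through -O3) gives a quick runtime check; the same checker can be pointed at the original set_lower() to confirm whether a given compiler build is affected.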