https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83203
Bug ID: 83203
Summary: Inefficient int to avx2 vector conversion
Product: gcc
Version: 7.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zoltan at hidvegi dot com
Target Milestone: ---
Target: x86_64-*-*
#include <immintrin.h>
__m256i foo(long x) { return (__m256i){x}; }
gcc -mavx2 -O2 generates:
  20: c5 f9 ef c0          vpxor   %xmm0,%xmm0,%xmm0
  24: c4 e3 f9 22 c7 00    vpinsrq $0x0,%rdi,%xmm0,%xmm0
  2a: c5 f9 6f c0          vmovdqa %xmm0,%xmm0
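It should just use vmovq %rdi,%xmm0. The VEX-encoded vmovq already zeroes bits 64-255 of %ymm0, so both the vpxor/vpinsrq pair and the trailing vmovdqa are redundant.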
A workaround is to use _mm256_castsi128_si256((__m128i){x}).
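
For reference, a minimal sketch of the semantics the generated code has to preserve: a vector compound literal with fewer initializers than lanes zero-fills the rest, which is why lanes 1..3 must end up zero. The names v4di_t, foo_lanes, foo_zero_extends and foo_workaround are made up for this sketch and are not part of the testcase. v4di_t stands in for __m256i via GCC's generic vector extension so the first part builds without -mavx2. The part guarded by __AVX2__ repeats the testcase and the workaround as they would be compiled with -mavx2.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for __m256i: four 64-bit lanes, declared with GCC's generic
   vector extension. The name v4di_t is an assumption of this sketch. */
typedef long long v4di_t __attribute__((vector_size(32)));

/* Same initializer as foo() in the report. Only lane 0 is named, so the
   compiler has to zero lanes 1..3. The lanes are copied out through
   memory so that no 256-bit value crosses a function boundary. */
static void foo_lanes(long x, long long out[4])
{
    assert(out != NULL);
    v4di_t v = (v4di_t){x};
    memcpy(out, &v, sizeof v);
}

/* Returns 1 when lane 0 holds x and the other three lanes are zero. */
static int foo_zero_extends(long x)
{
    long long lanes[4];
    foo_lanes(x, lanes);
    return lanes[0] == x && lanes[1] == 0 && lanes[2] == 0 && lanes[3] == 0;
}

#ifdef __AVX2__
#include <immintrin.h>

/* Testcase from the report. */
__m256i foo(long x) { return (__m256i){x}; }

/* Workaround from the report. Note that the intrinsic is documented as
   leaving the upper 128 bits undefined, so this relies on the compiler
   emitting a zeroing vmovq. */
__m256i foo_workaround(long x)
{
    return _mm256_castsi128_si256((__m128i){x});
}
#endif
```

Comparing the output of gcc -mavx2 -O2 -S for foo and foo_workaround should show the difference described above.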