https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85572
            Bug ID: 85572
           Summary: faster code for absolute value of __v2di
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kretz at kde dot org
  Target Milestone: ---

The absolute value for 64-bit integer SSE vectors is only optimized when
AVX512VL is available.

Test case (`-O2 -ffast-math` and one of -mavx512vl, -msse4, or -msse2):

    #include <x86intrin.h>
    __v2di abs(__v2di x) { return x < 0 ? -x : x; }

With SSE4 I suggest:

    abs(long long __vector(2)):
      pxor %xmm1, %xmm1
      pcmpgtq %xmm0, %xmm1
      pxor %xmm1, %xmm0
      psubq %xmm1, %xmm0
      ret

in C++:

    auto neg = reinterpret_cast<__v2di>(x < 0);
    return (x ^ neg) - neg;

Without SSE4:

    abs(long long __vector(2)):
      movdqa %xmm0, %xmm2
      pxor %xmm1, %xmm1
      psrlq $63, %xmm2
      psubq %xmm2, %xmm1
      pxor %xmm1, %xmm0
      paddq %xmm2, %xmm0
      ret

in C++:

    auto neg = -reinterpret_cast<__v2di>(reinterpret_cast<__v2du>(x) >> 63);
    return (x ^ neg) - neg;

related issue for scalars: #67510
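
For reference, a minimal sketch that checks both suggested C++ forms against a lane-wise scalar absolute value. It assumes the GCC/Clang vector extensions. The typedefs `v2di` and `v2du` are local stand-ins for `__v2di` and `__v2du`, and the function names (`abs_reference`, `abs_sse4_style`, `abs_sse2_style`, `same`, `self_check`) are made up for this illustration. It only checks that the results agree; it says nothing about which instructions the compiler emits.

```cpp
#include <cassert>
#include <climits>

// Local stand-ins (assumption) for __v2di / __v2du from <x86intrin.h>,
// declared with the GCC/Clang vector_size attribute.
typedef long long v2di __attribute__((vector_size(16)));
typedef unsigned long long v2du __attribute__((vector_size(16)));

// Lane-wise scalar reference, equivalent to the test case in the report.
inline v2di abs_reference(v2di x) {
    v2di r = {x[0] < 0 ? -x[0] : x[0], x[1] < 0 ? -x[1] : x[1]};
    return r;
}

// Form suggested for SSE4: neg is all-ones in negative lanes, else zero.
inline v2di abs_sse4_style(v2di x) {
    v2di zero = {0, 0};
    v2di neg = reinterpret_cast<v2di>(x < zero);
    return (x ^ neg) - neg;  // (x ^ -1) - (-1) == -x, (x ^ 0) - 0 == x
}

// Form suggested without SSE4: derive the same mask from the sign bit.
inline v2di abs_sse2_style(v2di x) {
    v2di neg = -reinterpret_cast<v2di>(reinterpret_cast<v2du>(x) >> 63);
    return (x ^ neg) - neg;
}

// True if both lanes compare equal.
inline bool same(v2di a, v2di b) { return a[0] == b[0] && a[1] == b[1]; }

// Compare both forms with the reference on a few inputs. LLONG_MIN is
// left out because its absolute value overflows in every variant.
inline void self_check() {
    const v2di inputs[] = {{5, -7}, {0, -1}, {LLONG_MAX, -LLONG_MAX}};
    for (const v2di &x : inputs) {
        assert(same(abs_sse4_style(x), abs_reference(x)));
        assert(same(abs_sse2_style(x), abs_reference(x)));
    }
}
```

Calling `self_check()` from a small `main` and building with `-O2` plus `-msse2` or `-msse4` lets the generated code be compared with the sequences above.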