https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82793
Bug ID: 82793
Summary: __attribute__((target("sse"))) causes call through ifunc
Product: gcc
Version: 7.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: vorfeed.canal at gmail dot com
Target Milestone: ---

The following example illustrates the problem:

#include "xmmintrin.h"

extern __attribute__((target("avx"),visibility("hidden")))
__m128 foo(__m128 a, __m128 b);
extern __attribute__((target("sse4.2"),visibility("hidden")))
__m128 foo(__m128 a, __m128 b);
extern __attribute__((target("default"),visibility("hidden")))
__m128 foo(__m128 a, __m128 b);

__attribute__((target("sse4.2")))
__m128 bar(__m128 a, __m128 b) {
  return foo(a, b);
}

__attribute__((target("avx")))
__m128 bar(__m128 a, __m128 b) {
  return foo(a, b);
}

__attribute__((target("default")))
__m128 bar(__m128 a, __m128 b) {
  return foo(a, b);
}

All versions of GCC that I have tried produce the following code:

...
        .type   _Z3barDv4_fS_.avx, @function
_Z3barDv4_fS_.avx:
.LFB526:
        .cfi_startproc
        jmp     _Z3fooDv4_fS_.avx
        .cfi_endproc
...
        .type   _Z3barDv4_fS_.sse4.2, @function
_Z3barDv4_fS_.sse4.2:
.LFB525:
        .cfi_startproc
        jmp     _Z19_Z3fooDv4_fS_.ifuncDv4_fS_
        .cfi_endproc

That is: the AVX version of bar calls the AVX version of foo directly, while the SSE4.2 version of bar reaches foo via the ifunc resolver instead of calling the SSE4.2 version of foo directly. Needless to say, this kills the performance quite thoroughly, and performance IS the reason to use function multiversioning with AVX/SSE attributes!
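The test case above only declares foo, so it can be compiled to assembly but not linked or run. For anyone who wants to experiment with it, here is a minimal self-contained sketch, assuming an x86-64 Linux toolchain with ifunc support. The function bodies (a plain _mm_add_ps) and the helper first_lane_of_bar are hypothetical and not part of the report, and the visibility("hidden") attribute is left out because everything lives in one translation unit. Whether this variant shows the same code generation as the extern-only case is not confirmed.

```cpp
#include <xmmintrin.h>

// Hypothetical bodies: the report only declares foo, so a lane-wise add
// stands in for the real work in every version.
__attribute__((target("default")))
__m128 foo(__m128 a, __m128 b) { return _mm_add_ps(a, b); }

__attribute__((target("sse4.2")))
__m128 foo(__m128 a, __m128 b) { return _mm_add_ps(a, b); }

__attribute__((target("avx")))
__m128 foo(__m128 a, __m128 b) { return _mm_add_ps(a, b); }

// The three versions of bar, with the same bodies as in the report.
__attribute__((target("default")))
__m128 bar(__m128 a, __m128 b) { return foo(a, b); }

__attribute__((target("sse4.2")))
__m128 bar(__m128 a, __m128 b) { return foo(a, b); }

__attribute__((target("avx")))
__m128 bar(__m128 a, __m128 b) { return foo(a, b); }

// Hypothetical helper (not in the report): calls bar from ordinary,
// non-versioned code and returns lane 0 of the result.
float first_lane_of_bar(float x, float y) {
    __m128 r = bar(_mm_set1_ps(x), _mm_set1_ps(y));
    return _mm_cvtss_f32(r);
}
```

Compiling this to assembly (for example with g++ -O2 -S; the report does not state which flags were used) and looking at the call or jump target in each bar variant should show whether foo is reached directly or through the resolver. Because the definitions are visible here, the compiler may also inline foo into bar, in which case the extern-only form from the report remains the one to inspect.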