https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83411
Bug ID: 83411
Summary: function multiversioning should clone the entire sub-callgraph
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: h2+bugs at fsfe dot org
Target Milestone: ---

[is this component entry correct?]

The documentation of FMV states:

    This aim of this project is to make it really easy for the developer to specify multiple versions of a function, each catered to a specific target ISA feature. GCC then takes care of creating the dispatching code necessary to execute the right function version.

This sounds really cool, but in practice there is a huge problem: FMV does not apply to nested function calls. This forces the developer to move the FMV to the bottom of the callgraph, incurring a possibly huge run-time penalty because the dispatch is called ridiculously often. I have described this problem in detail on my blog¹, but I think it should be quite evident to you.

I would humbly suggest adding something like

    __attribute__((target_clone_trees("default", "popcnt")))

that recursively also clones the nested function calls (without additional dispatch steps, and working regardless of whether the calls are inlined). For optimisations that are then generated automatically or by builtins, this would already solve all my problems :)

For manually curated SIMD code, however, this touches on another problem with current FMV: one cannot rely on macros for feature detection. To fix this, one needs another mechanism for finding out which features are available for the code block currently being compiled. I am not a compiler expert, but I guess it is not possible to make macros work, simply because they are evaluated much earlier. But could the compiler provide constexpr feature variables to the code? Then one could simply "clone_tree" early and "if constexpr (__has_sse4) ... else ..." in the actual function.

Thanks for reading all of this and for your work on GCC in general!

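A minimal sketch of the dispatch problem described above, using the existing target_clones attribute. The function names (count_bits, count_bits_fmv, total_bits_top, total_bits_bottom) and the FMV_CLONES macro are made up for illustration, and noinline is only used here to model callees that the compiler does not inline. The attribute is enabled only on x86-64 with glibc, which is an assumption about where ifunc-based dispatch is available.

```cpp
#include <cstddef>
#include <cstdint>

// Enable target_clones only where it is expected to work (x86-64, glibc
// ifunc support, and a compiler that knows the attribute). Elsewhere the
// macro expands to nothing, so the sketch compiles but makes no clones.
#if defined(__x86_64__) && defined(__GLIBC__) && defined(__has_attribute)
#  if __has_attribute(target_clones)
#    define FMV_CLONES __attribute__((target_clones("default", "popcnt")))
#  endif
#endif
#ifndef FMV_CLONES
#  define FMV_CLONES
#endif

// Leaf helper without FMV: compiled once, for the default target only.
__attribute__((noinline))
unsigned count_bits(std::uint64_t v) {
    return static_cast<unsigned>(__builtin_popcountll(v));
}

// FMV at the top of the callgraph: dispatch is resolved once for this
// function, but the popcnt clone still calls the default-target
// count_bits unless that call happens to be inlined.
FMV_CLONES
unsigned long long total_bits_top(const std::uint64_t* data, std::size_t n) {
    unsigned long long sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += count_bits(data[i]);
    return sum;
}

// FMV at the bottom of the callgraph: the leaf gets a popcnt clone, but
// every loop iteration now calls through the dispatch mechanism.
FMV_CLONES __attribute__((noinline))
unsigned count_bits_fmv(std::uint64_t v) {
    return static_cast<unsigned>(__builtin_popcountll(v));
}

unsigned long long total_bits_bottom(const std::uint64_t* data, std::size_t n) {
    unsigned long long sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += count_bits_fmv(data[i]);
    return sum;
}
```

Both variants compute the same result. The difference is only in which code gets the popcnt-specific version and how often the dispatch is involved, which is what a tree-cloning attribute would be meant to resolve.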
¹ https://hannes.hauswedell.net/post/2017/12/09/fmv/
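
The constexpr feature variables suggested in the report do not exist in GCC. As a rough approximation of the idea with what is available today, the feature flag can be passed as a template parameter and the dispatch written by hand. This is only a sketch under assumptions: it needs C++17 for "if constexpr", the names popcount_impl, popcount_with_popcnt, popcount_default and popcount_dispatch are hypothetical, and the HasPopcnt parameter merely stands in for something like the imagined "__has_sse4".

```cpp
#include <cstdint>

// HasPopcnt is a hypothetical stand-in for a compiler-provided constexpr
// feature variable; here the caller has to supply it explicitly.
template <bool HasPopcnt>
inline unsigned popcount_impl(std::uint64_t v) {
    if constexpr (HasPopcnt) {
        // Feature-specific branch: the builtin may become a single popcnt
        // instruction when compiled for a target that has it.
        return static_cast<unsigned>(__builtin_popcountll(v));
    } else {
        // Portable fallback: clear the lowest set bit until none remain.
        unsigned c = 0;
        while (v) { v &= v - 1; ++c; }
        return c;
    }
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#  define SKETCH_HAS_X86_DISPATCH 1
// Entry point compiled with popcnt enabled; selects the "true" branch.
__attribute__((target("popcnt")))
unsigned popcount_with_popcnt(std::uint64_t v) { return popcount_impl<true>(v); }
#endif

// Entry point for the default target; selects the fallback branch.
unsigned popcount_default(std::uint64_t v) { return popcount_impl<false>(v); }

// Hand-written dispatch: the part that FMV would generate automatically.
unsigned popcount_dispatch(std::uint64_t v) {
#ifdef SKETCH_HAS_X86_DISPATCH
    if (__builtin_cpu_supports("popcnt")) return popcount_with_popcnt(v);
#endif
    return popcount_default(v);
}
```

Unlike a macro check, the branch is selected per instantiation rather than once per translation unit, but the entry points and the dispatch still have to be maintained manually for every feature.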