Hi,

User directed Function Multiversioning (MV) via Function Overloading
====================================================================

I have created a set of patches to add support for user directed
function MV via function overloading. This was discussed in this
thread previously:
http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02285.html

Two patches have been created now to support this:

* The patch with the front-end changes to support versioned functions
  is: http://codereview.appspot.com/5752064/
* The patch to add runtime CPU type detection support is here:
  http://codereview.appspot.com/5754058/

With this support, here is an example of writing a program with
function versions:

  int foo ();  /* Default version */
  int foo () __attribute__ ((targetv("arch=corei7")));  /* Specialized for corei7 */
  int foo () __attribute__ ((targetv("arch=amdfam10")));  /* Specialized for amdfam10 */

  int main ()
  {
    int (*p)() = &foo;
    return foo () + (*p)();
  }

  int foo ()
  {
    return 0;
  }

  int __attribute__ ((targetv("arch=corei7")))
  foo ()
  {
    ...
    return 0;
  }

  int __attribute__ ((targetv("arch=amdfam10")))
  foo ()
  {
    ...
    return 0;
  }

The above example has foo defined 3 times, but all 3 definitions of
foo are different versions of the same function. The calls to foo in
main, directly and via a pointer, are calls to the multi-versioned
function foo, which is dispatched to the right foo at run-time.

Function versions must have the same signature but must differ in the
specifier string provided to a new attribute called "targetv", which
is nothing but the target attribute with an extra specification to
indicate a version. Any number of versions can be created using the
targetv attribute, but it is mandatory to have one function without
the attribute, which is treated as the default version.

The front-end support is available in this patch:
http://codereview.appspot.com/5752064/

The front-end treats multiple definitions of foo with the same
signature but with different targetv attributes as legitimate
candidates for overloading. Also, all the function versions of one
function are grouped together.

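As a rough illustration of the run-time selection described above,
here is a portable sketch. It is not what the patches generate. The
three versions are modeled as separately named functions, and a
hypothetical resolver picks one from a CPU name string. The names
foo_default, foo_corei7, foo_amdfam10, foo_resolver and call_foo_twice
are made up for this illustration. The return values 1 and 2 are
chosen only so the versions can be told apart, and the CPU name is
passed in as a string here only to keep the sketch portable.

```c
#include <string.h>

/* Hypothetical stand-ins for the three versions of foo. */
static int foo_default (void)  { return 0; }
static int foo_corei7 (void)   { return 1; }
static int foo_amdfam10 (void) { return 2; }

typedef int (*foo_fn) (void);

/* Hypothetical resolver: map a CPU name to a version and fall back
   to the default version when no specialized version matches.  */
static foo_fn
foo_resolver (const char *cpu)
{
  if (strcmp (cpu, "corei7") == 0)
    return foo_corei7;
  if (strcmp (cpu, "amdfam10") == 0)
    return foo_amdfam10;
  return foo_default;
}

/* Mirrors main in the example: one direct call and one call via a
   pointer, both going to the version the resolver selected.  */
static int
call_foo_twice (const char *cpu)
{
  foo_fn foo = foo_resolver (cpu);
  foo_fn p = foo;
  return foo () + (*p) ();
}
```

The fallback branch in the sketch corresponds to the mandatory
default version: a platform that matches none of the specialized
versions still has something to call.
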
Calls to foo and pointer accesses to foo are then replaced by an IFUNC
function (foo.ifunc), which calls the dispatcher code at run-time to
figure out the right version to execute. For the above example, the
following functions will be created:

* _Z3foov.ifunc : ifunc dispatcher for multi-versioned function foo,
  which aliases to _Z3foov.resolver. All calls and pointer accesses to
  foo are replaced by a call or pointer access to this function.
* _Z3foov.resolver : The code to determine which version to execute at
  run-time.
* _Z3foov : The default version of foo.
* _Z3foov.arch_corei7 : The corei7 version of foo.
* _Z3foov.arch_amdfam10 : The amdfam10 version of foo.

Note that using IFUNC blocks inlining of versioned functions. I had
implemented an optimization earlier to do hot path cloning to allow
versioned functions to be inlined. Please see:
http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02285.html
In the next iteration, I plan to merge these two. With that, hot code
paths with versioned functions will be cloned so that versioned
functions can be inlined.

The version dispatch itself happens in a newly created pass, added as
one of the initial lowering passes. The pass communicates with the
target to determine the appropriate predicates to use to figure out
which version to dispatch at run-time. The predicates are target
builtins which determine the platform type at run-time and are added
in this patch: http://codereview.appspot.com/5754058/

The following features are being developed for the next iteration:

1) Support for hot path cloning to inline versioned functions.

2) Specifying multiple versions in a single function definition. This
   will be done using the following syntax:

     int foo () __attribute__ ((targetv (("arch=corei7"),("arch=amdfam10"), ("arch=core2"))));

   which means the same body of foo must be cloned for corei7,
   amdfam10, and core2.

3) Specifying ISA types in the attribute. Only "arch=" is supported now.
   For example,

     int foo () __attribute__ ((targetv ("popcnt,ssse3")));

   means the version is only to be executed when popcount and ssse3
   instructions are available.

4) Other dispatching mechanisms. IFUNC is used for dispatch now, but
   when the target does not support it, dispatching by directly
   calling the appropriate function version after checking the
   platform type will be supported.

5) Virtual function versioning.

Thoughts?

Thanks,
-Sri.