On Wed, Mar 7, 2012 at 1:42 AM, Sriraman Tallam <tmsri...@google.com> wrote:
> Hi,
>
> User directed Function Multiversioning (MV) via Function Overloading
> ===================================================
>
> I have created a set of patches to add support for user directed
> function MV via function overloading. This was discussed in this
> thread previously:
> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02285.html
>
> Two patches have been created now to support this:
> * The patch with the front-end changes to support versioned functions is:
> http://codereview.appspot.com/5752064/
>
> * The patch to add runtime CPU type detection support is here:
> http://codereview.appspot.com/5754058/

Please post the patches to gcc-patches.

> With this support, here is an example of writing a program with
> function versions:
>
> int foo ();  /* Default version */
> int foo () __attribute__ ((targetv("arch=corei7"))); /* Specialized for corei7 */
> int foo () __attribute__ ((targetv("arch=amdfam10"))); /* Specialized for amdfam10 */

I don't like specifying 'arch' at all.  Instead you _always_ want
architecture feature tests, not architecture tests.  Because, does
amdfam10 also cover bdver1?  [It can't!  bdver1 no longer has
3dnow!, but that's entirely surprising for a user.]  Thus, only allow
feature specifications.

[Why not re-use the existing 'target' attribute?]

I'll have a look at the patches once posted.

Richard.

>
> int main ()
> {
>   int (*p)() = &foo;
>   return foo () + (*p)();
> }
>
> int foo ()
> {
>   return 0;
> }
>
> int __attribute__ ((targetv("arch=corei7")))
> foo ()
> {
>   ...
>   return 0;
> }
>
> int __attribute__ ((targetv("arch=amdfam10")))
> foo ()
> {
>   ...
>   return 0;
> }
>
> The above example has foo defined 3 times, but all 3 definitions of
> foo are different versions of the same function. The calls to foo in
> main, directly and via a pointer, are calls to the multi-versioned
> function foo which is dispatched to the right foo at run-time.
>
> Function versions must have the same signature but must differ in the
> specifier string provided to a new attribute called "targetv", which
> is nothing but the target attribute with an extra specification to
> indicate a version. Any number of versions can be created using the
> targetv attribute but it is mandatory to have one function without the
> attribute, which is treated as the default version. The front-end
> support is available in this patch:
> http://codereview.appspot.com/5752064/
>
> The front-end treats multiple definitions of foo with the same
> signature but with different targetv attributes as legitimate
> candidates for overloading.
> Also, all the function versions of one
> function are grouped together. Then, calls to foo and pointer access
> of foo will be replaced by an IFUNC function (foo.ifunc) which will
> call the dispatcher code at run-time to figure out the right version
> to execute. For the above example, the following functions will be
> created :
>
> * _Z3foov.ifunc : ifunc dispatcher for multi-versioned function foo and
>   aliases to _Z3foov.resolver. All calls and pointer accesses to foo are
>   replaced by a call or pointer access to this function.
> * _Z3foov.resolver : The code to determine which version to execute at
>   run-time.
> * _Z3foov : The default version of foo.
> * _Z3foov.arch_corei7 : The corei7 version of foo.
> * _Z3foov.arch_amdfam10 : The amdfam10 version of foo.
>
> Note that using IFUNC blocks inlining of versioned functions. I had
> implemented an optimization earlier to do hot path cloning to allow
> versioned functions to be inlined. Please see :
> http://gcc.gnu.org/ml/gcc-patches/2011-04/msg02285.html
> In the next iteration, I plan to merge these two. With that, hot code
> paths with versioned functions will be cloned so that versioned
> functions can be inlined.
>
> The version dispatch itself happens in a newly created pass added to
> be one of the initial lowering passes. The pass communicates with the
> target to determine the appropriate predicates to use to figure out
> which version to dispatch at run-time. The predicates are target
> builtins which determine the platform type at run-time and are added
> in this patch :
> http://codereview.appspot.com/5754058/
>
> The following features are being developed for the next iteration:
>
> 1) Support for hot path cloning to inline versioned functions.
> 2) Specifying multiple versions in a single function definition.
>
>   This will be done using the following syntax:
>   int foo ()
>   __attribute__ ((targetv (("arch=corei7"),("arch=amdfam10"), ("arch=core2"))));
>
>   which means the same body of foo must be cloned for corei7, amdfam10, and
>   core2.
>
> 3) Specifying ISA types in the attribute. Only "arch=" is supported now.
>
>   For example,
>   int foo ()
>   __attribute__ ((targetv ("popcnt,ssse3")));
>
>   means the version is only to be executed when popcount and ssse3
>   instructions are available.
>
> 4) Other dispatching mechanism.
>
>   IFUNC is used for dispatch, but when the target does not support this,
>   dispatching by directly calling the appropriate function version after
>   checking the platform type will be supported.
>
> 5) Virtual function versioning.
>
> Thoughts?
>
> Thanks,
> -Sri.
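
For illustration, a minimal sketch of the direct dispatch described in
item 4, keyed on feature tests ("popcnt,ssse3") rather than "arch="
tests. It does not use the proposed targetv attribute. The names
foo_default, foo_popcnt_ssse3, foo_fn and resolve_foo are hypothetical,
and __builtin_cpu_init / __builtin_cpu_supports are the x86 builtins
found in later GCC releases; they are an assumption here and may differ
from the predicates added by the patch above. On non-x86 targets the
sketch simply falls back to the default version.

```cpp
#include <cassert>

// Hypothetical stand-ins for two versions of foo. The distinct return
// values exist only so the selected version can be observed.
static int foo_default() { return 0; }
static int foo_popcnt_ssse3() { return 1; }

using foo_fn = int (*)();

// Hypothetical resolver: picks a version from CPU *feature* tests.
// The builtins are only used where GCC/Clang provide them (x86).
static foo_fn resolve_foo() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();  // set up the CPU feature data before querying it
  if (__builtin_cpu_supports("popcnt") && __builtin_cpu_supports("ssse3"))
    return &foo_popcnt_ssse3;
#endif
  return &foo_default;  // mandatory default version
}

// Direct dispatch without IFUNC: resolve once on first call, then
// forward every call through the cached function pointer.
int foo() {
  static const foo_fn selected = resolve_foo();
  assert(selected != nullptr);
  return selected();
}
```

Calling foo() here returns 1 on a CPU reporting both popcnt and ssse3
and 0 otherwise. Unlike IFUNC, the indirection is an ordinary function
pointer resolved on first use, so it needs no dynamic linker support.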