On Wed, Aug 17, 2011 at 6:37 PM, Xinliang David Li <davi...@google.com> wrote:
> On Wed, Aug 17, 2011 at 8:12 AM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Wed, Aug 17, 2011 at 4:52 PM, Xinliang David Li <davi...@google.com>
>> wrote:
>>> The gist of previous discussion is to use function overloading instead
>>> of exposing underlying implementation such as builtin_dispatch to the
>>> user. This new refined proposal has not changed in that, but is more
>>> elaborate on various use cases which have been carefully thought out.
>>> Please be specific on which part needs improvement.
>>
>> See below ...
>>
>>> Thanks,
>>>
>>> David
>>>
>>> On Wed, Aug 17, 2011 at 12:29 AM, Richard Guenther
>>> <richard.guent...@gmail.com> wrote:
>>>> On Tue, Aug 16, 2011 at 10:37 PM, Sriraman Tallam <tmsri...@google.com>
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> I am working on supporting function multi-versioning in GCC and here
>>>>> is a write-up on its usability.
>>>>>
>>>>> Multiversioning Usability
>>>>> =========================
>>>>>
>>>>> For a simple motivating example,
>>>>>
>>>>> int
>>>>> find_popcount(unsigned int i)
>>>>> {
>>>>>   return __builtin_popcount(i);
>>>>> }
>>>>>
>>>>> Currently, compiling this with -mpopcnt will result in the “popcnt”
>>>>> instruction being used and otherwise call a built-in generic
>>>>> implementation. It is desirable to have two versions of this function
>>>>> so that it can be run both on targets that support the popcnt insn and
>>>>> those that do not.
>>>>>
>>>>>
>>>>> * Case I - User Guided Versioning where only one function body is
>>>>> provided by the user.
>>>>>
>>>>> This case addresses a use where the user wants multi-versioning but
>>>>> provides only one function body.
>>>>> I want to add a new attribute called
>>>>> “mversion” which will be used like this:
>>>>>
>>>>> int __attribute__((mversion("popcnt")))
>>>>> find_popcount(unsigned int i)
>>>>> {
>>>>>   return __builtin_popcount(i);
>>>>> }
>>>>>
>>>>> With the attribute, the compiler should understand that it should
>>>>> generate two versions for this function. The user makes a call to this
>>>>> function like a regular call but the code generated would call the
>>>>> appropriate function at run-time based on a check to determine if that
>>>>> instruction is supported or not.
>>
>> The example seems to be particularly ill-suited. Trying to 2nd guess you
>> here I think you want to direct the compiler to emit multiple versions
>> with different target capabilities enabled, probably for elaborate code that
>> _doesn't_ use any fancy builtins, right? It seems this is a shortcut for
>>
>> static inline __attribute__((always_inline)) implementation () { ... }
>>
>> symbol __attribute__((target("sse2"))) { implementation(); }
>> symbol __attribute__((target("sse3"))) { implementation(); }
>> ...
>>
>> and so should be fully handled by the frontend (if at all, it seems to
>> be purely syntactic sugar).
>
> Yes, it is a handy short cut -- I don't see the basis for objection to
> this convenience.

And I don't see why we need to discuss it at this point. It also seems
severely limited considering that I may want to version for -msse2 -mpopcnt
and -msse4a - that doesn't look expressible. A more elaborate variant
would be, say,

foo () { ... };
foo __attribute__((version("sse2","popcnt")));
foo __attribute__((version("sse4a")));

thus triggering an overload clone by a declaration as well, not just by a
definition, similar to an explicit template instantiation. That sounds
more scalable to me.

>>
>>>>> The attribute can be scaled to support many versions by allowing a
>>>>> comma separated list of values for the mversion attribute. For
>>>>> instance, __attribute__((mversion("sse3", "sse4", ...))) will provide a
>>>>> version for each. For N attributes, N clones plus one clone for the
>>>>> default case will have to be generated by the compiler. The arguments
>>>>> to the "mversion" attribute will be similar to the arguments supported
>>>>> by the "target" attribute.
>>>>>
>>>>> This attribute is useful if the same source is going to be used to
>>>>> generate the different versions. If this has to be done manually, the
>>>>> user has to duplicate the body of the function and specify a target
>>>>> attribute of “popcnt” on one clone. Then, the user has to use
>>>>> something like IFUNC support or manually write code to call the
>>>>> appropriate version. All of this will be done automatically by the
>>>>> compiler with this new attribute.
>>>>>
>>>>> * Case II - User Guided Versioning where the function bodies for each
>>>>> version differ and are provided by the user.
>>>>>
>>>>> This case pertains to multi-versioning when the source bodies of the
>>>>> two or more versions are different and are provided by the user. Here
>>>>> too, I want to use a new attribute, “version”.
>>>>> Now, the user can
>>>>> specify versioning intent like this:
>>>>>
>>>>> int __attribute__((version("popcnt")))
>>>>> find_popcnt(unsigned int i)
>>>>> {
>>>>>   // inline assembly of the popcnt instruction, specialized version.
>>>>>   asm("popcnt ….");
>>>>> }
>>>>>
>>>>> int
>>>>> find_popcnt(unsigned int i)
>>>>> {
>>>>>   // generic code for doing this
>>>>>   ...
>>>>> }
>>>>>
>>>>> This uses function overloading to specify versions. The compiler will
>>>>> understand that versioning is requested, since the functions have
>>>>> different attributes with "version", and will generate the code to
>>>>> execute the right function at run-time. The compiler should check for
>>>>> the existence of one body without the attribute which will be the
>>>>> default version.
>>
>> Yep, we agreed that this is a good idea. But we also agreed to
>> use either the target attribute (for compiler-generated tests) or
>> a single predicate attribute that takes a function which is const
>> with no arguments and returns whether the variant is selected or not.
>
> 'target' attribute is an existing one, so adding overloading changes
> its semantics -- that is why a new 'version' attribute is proposed.
> For most of the cases, user does not need to provide his selector
> function, and compiler can use runtime support (builtins) to do the
> selection (See Sri's runtime patch).

Sure, I just want to make sure we re-use the same infrastructure for both.

> For power users, yes, the original agreed proposal is useful. The
> flavor of syntax that supports selector can be added back.

Well, you said it was definitely required ;) I had my doubts that it
would be relevant in practice, so we can as well leave it out.

>>
>>>>> * Case III - Versioning is done automatically by the compiler.
>>>>>
>>>>> I want to add a new compiler flag “-mversion” along the lines of “-m”.
>>>>> If the user specifies “-mversion=popcnt” then the compiler will
>>>>> automatically create two versions of any function that is impacted by
>>>>> the new instruction. The difference between “-m” and “-mversion” will
>>
>> How do you plan to detect "impacted by the new instruction?". Again
>> popcnt seems to be a poor example - most use probably lies in
>> autovectorization (but then it's closely tied to active capabilities of the
>> backend and not really ready for auto-versioning).
>>
>
> This is just an example. Major use cases involve versioning against
> the cpu model, such as core2, corei7, amd15h, etc. It has impact on
> decisions on code layout, unrolling, vectorization, scheduling, etc,
> but then again, this (MV heuristic) is a whole different topic. The
> discussion here is about infrastructure.

I don't see how auto-MV has any impact on the infrastructure, so we
might as well postpone any discussion until the infrastructure is set.

Richard.

> Thanks,
>
> David
>
>> This will be a lot of work if it shouldn't be very inefficient.
>>
>> Richard.
>>
>>>>> be that while “-m” generates only the specialized version, “-mversion”
>>>>> will generate both the specialized and the generic versions. There is
>>>>> no need to explicitly mark any function for versioning, no source
>>>>> changes.
>>>>>
>>>>> The compiler will decide if it is beneficial to multi-version a
>>>>> function based on heuristics using hotness information, code size
>>>>> growth, etc.
>>>>>
>>>>>
>>>>> Runtime support
>>>>> ===============
>>>>>
>>>>> In order for the compiler to generate multi-versioned code, it needs
>>>>> to call functions that would test if a particular feature exists or
>>>>> not at run-time. For example, IsPopcntSupported() would be one such
>>>>> function. I have prepared a patch to do this which adds the runtime
>>>>> support in libgcc and supports new builtins to test the various
>>>>> features. I will send the patch separately to keep the discussions
>>>>> focused.
>>>>>
>>>>>
>>>>> Thoughts?
>>>>
>>>> Please focus on one mechanism and re-use existing facilities as much as
>>>> possible. Thus, see the old discussion where we settled on overloading
>>>> with either using the existing target attribute or a selector function.
>>>> I don't see any benefit repeating the discussions here.
>>>>
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> -Sri.
>>>>>
>>>>
>>>
>>
>
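
For reference, the manual approach that the write-up wants to automate (duplicate the body, put a target attribute on one clone, and hand-write the run-time selection) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the proposed implementation: the names popcount_generic, popcount_popcnt, cpu_has_popcnt and select_popcount are hypothetical, and the feature test uses the __builtin_cpu_supports builtin found in current GCC releases on x86, which is not necessarily what the runtime patch mentioned in the thread provides. On other targets the sketch simply falls back to the generic body.

```c
/* Hand-written multi-versioning sketch. All helper names are hypothetical. */

/* Generic version: portable loop, valid on any target. */
static int popcount_generic(unsigned int i)
{
    int n = 0;
    while (i) {
        i &= i - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Specialized clone: the target attribute lets the compiler use the
   popcnt instruction inside this one function only. */
__attribute__((target("popcnt")))
static int popcount_popcnt(unsigned int i)
{
    return __builtin_popcount(i);
}

/* Run-time feature test (assumption: GCC x86 CPU-detection builtins). */
static int cpu_has_popcnt(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("popcnt");
}
#else
/* No specialized clone on this target: reuse the generic body. */
static int popcount_popcnt(unsigned int i)
{
    return popcount_generic(i);
}

static int cpu_has_popcnt(void)
{
    return 0;
}
#endif

typedef int (*popcount_fn)(unsigned int);

/* Selector: picks the version once, based on the run-time check. */
static popcount_fn select_popcount(void)
{
    return cpu_has_popcnt() ? popcount_popcnt : popcount_generic;
}

/* Public entry point: callers use it like a regular function. */
int find_popcount(unsigned int i)
{
    static popcount_fn impl;   /* resolved on first call */
    if (!impl)
        impl = select_popcount();
    return impl(i);
}
```

A caller just writes find_popcount(x); the indirection through the function pointer is the part that IFUNC or compiler-generated dispatch would take over.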