On Wed, Aug 17, 2011 at 6:37 PM, Xinliang David Li <davi...@google.com> wrote:
> On Wed, Aug 17, 2011 at 8:12 AM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Wed, Aug 17, 2011 at 4:52 PM, Xinliang David Li <davi...@google.com> 
>> wrote:
>>> The gist of previous discussion is to use function overloading instead
>>> of exposing underlying implementation such as builtin_dispatch to the
>>> user. This new refined proposal has not changed in that, but is more
>>> elaborate on various use cases which have been carefully thought out.
>>> Please be specific about which part needs improvement.
>>
>> See below ...
>>
>>> Thanks,
>>>
>>> David
>>>
>>> On Wed, Aug 17, 2011 at 12:29 AM, Richard Guenther
>>> <richard.guent...@gmail.com> wrote:
>>>> On Tue, Aug 16, 2011 at 10:37 PM, Sriraman Tallam <tmsri...@google.com> 
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>>  I am working on supporting function multi-versioning in GCC and here
>>>>> is a write-up on its usability.
>>>>>
>>>>> Multiversioning Usability
>>>>> ====================
>>>>>
>>>>> For a simple motivating example,
>>>>>
>>>>> int
>>>>> find_popcount(unsigned int i)
>>>>> {
>>>>>  return __builtin_popcount(i);
>>>>> }
>>>>>
>>>>> Currently, compiling this with -mpopcnt will result in the “popcnt”
>>>>> instruction being used and otherwise call a built-in generic
>>>>> implementation. It is desirable to have two versions of this function
>>>>> so that it can be run both on targets that support the popcnt insn and
>>>>> those that do not.
>>>>>
>>>>>
>>>>> * Case I - User Guided Versioning where only one function body is
>>>>> provided by the user.
>>>>>
>>>>> This case addresses a use where the user wants multi-versioning but
>>>>> provides only one function body.  I want to add a new attribute called
>>>>> “mversion” which will be used like this:
>>>>>
>>>>> int __attribute__((mversion("popcnt")))
>>>>> find_popcount(unsigned int i)
>>>>> {
>>>>>  return __builtin_popcount(i);
>>>>> }
>>>>>
>>>>> With the attribute, the compiler should understand that it should
>>>>> generate two versions for this function. The user makes a call to this
>>>>> function like a regular call but the code generated would call the
>>>>> appropriate function at run-time based on a check to determine if that
>>>>> instruction is supported or not.
>>
>> The example seems to be particularly ill-suited.  Trying to 2nd guess you
>> here I think you want to direct the compiler to emit multiple versions
>> with different target capabilities enabled, probably for elaborate code that
>> _doesn't_ use any fancy builtins, right?  It seems this is a shortcut for
>>
>> static inline __attribute__((always_inline)) implementation () { ... }
>>
>> symbol __attribute__((target("sse2"))) { implementation(); }
>> symbol __attribute__((target("sse3"))) { implementation(); }
>> ...
>>
>> and so should be fully handled by the frontend (if at all, it seems to
>> be purely syntactic sugar).
>
> Yes, it is a handy short cut -- I don't see the base for objection to
> this convenience.

And I don't see why we need to discuss it at this point.  It also seems
severely limited: if I want to version for -msse2 -mpopcnt
and for -msse4a, that doesn't look expressible.  A more elaborate variant
would be, say,

foo () { ... };
foo __attribute__((version("sse2","popcnt")));
foo __attribute__((version("sse4a")));

thus triggering an overload clone by a declaration as well, not just by a
definition, similar to an explicit template instantiation.  That sounds more
scalable to me.
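
For comparison, a minimal sketch of what those two declarations would
roughly stand for if the clones were written by hand with the existing
target attribute.  The names popcount_impl, foo_default, foo_sse2_popcnt
and foo_sse4a are made up for illustration, the bit-counting body is only
a placeholder for "elaborate code", and the x86 guard is my assumption so
the file still builds elsewhere:

```c
/* One shared body; always_inline forces a copy into every wrapper, so each
   copy is compiled with that wrapper's target options. */
static inline __attribute__((always_inline)) int
popcount_impl(unsigned int x)
{
    int n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Default clone: built with the command-line target options only. */
int foo_default(unsigned int x) { return popcount_impl(x); }

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Clone for the "sse2" + "popcnt" combination (hypothetical name). */
__attribute__((target("sse2,popcnt")))
int foo_sse2_popcnt(unsigned int x) { return popcount_impl(x); }

/* Clone for "sse4a" (hypothetical name). */
__attribute__((target("sse4a")))
int foo_sse4a(unsigned int x) { return popcount_impl(x); }
#endif
```

Note that this only produces the clones.  Picking one of them at run time
is a separate step that the sketch leaves out.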

>>
>>>>> The attribute can be scaled to support many versions by allowing a
>>>>> comma-separated list of values for the mversion attribute. For
>>>>> instance, __attribute__((mversion("sse3", "sse4", ...))) will provide a
>>>>> version for each. For N attributes, N clones plus one clone for the
>>>>> default case will have to be generated by the compiler. The arguments
>>>>> to the "mversion" attribute will be similar to the arguments supported
>>>>> by the "target" attribute.
>>>>>
>>>>> This attribute is useful if the same source is going to be used to
>>>>> generate the different versions. If this has to be done manually, the
>>>>> user has to duplicate the body of the function and specify a target
>>>>> attribute of “popcnt” on one clone. Then, the user has to use
>>>>> something like IFUNC support or manually write code to call the
>>>>> appropriate version. All of this will be done automatically by the
>>>>> compiler with this new attribute.
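
A minimal sketch of the manual route described above, assuming a GCC or
Clang release that provides __builtin_cpu_init and __builtin_cpu_supports
(these are not the builtins from the patch under discussion).  The names
find_popcount_generic, find_popcount_popcnt and select_find_popcount are
hypothetical, and the dispatch uses a plain function pointer rather than
IFUNC:

```c
/* Generic version: portable bit-counting loop. */
static int find_popcount_generic(unsigned int i)
{
    int n = 0;
    while (i) {
        i &= i - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Specialized version: the target attribute lets the compiler use popcnt. */
__attribute__((target("popcnt")))
static int find_popcount_popcnt(unsigned int i)
{
    return __builtin_popcount(i);
}
#endif

typedef int (*popcount_fn)(unsigned int);

/* Run-time check that picks one of the two versions. */
static popcount_fn select_find_popcount(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt"))
        return find_popcount_popcnt;
#endif
    return find_popcount_generic;
}

/* The entry point callers use; resolves on first call.
   Not synchronized, so this sketch assumes single-threaded first use. */
int find_popcount(unsigned int i)
{
    static popcount_fn impl;   /* null until the first call */
    if (!impl)
        impl = select_find_popcount();
    return impl(i);
}
```

Callers just write find_popcount(x).  The duplicated bodies, the check and
the pointer are the boilerplate the attribute is meant to remove.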
>>>>>
>>>>> * Case II - User Guided Versioning where the function bodies for each
>>>>> version differ and are provided by the user.
>>>>>
>>>>> This case pertains to multi-versioning when the source bodies of the
>>>>> two or more versions are different and are provided by the user. Here
>>>>> too, I want to use a new attribute, “version”. Now, the user can
>>>>> specify versioning intent like this:
>>>>>
>>>>> int __attribute__((version("popcnt")))
>>>>> find_popcnt(unsigned int i)
>>>>> {
>>>>>   // inline assembly of the popcnt instruction, specialized version.
>>>>>  asm(“popcnt ….”);
>>>>> }
>>>>>
>>>>> int
>>>>> find_popcnt(unsigned int i)
>>>>> {
>>>>>  //generic code for doing this
>>>>>  ...
>>>>> }
>>>>>
>>>>> This uses function overloading to specify versions.  The compiler will
>>>>> understand that versioning is requested, since the functions have
>>>>> different attributes with "version", and will generate the code to
>>>>> execute the right function at run-time.  The compiler should check for
>>>>> the existence of one body without the attribute which will be the
>>>>> default version.
>>
>> Yep, we agreed that this is a good idea.  But we also agreed to
>> use either the target attribute (for compiler-generated tests) or
>> a single predicate attribute that takes a function which is const
>> with no arguments and returns whether the variant is selected or not.
>
> The 'target' attribute is an existing one, so adding overloading changes
> its semantics -- that is why a new 'version' attribute is proposed.
> For most cases, the user does not need to provide a selector
> function, and the compiler can use runtime support (builtins) to do the
> selection (See Sri's runtime patch).

Sure, I just want to make sure we re-use the same infrastructure
for both.
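
To illustrate what "the same infrastructure" could mean, here is a rough
sketch in which a compiler-style feature test and a user-written selector
are both just argument-less predicates in one table.  Everything here
(struct variant, resolve_count_bits, the predicate names) is hypothetical,
and __builtin_cpu_supports is assumed to be available as in later GCC
releases:

```c
#include <stddef.h>

typedef int (*selector_fn)(void);        /* nonzero means "use this variant" */
typedef int (*impl_fn)(unsigned int);

struct variant {
    selector_fn selected;
    impl_fn     impl;
};

/* Default variant: portable loop. */
static int count_bits_loop(unsigned int x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Stand-in for a specialized variant. */
static int count_bits_builtin(unsigned int x)
{
    return __builtin_popcount(x);
}

/* The kind of test the compiler could generate from a target string. */
static int cpu_has_popcnt(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("popcnt") != 0;
#else
    return 0;
#endif
}

/* The kind of predicate a user could supply; this one marks the default. */
static int always_selected(void) { return 1; }

/* Checked in order; the default goes last. */
static const struct variant variants[] = {
    { cpu_has_popcnt,  count_bits_builtin },
    { always_selected, count_bits_loop },
};

/* Return the first variant whose predicate holds, or NULL if none does. */
impl_fn resolve_count_bits(void)
{
    size_t k;
    for (k = 0; k < sizeof variants / sizeof variants[0]; k++)
        if (variants[k].selected())
            return variants[k].impl;
    return NULL;
}
```

With this shape the dispatcher does not care where a predicate came from,
which is the property I am after.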

> For power users, yes, the original agreed proposal is useful. The
> flavor of syntax that supports selector can be added back.

Well, you said it was definitely required ;)  I had my doubts that it
would be relevant in practice, so we can as well leave it out.

>>
>>>>> * Case III - Versioning is done automatically by the compiler.
>>>>>
>>>>> I want to add a new compiler flag “-mversion” along the lines of “-m”.
>>>>> If the user specifies “-mversion=popcnt” then the compiler will
>>>>> automatically create two versions of any function that is impacted by
>>>>> the new instruction. The difference between “-m” and “-mversion” will
>>
>> How do you plan to detect "impacted by the new instruction?".  Again
>> popcnt seems to be a poor example - most use probably lies in
>> autovectorization (but then it's closely tied to active capabilities of the
>> backend and not really ready for auto-versioning).
>>
>
> This is just an example. The major use cases involve versioning against
> the cpu model, such as core2, corei7, amd15h, etc. This has an impact on
> decisions on code layout, unrolling, vectorization, scheduling, etc,
> but then again, this (MV heuristic) is a whole different topic.  The
> discussion here is about infrastructure.

I don't see how auto-MV has any impact on the infrastructure, so we might
as well postpone any discussion until the infrastructure is set.

Richard.

> Thanks,
>
> David
>
>> This will be a lot of work if it shouldn't be very inefficient.
>>
>> Richard.
>>
>>>>> be that while “-m” generates only the specialized version, “-mversion”
>>>>> will generate both the specialized and the generic versions.  There is
>>>>> no need to explicitly mark any function for versioning, no source
>>>>> changes.
>>>>>
>>>>> The compiler will decide if it is beneficial to multi-version a
>>>>> function based on heuristics using hotness information, code size
>>>>> growth, etc.
>>>>>
>>>>>
>>>>> Runtime support
>>>>> ===============
>>>>>
>>>>> In order for the compiler to generate multi-versioned code, it needs
>>>>> to call functions that would test if a particular feature exists or
>>>>> not at run-time. For example, IsPopcntSupported() would be one such
>>>>> function. I have prepared a patch to do this which adds the runtime
>>>>> support in libgcc and supports new builtins to test the various
>>>>> features. I will send the patch separately to keep the discussions
>>>>> focused.
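
A minimal sketch of what a feature test like IsPopcntSupported() might
look like, written directly against CPUID through GCC's <cpuid.h>.  This
is not the libgcc patch mentioned above; the name is_popcnt_supported and
the fallback of returning 0 on non-x86 targets are assumptions:

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#endif

/* Returns 1 if the CPU reports the popcnt instruction, 0 otherwise. */
int is_popcnt_supported(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 1 holds the basic feature flags; __get_cpuid returns 0 if the
       leaf is not available. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    /* CPUID.01H:ECX bit 23 is the POPCNT feature flag. */
    return (ecx & (1u << 23)) != 0;
#else
    return 0;   /* assumption: treat other targets as "not supported" */
#endif
}
```

A real runtime would presumably query CPUID once and cache the result
rather than repeat the query on every call.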
>>>>>
>>>>>
>>>>> Thoughts?
>>>>
>>>> Please focus on one mechanism and re-use existing facilities as much as
>>>> possible.  Thus, see the old discussion where we settled on overloading
>>>> with either using the existing target attribute or a selector function.
>>>> I don't see any benefit repeating the discussions here.
>>>>
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> -Sri.
>>>>>
>>>>
>>>
>>
>
