On Tue, Sep 11, 2012 at 10:41 AM, Richard Guenther <richard.guent...@gmail.com> wrote:
> On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson <r...@redhat.com> wrote:
>> On 09/10/2012 09:09 AM, Iyer, Balaji V wrote:
>>>> >If that's the case, what's the point in defining an external ABI and
>>>> >defining what
>>>> >__attribute__((vector)) placed on a function declaration means?
>>
>>> When you have __attribute__((vector)) you are asking the compiler to
>>> create a vector AND a scalar version of the function. The advantage
>>> is that if the function is used, for example, in 2 loops where 1 can
>>> be vectorized and another cannot, the vectorizable loop won't suffer
>>> (i.e. suffer from being not-vectorized).
>>
>> You've totally mis-understood my point.
>>
>> Whether or not the compiler creates a clone COULD BE totally up to the
>> compiler, based on whether or not vectorization is enabled, whether the
>> loop has been analyzed such that vectorization may proceed, or indeed
>> the phase of the moon.
>>
>> But in order for that to happen, the clone must be totally private to
>> the module for which we are generating code (in the LTO sense, this is
>> the entire program or dll; without LTO, this is just the object file).
>> It means that we never attempt to generate clones for functions for
>> which the body of the function is not visible.
>>
>> On the other hand, if you insist on assuming a clone exists merely
>> because a declaration bears an attribute, then you must address ALL
>> of the problems with respect to defining a stable ABI in the face of
>> different cpu revisions, different ISAs, and different vector lengths.
>>
>> I've not seen you address ANY of these problems, despite having the
>> problem pointed out multiple times.
>
> Indeed, if the definition of an elemental function is always visible to the
> vectorizer the vectorizer itself can instruct the creation of the clone
> if it does not already exist (just make those clones managed by the
> callgraph). Then the clones are visible to the current TU only and no
> ABI issues exist (though you could say that the vectorizer or the inliner
> could as well force inlining of elemental functions into places it wants to
> vectorize - one complication even with local clones is that the x86 ABI
> has no callee-saved XMM registers which makes function calls inside
> loops especially expensive).

Btw, this then happily fits into my suggestion that the "elementalness"
can be autodetected by the compiler simply by means of a proper IPA pass
and thus be fully LTO / whole-program aware. No need for an attribute
(where you'd need to handle the case that the attribute was placed there
by error).

Richard.

> Richard.
>
>>
>> r~
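
To make the "local clone" idea above concrete, here is a minimal hand-written sketch in C of what a TU-private clone could look like. It is only an illustration, not what GCC or the Cilk Plus branch actually generates: the names scale_add, scale_add_v4 and apply_scale_add and the fixed width of 4 are assumptions made up for this example, and the "vector" clone is written as a plain 4-lane loop rather than with real vector types. The point is that both the scalar function and its clone have internal linkage (static), so nothing outside this translation unit can depend on the clone existing, and no ABI for it has to be defined.

```c
#include <stddef.h>

/* Scalar version: the function as the programmer wrote it. */
static float scale_add(float x, float y)
{
    return 2.0f * x + y;
}

/* Hypothetical 4-wide clone of scale_add. It is static, so it is
 * private to this translation unit and has no external ABI. */
static void scale_add_v4(const float *x, const float *y, float *out)
{
    for (int lane = 0; lane < 4; ++lane)
        out[lane] = 2.0f * x[lane] + y[lane];
}

/* A loop over scale_add, rewritten the way a vectorizer might:
 * full groups of 4 go through the clone, the remainder through
 * the scalar version. */
void apply_scale_add(const float *x, const float *y, float *out, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4)
        scale_add_v4(x + i, y + i, out + i);

    for (; i < n; ++i)
        out[i] = scale_add(x[i], y[i]);
}
```

If the definition of scale_add were not visible here (only a declaration carrying an attribute), the call to scale_add_v4 would have to refer to an external symbol, which is exactly where the questions about ISA, vector length and a stable ABI come in.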