On Aug 26, 2013, at 8:21 AM, Jan Hubicka <[email protected]> wrote:
> My understanding of C++ One Definition Rule, in a strict sense, does not
> allow one to define two functions of the same name and different semantics in a
> valid program. I also think that all DSOs eventually linked together or
> dlopened are part of the same program. So theoretically, for C++ produced
> code, we may go with AVAIL_AVAILABLE everywhere.
So, I think you're on firm ground wrt the standard. I think LTO naturally
wants to see and make use of semantics, and once you accept that as valid, which,
reasonably it is, I think you get to see and understand quite a lot about the
code. Replacing anything comes with a heavy constraint that it is reasonably
the same and the user will die if it is not. When an allocation function that
the LTO optimizer can see is 32 byte aligned on the returned pointer, it is
reasonable to make use of this on the client side code-gen. If the user
replaces that allocation function with one that was not 32-byte aligned, bad
things would happen.
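
To make the alignment case concrete, here is a minimal sketch. The names
alloc_floats, free_floats, sum8 and is_aligned are hypothetical, and the 32-byte
figure is only the one from the paragraph above. __builtin_assume_aligned is a
GCC/Clang builtin standing in for whatever the LTO optimizer would deduce on its
own from seeing the allocator body; this assumes a C++17 compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical allocator that the optimizer can see into.
// It always hands back 32-byte aligned storage.
constexpr std::size_t kAlign = 32;

float* alloc_floats(std::size_t count) {
    void* p = ::operator new(count * sizeof(float), std::align_val_t{kAlign});
    return static_cast<float*>(p);
}

void free_floats(float* p) {
    ::operator delete(p, std::align_val_t{kAlign});
}

// Client-side code that bakes the allocator's alignment into its code-gen.
// If alloc_floats were later replaced by a version that is not 32-byte
// aligned, this assumption would be false and the behavior undefined.
float sum8(const float* p) {
#if defined(__GNUC__)
    p = static_cast<const float*>(__builtin_assume_aligned(p, 32));
#endif
    float s = 0.0f;
    for (int i = 0; i < 8; ++i) s += p[i];
    return s;
}

// Helper to check the property the client relies on.
bool is_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kAlign == 0;
}
```

The client is only correct while every definition of alloc_floats that can end
up in the program keeps the same alignment guarantee, which is the heavy
constraint on replacement described above.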
I think what the optimizer can see is open ended, and any use it wants to make
of what it sees is fine. Functions, data, classes, methods, ctors, dtors,
templates, everything.
Now, that's the standard perspective. From the QOI viewpoint, you will have
users that want to do various things, and they should explain what they want,
and we should document and vend them the good stuff. I defer to the
interposing types on what they want to do and why. Roughly, they need a way to
hide things from the optimizer. Assembly can do this, but, we'd also want to
hide (or mark as please don't peer into) any function, method or variable.
Separate compilation not using the -flto flag seems a reasonable way to do
this. I don't know if it is enough… I think those types of people will scream
when they discover they want more control.
> On IRC we got into an agreement that we may disallow interposition for
> virtuals,
Hum… I'm not one of those people that want to interpose virtuals, but as a
tool vendor, it would seem like some would like to be able to interpose
virtuals. I think separate compilation with no -flto should be enough to hide
enough details to make the interposition of virtuals possible. For example,
someone has a nice C++ abi that includes a virtual function for open, and one
wants to interpose open to trace it to merely help debug a problem. Doesn't
strike me as wrong.
For comdat (template functions), I can't help but think having a way to mark
definitions as, please don't peer into this, would be nice to have. One can
separate declaration and definition and explicitly instantiate, but doing this
might be a pain. I'd defer, again, to the interposers.
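
A rough sketch of the separate declaration/definition plus explicit
instantiation approach, with a hypothetical function template twice. It is shown
in one block for brevity; the hiding only happens if the two halves really live
in a header and in a separately compiled (non -flto) source file.

```cpp
// ---- what the header would contain: declaration only, no body to peer into
template <class T>
T twice(T value);

// Explicit instantiation declaration: users of the header must not
// instantiate twice<int> themselves; they just call the external symbol.
extern template int twice<int>(int);

// ---- what the one separately compiled source file would contain
template <class T>
T twice(T value) {
    return value + value;
}

// Explicit instantiation definition: the single place twice<int> is emitted.
template int twice<int>(int);
```

The pain is visible even here: every type the users need has to be listed by
hand in both places.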
Now, when the cost of allowing interposing is high (dynamic relocs for
example), disallowing interposition by default is fine, not arguing that one
must always have the cost. Just seems nice from a theoretic perspective to
allow the user to say, yes, we do want to allow interposing on these virtuals.
> Does the following patch seem sane?
Easier to review the change in semantics of a sample bit of code… I think I
understand the effects of the change.
> Of course I would be happier with a stronger rule - for instance allowing
> interposition only on plain functions not on methods.
Hum, I like the orthogonal rules that apply generally. Meaning, I don't like
the notion of treating functions and methods (or virtual methods) differently.
For example, a don't-peer-into marker for a template function definition should
also be usable to not peer into a normal inline function.
I think I like letting the optimizer do anything, and making the user
responsible for not using -flto, or ensuring enough separate compilation, or
otherwise marking the boundaries that don't want to peer through… I could also
be burned alive by a linux distributor with existing code if I tried this… :-)
Good luck.
Oh, so keep in mind, if you do something like
  #include <cstdio>

  template <class T>
  class actor {
  public:
    actor() {  // the ctor
      static int i = 100;
      std::printf("%p\n", static_cast<void *>(&i));
    }
  };
and don't smash all the ctors together, you (can) wind up with multiple ctor::i
objects. The standard says there is one of them. The usual way to ensure
there is only one of them is to collapse all the ctors together into one,
then, trivially, that one can only reference one of the i's that exist. This
is one way (beyond equality) to tell if there are multiples (if the function is
replicated). Trying to think if there were any other ways… ah yes, here it is:
6 A static local variable in a member function always refers to the same
object, whether or not the member function is inline.
A static local variable in an extern inline function always refers to
the same object. A string literal in an extern inline function is the
same object in different translation units.
So, string literals can also be used to notice the uniqueness of the function
(method). Curious, we didn't do the same for string literals in a member
function, not sure why, feels like an oversight, not on purpose. I'd have to
dig through all the papers to find when it went in, and the paper that brought
it in to see. I don't recall that we talked about making it different.
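
For what it's worth, here is a small sketch of how a program can observe the
static-local rule. The where member is a hypothetical addition so the address
can be compared instead of printed. Within a single translation unit this only
shows the expected behavior; the duplication I'm worried about would need the
ctor to be emitted more than once (say, in different DSOs) without being
merged, which can't be shown in one file.

```cpp
template <class T>
struct actor {
    const int* where;  // hypothetical: remembers which i this ctor saw

    actor() {
        static int i = 100;  // one object per specialization of actor
        where = &i;
    }
};

// Every actor<T> constructed anywhere in the program is supposed to see
// the same i; a different T gets a different, unrelated i.
template <class T, class U>
bool same_static(const actor<T>& a, const actor<U>& b) {
    return a.where == b.where;
}
```

If two copies of actor<int>'s ctor survived with their own i, same_static on two
actor<int> objects built by the different copies would return false, and that
is how a conforming program could tell.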