Re: ELF interposition and One Definition Rule

Mike Stump Mon, 26 Aug 2013 15:30:43 -0700

On Aug 26, 2013, at 8:21 AM, Jan Hubicka <[email protected]> wrote:
> My understanding of the C++ One Definition Rule, in a strict sense, is that it
> does not allow one to define two functions of the same name and different
> semantics in a valid program.  I also think that all DSOs eventually linked
> together or dlopened are part of the same program.  So theoretically, for C++
> produced code, we may go with AVAIL_AVAILABLE everywhere.


So, I think you're on firm ground wrt the standard.  I think LTO naturally 
wants to see and make use of semantics, and once you accept that as valid, which, 
reasonably it is, I think you get to see and understand quite a lot about the 
code.  Replacing anything comes with a heavy constraint that it is reasonably 
the same and the user will die if it is not.  When an allocation function that 
the LTO optimizer can see is 32 byte aligned on the returned pointer, it is 
reasonable to make use of this on the client side code-gen.  If the user 
replaces that allocation function with one that was not 32-byte aligned, bad 
things would happen.
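
A minimal sketch of the alignment hazard described above.  The names alloc32, alloc32_replacement and is_aligned32 are hypothetical and not taken from any real library; the "replacement" deliberately returns a pointer that is only 8-byte aligned to stand in for an interposed allocator that does not keep the original promise.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical allocator whose body the optimizer can see:
// it always returns 32-byte aligned storage.
inline void *alloc32(std::size_t n) {
    // std::aligned_alloc wants the size to be a multiple of the alignment.
    std::size_t rounded = (n + 31) / 32 * 32;
    return std::aligned_alloc(32, rounded);
}

// Hypothetical interposed replacement: same interface, weaker guarantee.
// The result is offset by 8 bytes, so it is never 32-byte aligned.
// (To release it, a caller would have to free the pointer minus 8.)
inline void *alloc32_replacement(std::size_t n) {
    char *base = static_cast<char *>(alloc32(n + 32));
    return base ? base + 8 : nullptr;
}

// The property that client-side code generation would silently rely on.
inline bool is_aligned32(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % 32 == 0;
}
```

If client code was compiled on the assumption that is_aligned32 always holds (for example by emitting aligned vector loads), swapping in alloc32_replacement at load time breaks that code without any diagnostic.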

I think what the optimizer can see is open ended, and any use it wants to make 
of what it sees is fine.  Functions, data, classes, methods, ctors, dtors, 
templates, everything.

Now, that's the standard perspective.  From the QOI viewpoint, you will have 
users that want to do various things, and they should explain what they want, 
and we should document and vend them the good stuff.  I defer to the 
interposing types on what they want to do and why.  Roughly, they need a way to 
hide things from the optimizer.  Assembly can do this, but, we'd also want to 
hide (or mark as please don't peer into) any function, method or variable.  
Separate compilation not using the -flto flag seems a reasonable way to do 
this.  I don't know if it is enough… I think those types of people will scream 
when they discover they want more control.

> On IRC we got into an agreement that we may disallow interposition for
> virtuals,

Hum…  I'm not one of those people that want to interpose virtuals, but as a 
tool vendor, it would seem like some would like to be able to interpose 
virtuals.  I think separate compilation with no -flto should be enough to hide 
enough details to make the interposition of virtuals possible.  For example, 
someone has a nice C++ ABI that includes a virtual function for open, and one 
wants to interpose open to trace it to merely help debug a problem.  Doesn't 
strike me as wrong.

For comdat (template functions), I can't help but think having a way to mark 
definitions as, please don't peer into this, would be nice to have.  One can 
separate declaration and definition and explicitly instantiate, but doing this 
might be a pain.  I'd defer, again, to the interposers.
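
As a rough sketch of the "separate declaration and definition and explicitly instantiate" approach: the template name scale and the file names in the comments are hypothetical.  Everything is shown in one block so it compiles as a single file, but the point is that the second half would live in its own translation unit, compiled without -flto, so callers never see the body.

```cpp
// --- would live in a header, e.g. scale.h (hypothetical name) ---
template <class T>
T scale(T value);                     // declaration only: no body for callers to peer into

extern template int scale<int>(int);  // instantiated elsewhere; do not instantiate here

// --- would live in a separately compiled scale.cpp (hypothetical name) ---
template <class T>
T scale(T value) {
    return value * 3;                 // the detail being hidden from callers
}

template int scale<int>(int);         // explicit instantiation definition
```

The pain is visible even at this size: every type that callers need has to be listed by hand in both places.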

Now, when the cost of allowing interposing is high (dynamic relocs for 
example), disallowing interposition by default is fine; I'm not arguing that 
one must always pay the cost.  It just seems nice from a theoretical 
perspective to allow the user to say, yes, we do want to allow interposing on 
these virtuals.

> Does the following patch seems sane?

Easier to review the change in semantics of a sample bit of code…  I think I 
understand the effects of the change.

> Of course I would be happier with a stronger rule - for instance allowing
> interposition only on plain functions not on methods.

Hum, I like orthogonal rules that apply generally.  Meaning, I don't like 
the notion of treating functions and methods (or virtual methods) differently.  
For example, a "don't peer into" marking for a template function definition 
should equally be usable to keep the optimizer from peering into a normal 
inline function.

I think I like letting the optimizer do anything, and making the user 
responsible for not using -flto, or ensuring enough separate compilation, or 
otherwise marking the boundaries they don't want it to peer through…  I could 
also be burned alive by a Linux distributor with existing code if I tried 
this…  :-)  Good luck.


Oh, so keep in mind, if you do something like

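#include <cstdio>

template <class T>
class actor {
public:
        // The ctor; its function-local static i is supposed to be one object.
        actor() {
                static int i = 100;
                printf("%p\n", static_cast<void *>(&i));
        }
};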

and don't smash all the ctors together, you (can) wind up with multiple copies 
of the ctor's static i.  The standard says there is one of them.  The usual way 
to ensure there is only one is to collapse all the ctors together into one; 
then, trivially, that one can only reference one of the i's that exist.  So a 
static local is one way (beyond comparing function addresses for equality) to 
tell whether the function has been replicated.  Trying to think if there were 
any other ways…  ah yes, here it is:

6 A static local variable in a member function always refers to the same
  object, whether or not the member function is inline.

  A  static local variable in an extern inline function always refers to
  the same object.  A string literal in an extern inline function is the
  same object in different translation units.

So, string literals can also be used to notice the uniqueness of the function 
(method).  Curiously, we didn't do the same for string literals in a member 
function; I'm not sure why.  It feels like an oversight, not something done on 
purpose.  I'd have to dig through all the papers to find when it went in, and 
read the paper that brought it in, to see.  I don't recall that we talked 
about making it different.
