jmorse wrote:

[This keeps slipping to the back of my TODO list.]

I've been enlightened by the comments on #68929 about ODR rules, and that there 
isn't a violation in the example; it does at least exercise the code path of 
interest. My summary is that the ODR-uniquing of types happens at such a low 
level that it causes surprises and can't be easily fixed. Here's a 
better-designed reproducer:

    inline int foo() {
      class bar {
      private:
        int a = 0;
      public:
        int get_a() { return a; }
      };

      static bar baz;
      return baz.get_a();
    }

    int a() {
      return foo();
    }

Compile and link this similarly to the above:
    clang a.cpp -o a.ll -emit-llvm -S -g -c -O2
    clang b.cpp -o b.ll -emit-llvm -S -g -c -O2
    llvm-link a.ll b.ll -o c.ll -S
    llc c.ll -o out.o -filetype=obj
    <boom>

Where b.cpp is a copy of the file above with the function renamed from 'a' to 
'b' to ensure there aren't multiple conflicting definitions. In this code, we 
inline the body of "foo" into the 'a' and 'b' functions, and furthermore we 
inline the get_a method of foo::bar too. In each of the compilation units, this 
leads to a chain of lexical scopes for the most deeply inlined instruction of:
 * get_a method,
 * foo::bar class,
 * foo function,
 * 'a' or 'b' function.

The trouble comes when the two modules are linked together: the two collections 
of DILocations / DILexicalScopes / DISubprograms describing source-locations in 
each module are distinct and kept separate through linking. However, the 
DICompositeType for foo::bar is unique'd based on its name, and its "scope" 
field will point into only one of the metadata collections. Thus, where we used 
to have two distinct chains of lexical scopes we've now got a tree of them, 
joining at the unique'd DICompositeType, and llc is not prepared for this.
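
As a rough illustration of how the two chains end up joined, here is a toy 
model in plain C++. It is not LLVM's metadata implementation: Scope, Pool, 
TypeTable, buildUnit and root are all hypothetical names invented for this 
sketch, and the "inlined-at" link to 'a' or 'b' is left out. It assumes only 
what is described above: each unit builds its own scope nodes, but the class 
type is looked up in a shared table keyed by name.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a debug-info scope node (subprogram, class
// type, ...). Only the link to the enclosing scope matters here.
struct Scope {
  std::string name;
  const Scope *parent; // enclosing scope, nullptr at the outermost level
};

// Owns the nodes so raw pointers stay valid for the lifetime of the model.
struct Pool {
  std::vector<std::unique_ptr<Scope>> nodes;
  const Scope *make(std::string name, const Scope *parent) {
    nodes.push_back(std::make_unique<Scope>(Scope{std::move(name), parent}));
    return nodes.back().get();
  }
};

// Models uniquing by name: the first node registered under an identifier
// wins, and later candidates with the same identifier are dropped.
struct TypeTable {
  std::map<std::string, const Scope *> byIdentifier;
  const Scope *getOrInsert(const std::string &id, const Scope *candidate) {
    return byIdentifier.emplace(id, candidate).first->second;
  }
};

// Builds foo -> bar -> get_a for one unit and returns the innermost scope.
// The foo and get_a nodes are per-unit; bar goes through the shared table.
const Scope *buildUnit(Pool &pool, TypeTable &types, const std::string &unit) {
  const Scope *foo = pool.make("foo@" + unit, nullptr);
  const Scope *bar =
      types.getOrInsert("foo::bar", pool.make("bar@" + unit, foo));
  return pool.make("get_a@" + unit, bar);
}

// Walks the parent links up to the outermost scope.
const Scope *root(const Scope *s) {
  while (s->parent)
    s = s->parent;
  return s;
}
```

Calling buildUnit for "a" and then for "b" against the same Pool and TypeTable 
gives two get_a nodes whose parent is the same bar node, so walking up from 
the unit-b get_a arrives at the foo node that was created for unit a.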

I don't know that this is a bug, more of a design mismatch: most of the rest of 
LLVM is probably OK with the lexical-scope chain actually being a tree, given 
that it only ever walks up it. However, LexicalScopes does a top-down 
exploration of a function looking for lexical scopes, and is then surprised 
when it finds different scopes looking from the bottom up. We could adjust it 
to search upwards for more lexical scopes (it already does that for 
block-scopes), but then I imagine we would produce very confusing DWARF that 
contained two Subprogram scopes for the same function.

There's also no easy way of working around this in metadata: we can't describe 
any other metadata relationship because the uniquing is done at such a low 
level, and we can't selectively not-ODR-unique DICompositeTypes that are inside 
functions because the lexical scope metadata might not have been loaded yet, so 
can't be examined.

An immediate fix would be to not set the "identifier" field for the 
DICompositeType when it's created if it's inside a function scope, to avoid 
ODR-uniquing. I've only got a light understanding of what the identifier field 
is for, so there might be unexpected consequences, plus there'll be a 
metadata/DWARF size cost to that.
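
The effect of that change can be sketched with another toy model. Again, this 
is not clang or LLVM code: TypeNode, TypeTable::create and the insideFunction 
flag are hypothetical, and the identifier strings used with it are only 
illustrative. The assumption is simply that a type with an empty identifier 
is never looked up in, or added to, the uniquing table.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a composite type node.
struct TypeNode {
  std::string name;
  std::string identifier; // empty means "never uniqued"
};

struct TypeTable {
  std::vector<std::unique_ptr<TypeNode>> owned;
  std::map<std::string, TypeNode *> byIdentifier;

  // Creates a type, or returns the existing one with the same identifier.
  // Types declared inside a function are given no identifier, so every
  // call for such a type yields a fresh, distinct node.
  TypeNode *create(const std::string &name, const std::string &mangled,
                   bool insideFunction) {
    std::string id = insideFunction ? std::string() : mangled;
    if (!id.empty()) {
      auto it = byIdentifier.find(id);
      if (it != byIdentifier.end())
        return it->second;
    }
    owned.push_back(std::make_unique<TypeNode>(TypeNode{name, id}));
    TypeNode *node = owned.back().get();
    if (!id.empty())
      byIdentifier[id] = node;
    return node;
  }
};
```

With this model, two calls to create for a function-local "bar" return two 
separate nodes, one per unit, while two calls for a namespace-scope type with 
the same identifier still return a single shared node. The size cost mentioned 
above shows up here as the extra node kept for each unit.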

https://github.com/llvm/llvm-project/pull/75385
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
