hoy added inline comments.
================ Comment at: clang/test/CodeGen/unique-internal-linkage-names-dwarf.c:34-39 +static int go(a) int a; +{ + return glob + a; +} + + ---------------- dblaikie wrote: > aaron.ballman wrote: > > dblaikie wrote: > > > aaron.ballman wrote: > > > > dblaikie wrote: > > > > > hoy wrote: > > > > > > dblaikie wrote: > > > > > > > dblaikie wrote: > > > > > > > > hoy wrote: > > > > > > > > > dblaikie wrote: > > > > > > > > > > hoy wrote: > > > > > > > > > > > dblaikie wrote: > > > > > > > > > > > > hoy wrote: > > > > > > > > > > > > > dblaikie wrote: > > > > > > > > > > > > > > hoy wrote: > > > > > > > > > > > > > > > dblaikie wrote: > > > > > > > > > > > > > > > > hoy wrote: > > > > > > > > > > > > > > > > > dblaikie wrote: > > > > > > > > > > > > > > > > > > hoy wrote: > > > > > > > > > > > > > > > > > > > dblaikie wrote: > > > > > > > > > > > > > > > > > > > > Does this need to be down here? Or > > > > > > > > > > > > > > > > > > > > would the code be a well exercised if > > > > > > > > > > > > > > > > > > > > it was up next to the go declaration > > > > > > > > > > > > > > > > > > > > above? > > > > > > > > > > > > > > > > > > > Yes, it needs to be here. Otherwise it > > > > > > > > > > > > > > > > > > > will just like the function `bar` above > > > > > > > > > > > > > > > > > > > that doesn't get a uniquefied name. I > > > > > > > > > > > > > > > > > > > think moving the definition up to right > > > > > > > > > > > > > > > > > > > after the declaration hides the > > > > > > > > > > > > > > > > > > > declaration. > > > > > > > > > > > > > > > > > > Not sure I follow - do you mean that if the > > > > > > > > > > > > > > > > > > go declaration and go definition were next > > > > > > > > > > > > > > > > > > to each other, this test would > > > > > > > > > > > > > > > > > > (mechanically speaking) not validate what > > > > > > > > > > > > > > > > > > the patch? Or that it would be less > > > > > > > > > > > > > > > > > > legible, but still mechanically correct? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think it would be (assuming it's still > > > > > > > > > > > > > > > > > > mechanically correct) more legible to put > > > > > > > > > > > > > > > > > > the declaration next to the definition - > > > > > > > > > > > > > > > > > > the comment describes why the declaration > > > > > > > > > > > > > > > > > > is significant/why the definition is weird, > > > > > > > > > > > > > > > > > > and seeing all that together would be > > > > > > > > > > > > > > > > > > clearer to me than spreading it out/having > > > > > > > > > > > > > > > > > > to look further away to see what's going on. > > > > > > > > > > > > > > > > > When the `go` declaration and `go` definition > > > > > > > > > > > > > > > > > were next to each other, the go function > > > > > > > > > > > > > > > > > won't get a uniqufied name at all. The > > > > > > > > > > > > > > > > > declaration will be overwritten by the > > > > > > > > > > > > > > > > > definition. Only when the declaration is seen > > > > > > > > > > > > > > > > > by others, such the callsite in `baz`, the > > > > > > > > > > > > > > > > > declaration makes a difference by having the > > > > > > > > > > > > > > > > > callsite use a uniqufied name. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ah! Interesting, good to know. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Is that worth supporting, I wonder? I guess it > > > > > > > > > > > > > > > > falls out for free/without significant > > > > > > > > > > > > > > > > additional complexity. I worry about the > > > > > > > > > > > > > > > > subtlety of the additional declaration changing > > > > > > > > > > > > > > > > the behavior here... might be a bit > > > > > > > > > > > > > > > > surprising/subtle. But maybe no nice way to > > > > > > > > > > > > > > > > avoid it either. > > > > > > > > > > > > > > > It would be ideal if user never writes code like > > > > > > > > > > > > > > > that. Unfortunately it exists with legacy code > > > > > > > > > > > > > > > (such as mysql). I think it's worth supporting it > > > > > > > > > > > > > > > from AutoFDO point of view to avoid a silent > > > > > > > > > > > > > > > mismatch between debug linkage name and real > > > > > > > > > > > > > > > linkage name. > > > > > > > > > > > > > > Oh, I agree that we shouldn't mismatch debug info > > > > > > > > > > > > > > and the actual symbol name - what I meant was > > > > > > > > > > > > > > whether code like this should get mangled or not > > > > > > > > > > > > > > when using unique-internal-linkage names. > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm now more curious about this: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > When the `go` declaration and `go` definition > > > > > > > > > > > > > > > were next to each other, the go function won't > > > > > > > > > > > > > > > get a uniqufied name at all. > > > > > > > > > > > > > > > > > > > > > > > > > > > > This doesn't seem to happen with the > > > > > > > > > > > > > > `__attribute__((overloadable))` attribute, for > > > > > > > > > > > > > > instance - so any idea what's different about > > > > > > > > > > > > > > uniquification that's working differently than > > > > > > > > > > > > > > overloadable? > > > > > > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > > > > $ cat test.c > > > > > > > > > > > > > > __attribute__((overloadable)) static int go(a) int > > > > > > > > > > > > > > a; { > > > > > > > > > > > > > > return 3 + a; > > > > > > > > > > > > > > } > > > > > > > > > > > > > > void baz() { > > > > > > > > > > > > > > go(2); > > > > > > > > > > > > > > } > > > > > > > > > > > > > > $ clang-tot test.c -emit-llvm -S -o - | grep go > > > > > > > > > > > > > > %call = call i32 @_ZL2goi(i32 2) > > > > > > > > > > > > > > define internal i32 @_ZL2goi(i32 %a) #0 { > > > > > > > > > > > > > > ``` > > > > > > > > > > > > > Good question. I'm not sure what's exactly going on > > > > > > > > > > > > > but it looks like with the overloadable attribute, > > > > > > > > > > > > > the old-style definition is treated as having > > > > > > > > > > > > > prototype. But if you do this: > > > > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > > > __attribute__((overloadable)) > > > > > > > > > > > > > void baz() {} > > > > > > > > > > > > > ``` > > > > > > > > > > > > > then there's the error: > > > > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > > > error: 'overloadable' function 'baz' must have a > > > > > > > > > > > > > prototype > > > > > > > > > > > > > void baz() { > > > > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > `void baz() {` does not come with a prototype. That's > > > > > > > > > > > > > for sure. Sounds like `int go(a) int a {;` can have > > > > > > > > > > > > > a prototype when it is loadable. I'm wondering why > > > > > > > > > > > > > it's not always treated as having prototype, since > > > > > > > > > > > > > the parameter type is there. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yeah, that seems like that divergence be worth > > > > > > > > > > > > understanding (& if possible fixing/avoiding/merging). > > > > > > > > > > > > Ensuring these features don't have subtle divergence I > > > > > > > > > > > > think will be valuable to having a model that's easy to > > > > > > > > > > > > explain/understand/modify/etc. > > > > > > > > > > > I took another look. I think the divergence comes from > > > > > > > > > > > `getAs<FunctionProtoType>` vs `hasPrototype`. The debug > > > > > > > > > > > data generation uses `hasPrototype` while > > > > > > > > > > > `getAs<FunctionProtoType>` is used as overloadable > > > > > > > > > > > attribute processing as long as unique linkage name > > > > > > > > > > > processing before this change. More specifically, the > > > > > > > > > > > following function definition is represented by > > > > > > > > > > > `FunctionProtoType` while it does not `hasPrototype`. > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > static int go(a) int a; { > > > > > > > > > > > return 3 + a; > > > > > > > > > > > } > > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > I was trying to have `CGDebugInfo` to check > > > > > > > > > > > `FunctionProtoType` instead of `hasPrototype`. While it > > > > > > > > > > > works for the code pattern in discussion, it also breaks > > > > > > > > > > > other tests including objectC tests. More investigation > > > > > > > > > > > is needed to understand what each term really means. > > > > > > > > > > Are you undertaking that investigation? It'd be good to > > > > > > > > > > address this divergence if possible. > > > > > > > > > > > > > > > > > > > > (@aprantl or @rsmith maybe you know something about this > > > > > > > > > > ObjC thing? ) > > > > > > > > > Haven't figured out anything useful yet. As far as I can > > > > > > > > > tell, the debug info generation code is shared between C++ > > > > > > > > > and ObjC. Using `getAs<FunctionProtoType>` works for C++ but > > > > > > > > > not for ObjectC where it crashes when computing a mangled > > > > > > > > > name for something like > > > > > > > > > > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > void test() { > > > > > > > > > __block A a; > > > > > > > > > ^{ (void)a; }; > > > > > > > > > } > > > > > > > > > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > Below are the failing tests which are all like that: > > > > > > > > > > > > > > > > > > Clang :: CodeGenCXX/cp-blocks-linetables.cpp > > > > > > > > > Clang :: > > > > > > > > > CodeGenCXX/debug-info-block-invocation-linkage-name.cpp > > > > > > > > > Clang :: CodeGenCXX/debug-info-blocks.cpp > > > > > > > > > Clang :: CodeGenObjCXX/nested-ehlocation.mm > > > > > > > > > Clang :: CodeGenObjCXX/property-objects.mm > > > > > > > > > Clang :: CodeGenObjCXX/synthesized-property-cleanup.mm > > > > > > > > > > > > > > > > > > > > > > > > > > > cc @bruno > > > > > > > > Ping on this - anyone got a chance to take a look? It'd be > > > > > > > > great to avoid this subtle inconsistency. > > > > > > > Ping again > > > > > > I tried a different route instead of using > > > > > > `getAs<FunctionProtoType>` in debug info generation which beaks the > > > > > > blocks function and objectC cases. Since the problem here is that > > > > > > the old-C function (`bar` in the test case) is not considered > > > > > > `hasPrototype`, I tried to unify `isKNRPrototype` and > > > > > > `hasPrototype` so that old-C functions are considered > > > > > > `hasPrototype`. It works for the name mangler but it breaks other > > > > > > places where `isKNRPrototype` should be excluded from > > > > > > `hasPrototype`. > > > > > Hrm - I'd really like to get to the bottom of this, but not sure who > > > > > to pull in. > > > > > > > > > > @aaron.ballman - do you know who might have some idea of how these > > > > > different old KNR C declarations work, and how this code might be > > > > > made more robust? > > > > Ugh, prototypes. They're not particularly well specified in the C > > > > standard IMHO. In C, a function with a prototype is one that declares > > > > the types of its parameters (C17 6.2.1p2) which is further clarified to > > > > be a function type with a parameter type list explicitly (C17 6.2.7p3, > > > > 6.9.1p7). However, the very end of 6.9.1p7 goes on to say that once you > > > > see the definition of the function, you know about its parameter type > > > > information, but it doesn't clarify whether this means the function now > > > > has a prototype or not. > > > > > > > > The result of this is that: > > > > ``` > > > > void f(); > > > > void call_it_once(void) { f(1.2f); } > > > > > > > > void f(a) float a; {} > > > > void call_if_twice(void) { f(1.2f); } > > > > ``` > > > > in `call_it_once`, the argument is promoted to a double, while in > > > > `call_it_twice`, the argument is not. I suspect we're hitting another > > > > variation of this confusing behavior here. > > > > > > > @aaron.ballman Thanks for taking a look. > > > > > > Do you know if/how this code could be phrased so that this code: > > > ``` > > > static int go(int); > > > > > > void baz() { > > > foo(); > > > bar(1); > > > go(2); > > > } > > > > > > static int go(a) int a; > > > { > > > return glob + a; > > > } > > > ``` > > > > > > Could test some property of `go` at the function call site that would be > > > consistent whether the definition of `go` came before or after the call > > > site? It seems currently this code behaves differently depending on that > > > order and I think that's a bit of a sharp corner it'd be good not to > > > have, if there's a tidier/more consistent way to phrase the code. > > To be honest, I wasn't aware this code was even valid (where you mix and > > match between identifier lists and parameter type lists), so that's neat. > > > > I might be confused, but in my tests, the behavior of > > `collectFunctionDeclProps()` is the same regardless of order, but the calls > > to `isUniqueInternalLinkageDecl()` are different. Given an invocation of: > > `-cc1 -triple x86_64-unknown-linux -debug-info-kind=limited > > -dwarf-version=4 -funique-internal-linkage-names -emit-llvm -o - test.c` > > with test.c as: > > ``` > > static int go(int); > > > > static int go(a) int a; > > { > > return 1 + a; > > } > > > > void baz() { > > go(2); > > } > > ``` > > I see `isUniqueInternalLinkageDecl()` gets called: > > * once for `go` with no prototype > > * once for `baz` with no prototype > > > > and `collectFunctionDeclProps()` gets called: > > * once for `baz` with no prototype > > * once for `go` with a prototype > > > > The end result is that I see `!13 = distinct !DISubprogram(name: "go", > > scope: !8, file: !8, line: 3, type: !14, scopeLine: 4, flags: > > DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: > > !0, retainedNodes: !2)` in the output. > > > > However, with test.c as: > > ``` > > static int go(int); > > > > void baz() { > > go(2); > > } > > > > static int go(a) int a; > > { > > return 1 + a; > > } > > ``` > > I see `isUniqueInternalLinkageDecl()` gets called: > > * once for `baz` with no prototype > > * once for `go` with a prototype > > * another one for `go` with a prototype > > > > and `collectFunctionDeclProps()` gets called: > > * once for `baz` with no prototype > > * once for `go` with a prototype > > > > The end result is that I see `!13 = distinct !DISubprogram(name: "go", > > linkageName: "_ZL2goi.__uniq.39558841650144213141281977295187289852", > > scope: !8, file: !8, line: 7, type: !14, scopeLine: 8, flags: > > DIFlagPrototyped, spFlags: DISPFlagLocalToUnit | DISPFlagDefinition, unit: > > !0, retainedNodes: !2)` > > > > When I change `isUniqueInternalLinkageDecl()` to use > > `!FD->getCanonicalDecl()->hasPrototype()`, I get the same behavior with > > either ordering. When I run the full test suite with that change, I get no > > test failures, so that may be a reasonable fix worth investigating (I'm not > > super familiar with the ins and outs of name mangling and whether this > > change would be correct or not). > Thanks so much, @aaron.ballman that does sound exactly like it summarizes the > situation and the suggestion sounds like it could be the right thing. > > @hoy does that all make sense to you/could you try @aaron.ballman's suggested > change/fix? Thanks for doing the experiment @aaron.ballman . Actually for the test in this diff, we would like the function `bar` to have a unique linkage name. The suggested change doesn't seem to fix that. ``` // bar should not be given a uniquefied name under -funique-internal-linkage-names, // since it doesn't come with valid prototype. static int bar(a) int a; { return glob + a; } ``` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D98799/new/ https://reviews.llvm.org/D98799 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits