dblaikie added a comment. In D93747#2481383 <https://reviews.llvm.org/D93747#2481383>, @hoy wrote:
> In D93747#2481304 <https://reviews.llvm.org/D93747#2481304>, @tmsriram wrote: > >> In D93747#2481223 <https://reviews.llvm.org/D93747#2481223>, @tmsriram wrote: >> >>> In D93747#2481163 <https://reviews.llvm.org/D93747#2481163>, @dblaikie >>> wrote: >>> >>>> In D93747#2481095 <https://reviews.llvm.org/D93747#2481095>, @hoy wrote: >>>> >>>>> In D93747#2481073 <https://reviews.llvm.org/D93747#2481073>, @dblaikie >>>>> wrote: >>>>> >>>>>> Suggesting that gdb isn't using the DW_AT_name at all for "break >>>>>> <function name>" but instead demangling and stripping off the extras >>>>>> from the linkage name, and since it can't demangle this uniquified name >>>>>> (unlike the mangled name used when using the overloadable attribute) >>>>>> that degrades the debugger user experience? I'd have to test that in >>>>>> more detail/with some hand-hacked DWARF. >>>>> >>>>> Yes, I think in your case the linage name can be demangled by the >>>>> debugger. In my previous experiment, the uniquefied names could not be >>>>> demangled therefore I was not able to breakpoint. >>>> >>>> Ah, did some more testing. Seems it's because it isn't a classically >>>> mangled name. >> >> The simplest fix I can think of is to emit the base 10 number of the >> md5hash. That would preserve all the existing uniqueness and be demangler >> friendly. @hoy @dblaikie WDYT? > > I think using the base 10 form of md5hash is a smart workaround. Thanks for > the suggestion. Sure - wouldn't mind having some wording from the Itanium ABI/mangling rules that explain/corroborate what we've seen from testing to ensure we are using it correctly and there aren't any other ways that might be more suitable, or lurking gotchas we haven't tested. >>>> It might be possible for this uniquifying step to check if the name is >>>> mangled (does it start with "_Z") and if it isn't mangled, it could mangle >>>> it (check the length of the name, then "_Z<length><name>v") and add the >>>> unique suffix. That looks to me like it preserves the original debugging >>>> behavior? >>>> >>>> (has the problem that we don't actually know the mangling scheme at this >>>> point - we do know it up in clang (where it might be Itanium mangling or >>>> Microsoft mangling), might point again to the possibility this feature is >>>> more suitable to implement in the frontend rather than a middle end pass. >>>> And also the 'v' in the mangling is 'void return type', which is a lie - >>>> not sure if we could do something better there) > > Doing name unification in the frontend sounds like the ultimate solution and > since the frontend has all the knowledge about the name mangler. I think for > now we can go with the solution @tmsriram suggested, i.e., using the base 10 > form of md5 hash. Any general idea of how long "for now" would be? It doesn't seem like putting this off is going to make things especially better & seems like more bug fixes/changes/etc are being built around the solution as it is at the moment. So I'd be hesitant to put off this kind of restructuring too long. >>>> I think it's pretty important that we don't break tradeoff debuggability >>>> for profile accuracy. It'll make for a difficult tradeoff for our/any >>>> users. >>> >>> Agreed, I didn't expect this. >>> >>>> (side note: using the original source file name seems problematic - I know >>>> in google at least, the same source file is sometimes built into more than >>>> one library/form with different preprocessor defines, so this may not >>>> produce as unique a name as you are expecting?) >>> >>> It was a best effort and I think the hashing can be improved by using more >>> signals other than just the module name if needed. For hard data though, >>> this does significantly improve performance which clearly comes from better >>> profile attribution so it does something. I'm OK with the idea that this helped the situation - but it doesn't seem very principled/rigorous. Rather than a sliding scale of "better" I think it'd be worth considering in more detail what would be required to make this correct. Using the object file name may be more robust - also not perfect for all build systems (one could imagine a distributed build system that /might/ be able to get away with having the distributed builders always build a file named x.o - only to have the distribution system rename the file to its canonical name on the main machine before linking occurs - but I think that's not how most distributed build systems work, and most build systems provide a unique object file name across a build?) but maybe moreso than using the input file name? (at least I think for google/blaze the object filename is genuinely unique) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93747/new/ https://reviews.llvm.org/D93747 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits