tmsriram added a comment. In D93747#2481494 <https://reviews.llvm.org/D93747#2481494>, @dblaikie wrote:
> In D93747#2481383 <https://reviews.llvm.org/D93747#2481383>, @hoy wrote: > >> In D93747#2481304 <https://reviews.llvm.org/D93747#2481304>, @tmsriram wrote: >> >>> In D93747#2481223 <https://reviews.llvm.org/D93747#2481223>, @tmsriram >>> wrote: >>> >>>> In D93747#2481163 <https://reviews.llvm.org/D93747#2481163>, @dblaikie >>>> wrote: >>>> >>>>> In D93747#2481095 <https://reviews.llvm.org/D93747#2481095>, @hoy wrote: >>>>> >>>>>> In D93747#2481073 <https://reviews.llvm.org/D93747#2481073>, @dblaikie >>>>>> wrote: >>>>>> >>>>>>> Suggesting that gdb isn't using the DW_AT_name at all for "break >>>>>>> <function name>" but instead demangling and stripping off the extras >>>>>>> from the linkage name, and since it can't demangle this uniquified name >>>>>>> (unlike the mangled name used when using the overloadable attribute) >>>>>>> that degrades the debugger user experience? I'd have to test that in >>>>>>> more detail/with some hand-hacked DWARF. >>>>>> >>>>>> Yes, I think in your case the linage name can be demangled by the >>>>>> debugger. In my previous experiment, the uniquefied names could not be >>>>>> demangled therefore I was not able to breakpoint. >>>>> >>>>> Ah, did some more testing. Seems it's because it isn't a classically >>>>> mangled name. >>> >>> The simplest fix I can think of is to emit the base 10 number of the >>> md5hash. That would preserve all the existing uniqueness and be demangler >>> friendly. @hoy @dblaikie WDYT? >> >> I think using the base 10 form of md5hash is a smart workaround. Thanks for >> the suggestion. > > Sure - wouldn't mind having some wording from the Itanium ABI/mangling rules > that explain/corroborate what we've seen from testing to ensure we are using > it correctly and there aren't any other ways that might be more suitable, or > lurking gotchas we haven't tested. Yes, makes sense. Also, like @dblaikie pointed out in D94154 <https://reviews.llvm.org/D94154> it is now important that we don't generate the DW_AT_linkage_name for C style names using this suffix as it will not demangle. I think this makes it important that we only update this field if it already exists. >>>>> It might be possible for this uniquifying step to check if the name is >>>>> mangled (does it start with "_Z") and if it isn't mangled, it could >>>>> mangle it (check the length of the name, then "_Z<length><name>v") and >>>>> add the unique suffix. That looks to me like it preserves the original >>>>> debugging behavior? >>>>> >>>>> (has the problem that we don't actually know the mangling scheme at this >>>>> point - we do know it up in clang (where it might be Itanium mangling or >>>>> Microsoft mangling), might point again to the possibility this feature is >>>>> more suitable to implement in the frontend rather than a middle end pass. >>>>> And also the 'v' in the mangling is 'void return type', which is a lie - >>>>> not sure if we could do something better there) >> >> Doing name unification in the frontend sounds like the ultimate solution and >> since the frontend has all the knowledge about the name mangler. I think for >> now we can go with the solution @tmsriram suggested, i.e., using the base 10 >> form of md5 hash. > > Any general idea of how long "for now" would be? It doesn't seem like putting > this off is going to make things especially better & seems like more bug > fixes/changes/etc are being built around the solution as it is at the moment. > So I'd be hesitant to put off this kind of restructuring too long. > >>>>> I think it's pretty important that we don't break tradeoff debuggability >>>>> for profile accuracy. It'll make for a difficult tradeoff for our/any >>>>> users. >>>> >>>> Agreed, I didn't expect this. >>>> >>>>> (side note: using the original source file name seems problematic - I >>>>> know in google at least, the same source file is sometimes built into >>>>> more than one library/form with different preprocessor defines, so this >>>>> may not produce as unique a name as you are expecting?) >>>> >>>> It was a best effort and I think the hashing can be improved by using more >>>> signals other than just the module name if needed. For hard data though, >>>> this does significantly improve performance which clearly comes from >>>> better profile attribution so it does something. > > I'm OK with the idea that this helped the situation - but it doesn't seem > very principled/rigorous. Rather than a sliding scale of "better" I think > it'd be worth considering in more detail what would be required to make this > correct. > > Using the object file name may be more robust - also not perfect for all > build systems (one could imagine a distributed build system that /might/ be > able to get away with having the distributed builders always build a file > named x.o - only to have the distribution system rename the file to its > canonical name on the main machine before linking occurs - but I think that's > not how most distributed build systems work, and most build systems provide a > unique object file name across a build?) but maybe moreso than using the > input file name? (at least I think for google/blaze the object filename is > genuinely unique) So we are using the full path name of the source. We could also combine the object file name into the hash to make it more robust. I can work on this patch independently if that is alright. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93747/new/ https://reviews.llvm.org/D93747 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits