hoy added a comment. In D93747#2482730 <https://reviews.llvm.org/D93747#2482730>, @tmsriram wrote:
> In D93747#2482726 <https://reviews.llvm.org/D93747#2482726>, @hoy wrote: > >> In D93747#2482519 <https://reviews.llvm.org/D93747#2482519>, @tmsriram wrote: >> >>> In D93747#2481494 <https://reviews.llvm.org/D93747#2481494>, @dblaikie >>> wrote: >>> >>>> In D93747#2481383 <https://reviews.llvm.org/D93747#2481383>, @hoy wrote: >>>> >>>>> In D93747#2481304 <https://reviews.llvm.org/D93747#2481304>, @tmsriram >>>>> wrote: >>>>> >>>>>> In D93747#2481223 <https://reviews.llvm.org/D93747#2481223>, @tmsriram >>>>>> wrote: >>>>>> >>>>>>> In D93747#2481163 <https://reviews.llvm.org/D93747#2481163>, @dblaikie >>>>>>> wrote: >>>>>>> >>>>>>>> In D93747#2481095 <https://reviews.llvm.org/D93747#2481095>, @hoy >>>>>>>> wrote: >>>>>>>> >>>>>>>>> In D93747#2481073 <https://reviews.llvm.org/D93747#2481073>, >>>>>>>>> @dblaikie wrote: >>>>>>>>> >>>>>>>>>> Suggesting that gdb isn't using the DW_AT_name at all for "break >>>>>>>>>> <function name>" but instead demangling and stripping off the extras >>>>>>>>>> from the linkage name, and since it can't demangle this uniquified >>>>>>>>>> name (unlike the mangled name used when using the overloadable >>>>>>>>>> attribute) that degrades the debugger user experience? I'd have to >>>>>>>>>> test that in more detail/with some hand-hacked DWARF. >>>>>>>>> >>>>>>>>> Yes, I think in your case the linage name can be demangled by the >>>>>>>>> debugger. In my previous experiment, the uniquefied names could not >>>>>>>>> be demangled therefore I was not able to breakpoint. >>>>>>>> >>>>>>>> Ah, did some more testing. Seems it's because it isn't a classically >>>>>>>> mangled name. >>>>>> >>>>>> The simplest fix I can think of is to emit the base 10 number of the >>>>>> md5hash. That would preserve all the existing uniqueness and be >>>>>> demangler friendly. @hoy @dblaikie WDYT? >>>>> >>>>> I think using the base 10 form of md5hash is a smart workaround. Thanks >>>>> for the suggestion. >>>> >>>> Sure - wouldn't mind having some wording from the Itanium ABI/mangling >>>> rules that explain/corroborate what we've seen from testing to ensure we >>>> are using it correctly and there aren't any other ways that might be more >>>> suitable, or lurking gotchas we haven't tested. >>> >>> Yes, makes sense. Also, like @dblaikie pointed out in D94154 >>> <https://reviews.llvm.org/D94154> it is now important that we don't >>> generate the DW_AT_linkage_name for C style names using this suffix as it >>> will not demangle. I think this makes it important that we only update >>> this field if it already exists. >> >> Yes, it makes sense to do the renaming only if the linkage name preexists to >> not break the debuggability. Unfortunately we won't be able to support C >> with this patch. > > There is nothing needed to support C right? overloadable attribute will > mangle with _Z so it will work as expected? What did I miss? I mean from the AutoFDO point of view, C programs with static functions are not getting the unique debug name support. >>>>>>>> It might be possible for this uniquifying step to check if the name is >>>>>>>> mangled (does it start with "_Z") and if it isn't mangled, it could >>>>>>>> mangle it (check the length of the name, then "_Z<length><name>v") and >>>>>>>> add the unique suffix. That looks to me like it preserves the original >>>>>>>> debugging behavior? >>>>>>>> >>>>>>>> (has the problem that we don't actually know the mangling scheme at >>>>>>>> this point - we do know it up in clang (where it might be Itanium >>>>>>>> mangling or Microsoft mangling), might point again to the possibility >>>>>>>> this feature is more suitable to implement in the frontend rather than >>>>>>>> a middle end pass. And also the 'v' in the mangling is 'void return >>>>>>>> type', which is a lie - not sure if we could do something better there) >>>>> >>>>> Doing name unification in the frontend sounds like the ultimate solution >>>>> and since the frontend has all the knowledge about the name mangler. I >>>>> think for now we can go with the solution @tmsriram suggested, i.e., >>>>> using the base 10 form of md5 hash. >>>> >>>> Any general idea of how long "for now" would be? It doesn't seem like >>>> putting this off is going to make things especially better & seems like >>>> more bug fixes/changes/etc are being built around the solution as it is at >>>> the moment. So I'd be hesitant to put off this kind of restructuring too >>>> long. >> >> I'm not sure. It looks like implementation in the front end has more >> complexity as discussed in the other thread D94154 >> <https://reviews.llvm.org/D94154>. >> >>>>>>>> I think it's pretty important that we don't break tradeoff >>>>>>>> debuggability for profile accuracy. It'll make for a difficult >>>>>>>> tradeoff for our/any users. >>>>>>> >>>>>>> Agreed, I didn't expect this. >>>>>>> >>>>>>>> (side note: using the original source file name seems problematic - I >>>>>>>> know in google at least, the same source file is sometimes built into >>>>>>>> more than one library/form with different preprocessor defines, so >>>>>>>> this may not produce as unique a name as you are expecting?) >>>>>>> >>>>>>> It was a best effort and I think the hashing can be improved by using >>>>>>> more signals other than just the module name if needed. For hard data >>>>>>> though, this does significantly improve performance which clearly comes >>>>>>> from better profile attribution so it does something. >>>> >>>> I'm OK with the idea that this helped the situation - but it doesn't seem >>>> very principled/rigorous. Rather than a sliding scale of "better" I think >>>> it'd be worth considering in more detail what would be required to make >>>> this correct. >>>> >>>> Using the object file name may be more robust - also not perfect for all >>>> build systems (one could imagine a distributed build system that /might/ >>>> be able to get away with having the distributed builders always build a >>>> file named x.o - only to have the distribution system rename the file to >>>> its canonical name on the main machine before linking occurs - but I think >>>> that's not how most distributed build systems work, and most build systems >>>> provide a unique object file name across a build?) but maybe moreso than >>>> using the input file name? (at least I think for google/blaze the object >>>> filename is genuinely unique) >>> >>> So we are using the full path name of the source. We could also combine >>> the object file name into the hash to make it more robust. I can work on >>> this patch independently if that is alright. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93747/new/ https://reviews.llvm.org/D93747 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits