tmsriram added a comment.

In D93747#2481494 <https://reviews.llvm.org/D93747#2481494>, @dblaikie wrote:

> In D93747#2481383 <https://reviews.llvm.org/D93747#2481383>, @hoy wrote:
>
>> In D93747#2481304 <https://reviews.llvm.org/D93747#2481304>, @tmsriram wrote:
>>
>>> In D93747#2481223 <https://reviews.llvm.org/D93747#2481223>, @tmsriram 
>>> wrote:
>>>
>>>> In D93747#2481163 <https://reviews.llvm.org/D93747#2481163>, @dblaikie 
>>>> wrote:
>>>>
>>>>> In D93747#2481095 <https://reviews.llvm.org/D93747#2481095>, @hoy wrote:
>>>>>
>>>>>> In D93747#2481073 <https://reviews.llvm.org/D93747#2481073>, @dblaikie 
>>>>>> wrote:
>>>>>>
>>>>>>> Suggesting that gdb isn't using the DW_AT_name at all for "break 
>>>>>>> <function name>" but instead demangling and stripping off the extras 
>>>>>>> from the linkage name, and since it can't demangle this uniquified name 
>>>>>>> (unlike the mangled name used when using the overloadable attribute) 
>>>>>>> that degrades the debugger user experience? I'd have to test that in 
>>>>>>> more detail/with some hand-hacked DWARF.
>>>>>>
>>>>>> Yes, I think in your case the linage name can be demangled by the 
>>>>>> debugger. In my previous experiment, the uniquefied names could not be 
>>>>>> demangled therefore I was not able to breakpoint.
>>>>>
>>>>> Ah, did some more testing. Seems it's because it isn't a classically 
>>>>> mangled name.
>>>
>>> The simplest fix I can think of is to emit the base 10 number of the 
>>> md5hash.  That would preserve all the existing uniqueness and be demangler 
>>> friendly.  @hoy @dblaikie  WDYT?
>>
>> I think using the base 10 form of md5hash is a smart workaround. Thanks for 
>> the suggestion.
>
> Sure - wouldn't mind having some wording from the Itanium ABI/mangling rules 
> that explain/corroborate what we've seen from testing to ensure we are using 
> it correctly and there aren't any other ways that might be more suitable, or 
> lurking gotchas we haven't tested.

Yes, makes sense. Also, like @dblaikie  pointed out in D94154 
<https://reviews.llvm.org/D94154> it is now important that we don't generate 
the DW_AT_linkage_name for C style names using this suffix as it will not 
demangle.  I think this makes it important that we only update this field if it 
already exists.

>>>>> It might be possible for this uniquifying step to check if the name is 
>>>>> mangled (does it start with "_Z") and if it isn't mangled, it could 
>>>>> mangle it (check the length of the name, then "_Z<length><name>v") and 
>>>>> add the unique suffix. That looks to me like it preserves the original 
>>>>> debugging behavior?
>>>>>
>>>>> (has the problem that we don't actually know the mangling scheme at this 
>>>>> point - we do know it up in clang (where it might be Itanium mangling or 
>>>>> Microsoft mangling), might point again to the possibility this feature is 
>>>>> more suitable to implement in the frontend rather than a middle end pass. 
>>>>> And also the 'v' in the mangling is 'void return type', which is a lie - 
>>>>> not sure if we could do something better there)
>>
>> Doing name unification in the frontend sounds like the ultimate solution and 
>> since the frontend has all the knowledge about the name mangler. I think for 
>> now we can go with the solution @tmsriram suggested, i.e., using the base 10 
>> form of md5 hash.
>
> Any general idea of how long "for now" would be? It doesn't seem like putting 
> this off is going to make things especially better & seems like more bug 
> fixes/changes/etc are being built around the solution as it is at the moment. 
> So I'd be hesitant to put off this kind of restructuring too long.
>
>>>>> I think it's pretty important that we don't break tradeoff debuggability 
>>>>> for profile accuracy. It'll make for a difficult tradeoff for our/any 
>>>>> users.
>>>>
>>>> Agreed, I didn't expect this.
>>>>
>>>>> (side note: using the original source file name seems problematic - I 
>>>>> know in google at least, the same source file is sometimes built into 
>>>>> more than one library/form with different preprocessor defines, so this 
>>>>> may not produce as unique a name as you are expecting?)
>>>>
>>>> It was a best effort and I think the hashing can be improved by using more 
>>>> signals other than just the module name if needed.  For hard data though, 
>>>> this does significantly improve performance which clearly comes from 
>>>> better profile attribution so it does something.
>
> I'm OK with the idea that this helped the situation - but it doesn't seem 
> very principled/rigorous. Rather than a sliding scale of "better" I think 
> it'd be worth considering in more detail what would be required to make this 
> correct.
>
> Using the object file name may be more robust - also not perfect for all 
> build systems (one could imagine a distributed build system that /might/ be 
> able to get away with having the distributed builders always build a file 
> named x.o - only to have the distribution system rename the file to its 
> canonical name on the main machine before linking occurs - but I think that's 
> not how most distributed build systems work, and most build systems provide a 
> unique object file name across a build?) but maybe moreso than using the 
> input file name? (at least I think for google/blaze the object filename is 
> genuinely unique)

So we are using the full path name of the source.  We could also combine the 
object file name into the hash to make it more robust. I can work on this patch 
independently if that is alright.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93747/new/

https://reviews.llvm.org/D93747

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to