labath added a comment.

In D118812#3291954 <https://reviews.llvm.org/D118812#3291954>, @JDevlieghere 
wrote:

> In D118812#3291482 <https://reviews.llvm.org/D118812#3291482>, @dblaikie 
> wrote:
>
>> In D118812#3291303 <https://reviews.llvm.org/D118812#3291303>, @jingham 
>> wrote:
>>
>>> In D118812#3291109 <https://reviews.llvm.org/D118812#3291109>, @dblaikie 
>>> wrote:
>>>
>>>> Any chance you might want a limit on the size of the demangled name too? 
>>>> (might be worth considering what the most densely encoded mangled name is 
>>>> (i.e., what's the longest name that could be produced by a 10k-long mangled 
>>>> name?) and see if that's worth having another cutoff for)
>>>
>>> Ironically, lldb seldom cares about most of the goo in these long demangled 
>>> names.  At this point, we are building up our fast-lookup "name indexes".  
>>> We really only care about extracting the fully scoped names of the methods. 
>>>  When we get around to doing smart matching on overloads, we can still pull 
>>> out all the matches to the method name, and then do the overload match on 
>>> the results.  That should be sufficiently efficient, and obviate the need 
>>> to do any fancy indexing based on overloads.  So most of the work of 
>>> demangling these names is not being used anyway.
>>>
>>> So the better solution for lldb on the demangling front would be a way 
>>> to tell the demangler "only extract the full method name, and don't 
>>> bother producing the argument list or return values".  But I have no 
>>> idea how easy that would be in the demangler.
>>
>> I think there's an API level of the demangler in LLVM designed for rewriting 
>> demangled names (@rsmith created/implemented that, I think) - I'm not sure 
>> if it's structured to allow lazy parsing/stopping after you get the base 
>> name, for instance, but maybe...
>
> We should definitely look into that as a general optimization for indexing 
> the string table; it would make sense in combination with D118814 
> <https://reviews.llvm.org/D118814>. For this particular patch, we're trying 
> to avoid demangling at all if the symbol is too long, so unless a partial 
> demangle is really cheap (it might be) we'd still want to exclude symbols 
> based on their mangled length.

The most expensive step in demangling is the actual construction of the 
demangled string. It's fairly easy to make that exponential (because the 
output string can be exponentially larger than the input). The construction of 
the AST (well, a kind of a DAG actually) should always be linear.
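
To illustrate the linear-DAG versus exponential-string point, here is a toy 
model. It is not the LLVM demangler's data structure: `Node`, `buildDag` and 
`print` are made-up names, and the "P<x, x>" shape is just an assumed stand-in 
for a name that refers to the same substitution twice.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy node, not an LLVM type. A node is either a leaf ("i") or
// "P<child, child>", where both template arguments are the same shared child.
struct Node {
  int child; // index of the shared child in the node vector, or -1 for a leaf
};

// Builds depth + 1 nodes, so the DAG grows linearly with depth.
std::vector<Node> buildDag(int depth) {
  std::vector<Node> nodes;
  nodes.push_back(Node{-1}); // the leaf
  for (int i = 0; i < depth; ++i)
    nodes.push_back(Node{static_cast<int>(nodes.size()) - 1});
  return nodes;
}

// Prints the node at `index`. The shared child appears twice in the output,
// so the string roughly doubles in length with every level.
std::string print(const std::vector<Node> &nodes, int index) {
  const Node &n = nodes[index];
  if (n.child < 0)
    return "i";
  const std::string arg = print(nodes, n.child);
  return "P<" + arg + ", " + arg + ">";
}
```

With `buildDag(10)` the DAG has 11 nodes, while `print(nodes, 10)` produces a 
6139-character string; every extra level adds one node but roughly doubles 
the output.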

And extracting the name this way will also save us from having to do another 
parse of the demangled name (to extract the base name), so it's double 
goodness. I don't think the actual extraction should be that hard. The 
trickiest part is understanding the way in which the names are encoded so 
that you know what to look for.
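
To give a feel for the encoding, here is a deliberately minimal sketch that 
pulls the scoped name straight out of a mangled string. It assumes only the 
simplest Itanium forms, `_Z<len><id>...` and `_ZN<len><id>...E`, and gives up 
(returns an empty string) on templates, substitutions, cv-qualified methods 
and everything else. `extractScopedName` is a hypothetical helper, not an 
existing lldb or LLVM function; a real implementation would walk the 
demangler's AST instead.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns "a::b::c" for plain (nested) source names,
// or an empty string for any mangled form this sketch does not understand.
std::string extractScopedName(const std::string &mangled) {
  if (mangled.compare(0, 2, "_Z") != 0)
    return "";
  std::size_t pos = 2;
  // 'N' opens a nested name, which must be closed by 'E'.
  const bool nested = pos < mangled.size() && mangled[pos] == 'N';
  if (nested)
    ++pos;

  std::string result;
  while (pos < mangled.size() &&
         std::isdigit(static_cast<unsigned char>(mangled[pos]))) {
    // Each component is a decimal length followed by that many characters.
    std::size_t len = 0;
    while (pos < mangled.size() &&
           std::isdigit(static_cast<unsigned char>(mangled[pos])))
      len = len * 10 + static_cast<std::size_t>(mangled[pos++] - '0');
    if (len == 0 || len > mangled.size() - pos)
      return "";
    if (!result.empty())
      result += "::";
    result.append(mangled, pos, len);
    pos += len;
    if (!nested)
      break; // an unscoped name has exactly one component
  }

  if (result.empty())
    return "";
  if (nested && (pos >= mangled.size() || mangled[pos] != 'E'))
    return ""; // template args, substitutions, etc.: not handled here
  return result;
}
```

For example, `extractScopedName("_ZN3foo3barEv")` yields `foo::bar` and 
`extractScopedName("_Z3bazi")` yields `baz`, without ever looking at the 
argument list.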


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118812/new/

https://reviews.llvm.org/D118812

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
