On Fri, Jan 26, 2018 at 8:38 AM, Erik Pilkington via lldb-dev <lldb-dev@lists.llvm.org> wrote:
>
>
> On 2018-01-25 1:58 PM, Greg Clayton wrote:
>>>
>>> On Jan 25, 2018, at 10:25 AM, Erik Pilkington <erik.pilking...@gmail.com>
>>> wrote:
>>>
>>> Hi,
>>> I'm not at all familiar with LLDB, but I've been doing some work on the
>>> demangler in libcxxabi. It's still a work in progress and I haven't yet
>>> copied the changes over to ItaniumDemangle, which AFAIK is what lldb uses.
>>> The demangler in libcxxabi now demangles the symbol you attached in 3.31
>>> seconds, instead of 223.54 seconds, on my machine. I posted an RFC on my
>>> work here
>>> (http://lists.llvm.org/pipermail/llvm-dev/2017-June/114448.html), but
>>> basically the new demangler just produces an AST then traverses it to print
>>> the demangled name.
>>
>> Great to hear the huge speedup in demangling! LLDB actually has two
>> demanglers: a fast one that can demangle 99% of names, and we fall back to
>> ItaniumDemangle which can do all names but is really slow. It would be fun
>> to compare your new demangler with the fast one and see if we can get rid of
>> the fast demangler now.
>>>
>>>
>>> I think a good way of making this even faster is to have LLDB consume the
>>> AST the demangler produces directly. The AST is a better representation of
>>> the information that LLDB wants, and finishing the demangle and then fishing
>>> out that information from the output string is unfortunate. From the AST, it
>>> would be really straightforward to just individually print all the
>>> components of the name that LLDB wants.
>>
>> This would help us to grab the important bits out of the mangled name as
>> well.
>> We chop up a demangled name to find the base name (string for
>> std::string) and the containing context (std:: for std::string), and we
>> check if we can tell whether the function is a method (look for a trailing
>> "const" modifier on the function) versus a top-level function, since the
>> mangling doesn't fully specify what is a namespace and what is a class (in
>> "foo::bar::baz()" we don't know if "foo" or "bar" are classes or
>> namespaces). So the AST would be great as long as it is fast.
>>
>>> Most of the time it takes to demangle these "symbols from hell" is during
>>> the printing, after the AST has been parsed, because the demangler has to
>>> flatten out all the potentially nested back references. Just parsing to an
>>> AST should be about proportional to the strlen of the mangled name. Since
>>> (AFAIK) LLDB doesn't use some sections of the demangled name often (such as
>>> parameters), from the AST LLDB could lazily decide not to even bother fully
>>> demangling some sections of the name, then if it ever needs them it could
>>> parse a new AST and get them from there. I think this would largely fix the
>>> issue, as most of the time these crazy expansions don't occur in the name
>>> itself, but in the parameters or return type. Even when they do appear in
>>> the name, it would be possible to do some simple name classification (i.e.,
>>> does this symbol refer to a function) or pull out the basename quickly
>>> without expanding anything at all.
>>>
>>> Any thoughts? I'm really not at all familiar with LLDB, so I could have
>>> this all wrong!
>>
>> AST sounds great. We can put this into the class we use to chop up C++
>> names as that is really our goal.
>>
>> So it would be great to do a speed comparison between our fast demangler
>> in LLDB (in FastDemangle.cpp/.h) and your updated libcxxabi version. If
>> yours is faster, remove FastDemangle and then update
>> llvm::itaniumDemangle() to use your new code.
>>
>> ASTs would be great for the C++ name parser.
>>
>> Let us know what you are thinking,
>
>
> Hi Greg,
>
> I'm almost finished with my work on the demangler; hopefully I'll be done
> within a few weeks. Once that's all finished I'll look into exporting the
> AST and comparing it to FastDemangle. I was thinking about adding a version
> of llvm::itaniumDemangle() that returns an opaque handle to the AST and
> defining some functions on the LLVM side that take that handle and return
> some extra information. I'd be happy to help out with the LLDB side of
> things too, although it might be better if someone more experienced with
> LLDB did this.
>
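To make the "opaque handle plus query functions" idea above concrete, here is a minimal sketch. Everything in it is an assumption for illustration: ToyAst, toyParse, toyBaseName and toyContext are hypothetical names, not LLVM or libcxxabi API, and the parser only understands plain _Z<name> and _ZN<name>...E symbols (no templates, substitutions, or operators). It never expands whatever follows the name.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical handle type. In a real split, callers would only see a forward
// declaration and the definition would live on the demangler side.
struct ToyAst {
  std::vector<std::string> components;  // e.g. {"foo", "bar", "baz"}
  bool isFunction = false;              // true if an encoding follows the name
};

// Reads one <length><identifier> source name starting at pos.
static bool readSourceName(const std::string &s, std::size_t &pos,
                           std::string &out) {
  std::size_t len = 0;
  bool sawDigit = false;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
    len = len * 10 + static_cast<std::size_t>(s[pos] - '0');
    ++pos;
    sawDigit = true;
  }
  if (!sawDigit || len == 0 || len > s.size() - pos) return false;
  out = s.substr(pos, len);
  pos += len;
  return true;
}

// Parses only the toy subset described above; returns nullptr otherwise.
std::unique_ptr<ToyAst> toyParse(const std::string &mangled) {
  if (mangled.compare(0, 2, "_Z") != 0) return nullptr;
  auto ast = std::make_unique<ToyAst>();
  std::size_t pos = 2;
  std::string part;
  if (pos < mangled.size() && mangled[pos] == 'N') {
    ++pos;  // nested name: a run of source names terminated by 'E'
    while (pos < mangled.size() && mangled[pos] != 'E') {
      if (!readSourceName(mangled, pos, part)) return nullptr;
      ast->components.push_back(part);
    }
    if (pos >= mangled.size() || ast->components.empty()) return nullptr;
    ++pos;  // consume 'E'
  } else {
    if (!readSourceName(mangled, pos, part)) return nullptr;
    ast->components.push_back(part);
  }
  // The remainder (parameter encoding) is noted but deliberately not parsed.
  ast->isFunction = pos < mangled.size();
  return ast;
}

// Query: the last component of the qualified name.
std::string toyBaseName(const ToyAst &ast) {
  assert(!ast.components.empty());
  return ast.components.back();
}

// Query: everything before the base name, joined with "::".
std::string toyContext(const ToyAst &ast) {
  std::string ctx;
  for (std::size_t i = 0; i + 1 < ast.components.size(); ++i) {
    if (!ctx.empty()) ctx += "::";
    ctx += ast.components[i];
  }
  return ctx;
}
```

For _ZN3foo3bar3bazEv (that is, foo::bar::baz()), toyBaseName gives "baz" and toyContext gives "foo::bar" without building the demangled string first. As Greg notes, this still cannot say whether foo or bar is a class or a namespace, since the mangling does not record that.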
That's great to hear. Not having 3 different demanglers scattered between lldb and llvm will be a big win for everybody.

--
Davide
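The point made earlier in the thread, that parsing stays roughly linear while printing is where back references blow up, can be illustrated with a toy AST. This is only a sketch under stated assumptions: TypeNode, FunctionNode, makeNested and baseNameOnly are hypothetical names, and sharing a child node twice stands in for a back reference. It is not how libcxxabi represents its nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy type node: a name plus optional template arguments. Children are
// shared, so one parsed subtree can be referenced from several places.
struct TypeNode {
  std::string name;
  std::vector<std::shared_ptr<const TypeNode>> args;
};

// Toy function node: the base name is stored apart from the parameters.
struct FunctionNode {
  std::string baseName;
  std::vector<std::shared_ptr<const TypeNode>> params;
};

// Fully expands a type into text; a shared subtree is printed once per use.
void printType(const TypeNode &n, std::string &out) {
  out += n.name;
  if (n.args.empty()) return;
  out += '<';
  for (std::size_t i = 0; i < n.args.size(); ++i) {
    if (i != 0) out += ", ";
    printType(*n.args[i], out);
  }
  out += '>';
}

// Builds P<P<...P<int, int>...>> with `depth` levels, allocating only
// depth + 1 nodes because each level refers to the previous one twice.
std::shared_ptr<const TypeNode> makeNested(int depth) {
  assert(depth >= 0);
  std::shared_ptr<const TypeNode> cur =
      std::make_shared<TypeNode>(TypeNode{"int", {}});
  for (int i = 0; i < depth; ++i) {
    cur = std::make_shared<TypeNode>(TypeNode{"P", {cur, cur}});
  }
  return cur;
}

// Lazy query: returns the base name without visiting any parameter node.
const std::string &baseNameOnly(const FunctionNode &fn) { return fn.baseName; }
```

With makeNested(10) as the only parameter, the AST holds 11 type nodes, but printing that parameter produces 8187 characters, and each extra level roughly doubles it. baseNameOnly never touches the parameters, which is the kind of query LLDB could make cheaply if it consumed the AST directly.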