One more short question, as you seem to be much more familiar with the code than me. Maybe you have an idea?
In the unit test I prepared, I didn't manage to cover this line in Symtab::InitNameIndexes(): https://github.com/llvm-mirror/lldb/blob/master/source/Symbol/Symtab.cpp#L348 With both demangle implementations, the old one and IPD, annotations cause demangling to fail/return empty name. I assume this branch has its right to exist and I would just keep it, but it would also be great to find an explanation and a repro to cover it. Am 25.07.18 um 20:15 schrieb Stefan Gränitz: > Hi Pavel, thanks for your feedback! > >> I would propose to create a new class (RichMangingInfo?), which would >> wrap both ItaniumPartialDemangler and the existing string-chopping >> implementation, and provide a unified interface on top of them. > Right, good point. Adding an abstraction like this certainly makes sense. > >> we might be able >> to get further speedup by having DemangleWithRichNameIndexInfo *not* >> fill out the demangled string (as, I understand, building the actual >> string is the most expensive operation), and just fetching the bits of >> info we need from the IPD. > Yes, if it turns out we really don't need the demangled names in a table > and instead just render them on-demand, I guess we'd gain additional > speedup plus reduction in memory usage. I will have a closer look on this. > >> right now the Symtab class builds a table containing all >> demangled names. This is mostly unnecessary, and > What exact table are you referring to here? The m_name_to_index map does > store both, demangled and mangled strings. Not sure to what extend the > demangled ones are necessary. It's a little hard to judge, but I can try > and check. Or do you mean a different one? > > There is one more follow-up task here: Now that FastDemangle isn't used > anymore in Mangled::GetDemangledName(), its last remaining usage is > SubsPrimitiveParmItanium() in CPlusPlusLanguage.cpp. I will try to > figure out if we still need it or if we can switch it to the IPD too. > Then we could consider to get rid of FastMangle altogether. > > Any objections here? > > > Am 25.07.18 um 13:11 schrieb Pavel Labath: >> Hello Stefan, >> >> thank you for working on this. I believe this work will bring a huge >> improvement to LLDB. >> >> Adding something like DemangleWithRichNameIndexInfo to the Mangled >> class makes sense to me. However, I don't think that function should >> be taking an ItaniumPartialDemangler as an argument. The reason for >> that is that ItaniumPartialDemangler is, unsurprisingly, specific to >> the itanium mangling scheme, and the Mangled class tries to hide the >> specifics of different manglings. >> >> I would propose to create a new class (RichMangingInfo?), which would >> wrap both ItaniumPartialDemangler and the existing string-chopping >> implementation, and provide a unified interface on top of them. >> Internally it would use IPD for the itanium scheme and the legacy >> implementation for the MSVC mangling, but the user would not have to >> care about that, as it can just ask questions like "is this a >> constructor?" and get the correct answer either way. If ever we get >> something similar to IPD for the MSVC mangling scheme, we can just >> replace the legacy implementation with that one, and nobody should >> know the difference. >> >> What do you think? >> >> >> BTW, right now the Symtab class builds a table containing all >> demangled names. This is mostly unnecessary, and in fact we have >> recently removed a similar table for the demangled debug_info names. >> The only present use for it that I am aware of is papering over a >> clang "bug"*. If we are able to get rid of this, then we might be able >> to get further speedup by having DemangleWithRichNameIndexInfo *not* >> fill out the demangled string (as, I understand, building the actual >> string is the most expensive operation), and just fetching the bits of >> info we need from the IPD. >> >> (*) The "bug" is that sometimes clang will not generate the C1 (full >> object) flavour of an constructor if it is identical to the C2 version >> (even though the CU in question definitely constructs full objects of >> the given class). This means that we are not able able to construct >> some objects during expression evaluation as we cannot find the C1 >> version anywhere. The demangling makes this work accidentally, as were >> are able to match up the constructors by demangled names, as they both >> demangle to the same string. I have recently fixed clang to emit C1 >> always (at -O0), but the question remains on what to do with older >> compilers. >> >> >> On Tue, 24 Jul 2018 at 21:55, Stefan Gränitz <stefan.graen...@gmail.com> >> wrote: >>> Hello everyone >>> >>> I am relatively new to the LLDB sources and just submitted my first own >>> patch for review in Phabricator. Based on this patch I would like to >>> discuss a few details for further improvements on LLDB's demangling. >>> >>> First a short recap on the current state: >>> * Name demangling used to be a lazy process, exclusively accessible via >>> Mangled::GetDemangledName() - this is a valuable mechanism: >>> https://github.com/llvm-mirror/lldb/blob/8ba903256fd92a2d8644b108a7c8a1a15efd90ad/source/Core/Mangled.cpp#L252 >>> * My patch wants to replace the existing combination of FastDemangle & >>> itaniumDemangle() with LLVM's new ItaniumPartialDemangler (IPD) >>> implementation and no fallbacks anymore. It slightly reduces complexity >>> and slightly improves performance, but doesn't introduce conceptual >>> changes: https://reviews.llvm.org/D49612 >>> * IPD provides rich information on names, e.g. isFuntion() or >>> isCtorOrDtor(), but stores that in its own state rather than returning a >>> queriable object: >>> https://github.com/llvm-mirror/llvm/blob/a3de0cbb8f4d886a968d20a8c6a6e8aa01d28c2a/include/llvm/Demangle/Demangle.h#L36 >>> * IPD's rich info could help LLDB, where it currently parses mangled >>> names on its own, on-top of demangling. Symtab::InitNameIndexes() seems >>> to be the most prominent such place. LLDB builds an index with various >>> categories from all its symbols here. This is performance-critical and >>> it does not benefit from the laziness in GetDemangledName(): >>> https://github.com/llvm-mirror/lldb/blob/8ba903256fd92a2d8644b108a7c8a1a15efd90ad/source/Symbol/Symtab.cpp#L218 >>> >>> My simple switch doesn't exploit IPD's rich demangling info yet and it >>> uses a new IPD instance for each demangling request, which is considered >>> quite costly as it uses a bump allocator internally. Over-all >>> performance still didn't drop, but even seems to benefit. >>> >>> In order to fully exploit the remaining potential, I am thinking about >>> the following changes: >>> >>> (1) In the Mangled class, add a separate entry-point for batch >>> demangling, that allows to pass in an existing IPD: >>> bool Mangled::DemangleWithRichNameIndexInfo(ItaniumPartialDemangler &IPD); >>> >>> (2) DemangleWithRichNameIndexInfo() will demangle explicitly, which is >>> required to make sure we gather IPD's rich info. It's not lazy as >>> GetDemangledName(), but it will store the demangled name and set the >>> "MangledCounterpart" so that subsequent lazy requests will be fast. >>> >>> (3) DemangleWithRichNameIndexInfo() will be used by >>> Symtab::InitNameIndexes(), which will have a single IPD instance that is >>> reused for all symbols. Symtab::InitNameIndexes() is usually called >>> before anything else, so it is basically "warming the cache" here. >>> >>> (4) Finally, with IPD's rich info, we can get rid of the additional >>> string parsing in Symtab::InitNameIndexes(). I expect a considerable >>> speedup here too. >>> >>> What do you think about the plan? >>> Do you think it's a good idea to add DemangleWithRichNameIndexInfo() >>> like this? >>> Are you aware of more batch-processing places like >>> Symtab::InitNameIndexes(), that I should consider as clients for >>> DemangleWithRichNameIndexInfo()? >>> Do you know potential side-effects I must be aware of? >>> Would you consider the evidence on the performance benefits convincing, >>> or do you think it needs bulletproof benchmarking numbers? >>> >>> When it comes to MSVC-mangled names: >>> * It is certainly necessary to keep a legacy version of the current >>> categorization mechanism for these. But in general, what do you think >>> about their importance for LLDB? (Personally I would like to see LLDB on >>> Windows, but I tried it only once and gave up quickly.) >>> * I saw there is a new MicrosoftDemangler now in LLVM. Does anyone know >>> more about it? Especially: Are there plans to provide rich demangling >>> information similar to the IPD? >>> >>> So far I started writing a unit test for Symtab::InitNameIndexes(), so I >>> won't accidentally break its indexing. I also experimented with a >>> potential DemangleWithRichNameIndexInfo() and had a look on the numbers >>> of the internal LLDB timers. This was, however, not exhaustive and real >>> benchmarking is always hard. >>> >>> Thanks for all kinds of feedback. >>> >>> Best, >>> Stefan >>> >>> -- https://weliveindetail.github.io/blog/ https://cryptup.org/pub/stefan.graen...@gmail.com _______________________________________________ lldb-dev mailing list lldb-dev@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev