A simpler solution is to remove the lock in Module::GetDescription(...) all of the data that is used to dump the module description is set correctly (the file, the arch and the object name for BSD archives)) very early on when a module is created before anyone should be making symbol or debug info queries or parsing.
> On Feb 5, 2021, at 1:24 PM, Jorge Gorbe Moya via lldb-dev > <lldb-dev@lists.llvm.org> wrote: > > I just started looking at how to do this (having a separate mutex for the > description) and I think I found another bug. Or maybe I'm missing some > assumption. > > On one hand, Module::SetArchitecture won't assign the new value if m_arch is > already valid, just return m_arch.IsCompatibleWith(new_arch). On the other > hand, in Module::MergeArchitecture we have both code that assumes that > SetArchitecture replaces the previous value: > > if (!m_arch > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Core/Module.h;rcl=355613052;l=939>.IsCompatibleMatch > > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Utility/ArchSpec.cpp;rcl=355613052;l=934>(arch_spec > > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1620?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>)) > { > // The new architecture is different, we just need to replace it. > return SetArchitecture > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1544?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>(arch_spec > > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1620?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>); > } > > and right after it we have code that works around the fact that > SetArchitecture doesn't replace the previous value sometimes > > // Merge bits from arch_spec into "merged_arch" and set our architecture. > ArchSpec > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Utility/ArchSpec.h;rcl=355613052;l=33> > merged_arch > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;bpv=1;bpt=1;l=1633?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental&gsn=merged_arch&gs=kythe%3A%2F%2Fgoogle3%3Flang%3Dc%252B%252B%3Fpath%3Dthird_party%2Fllvm%2Fllvm-project%2Flldb%2Fsource%2FCore%2FModule.cpp%23f1bDLAwnC-gD__Xt6Bs-4h5T-ePUhWBq8L6dQxa8QEA>(m_arch > > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Core/Module.h;rcl=355613052;l=939>); > merged_arch > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1633?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>.MergeFrom > > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Utility/ArchSpec.cpp;rcl=355613052;l=801>(arch_spec > > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1620?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>); > // SetArchitecture() is a no-op if m_arch is already valid. > m_arch > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Core/Module.h;rcl=355613052;l=939> > = > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Utility/ArchSpec.h;rcl=355613052;l=33> > ArchSpec > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Utility/ArchSpec.cpp;rcl=355613052;l=510>(); > return SetArchitecture > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1544?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>(merged_arch > > <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1633?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>); > > My guess is that the first of the snippets above has a bug (it intends to > replace the architecture but doesn't) and that replacing a module's > architecture with an incompatible one is something that doesn't happen often > and the bug went unnoticed. Does this make sense? > > On Thu, Feb 4, 2021 at 5:38 PM Raphael Isemann <teempe...@gmail.com > <mailto:teempe...@gmail.com>> wrote: > We could also just give the Module a std::string with the description > and update it in the few places where we actually update it. The > m_arch already has a setter in place that just needs to be used in a > few more places, so the infrastructure is kind of already there (at > least for m_arch). The description would just have its own mutex. > > > Am Fr., 5. Feb. 2021 um 00:39 Uhr schrieb Jorge Gorbe Moya via > lldb-dev <lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org>>: > > > > Wouldn't it be preferable to try_lock in GetDescription (which is the one > > currently acquiring the mutex) instead? ReportError doesn't touch any mutex > > itself and will happily report the rest of the error if GetDescription > > bails out. For the test case I sent it would look like this: > > > > error: {0x0000000b}: invalid abbreviation code 123, please file a bug and > > attach the file at the start of this error message > > error: {0x0000000b}: invalid abbreviation code 123, please file a bug and > > attach the file at the start of this error message > > error: {0x0000000b}: invalid abbreviation code 123, please file a bug and > > attach the file at the start of this error message > > > > which is way better than a deadlock IMO. > > > > > > > > > > On Thu, Feb 4, 2021 at 12:16 PM Pavel Labath <pa...@labath.sk > > <mailto:pa...@labath.sk>> wrote: > >> > >> Please have a look at > >> <https://lists.llvm.org/pipermail/lldb-dev/2020-October/016479.html > >> <https://lists.llvm.org/pipermail/lldb-dev/2020-October/016479.html>>, > >> which is the last time this came up. > >> > >> > >> One quick'n'dirty solution would be to have `Module::ReportError` _try_ > >> to get the module lock, and if it fails, just bail out. That obviously > >> means you won't get to see the error message which triggerred the > >> deadlock (though we could also play around with that and try printing > >> the error message without the module description or something), but it > >> will at least get you past that point... > >> > >> pl > >> > >> On 04/02/2021 21:04, Jorge Gorbe Moya via lldb-dev wrote: > >> > Hi, > >> > > >> > I've found a deadlock in lldb (see attached test case, you can build it > >> > with just `clang -o test test.s`), but I'm a total newbie and I have no > >> > idea what's the right way to fix it. > >> > > >> > The problem happens when an error is found during DIE extraction when > >> > preloading symbols. As far as I can tell, it goes like this: > >> > > >> > 1. Module::PreloadSymbols locks Module::m_mutex > >> > 2. A few layers below it, we end up in ManualDWARFIndex::Index, which > >> > dispatches DIE extractions to a thread pool: > >> > > >> > |for (size_t i = 0; i < units_to_index.size(); ++i) > >> > pool.async(extract_fn, i); pool.wait(); | > >> > > >> > 3. extract_fn in the snippet above ends up executing > >> > DWARFDebugInfoEntry::Extract and when there's an error during > >> > extraction, Module::GetDescription is called while generating the error > >> > message. > >> > 4. Module::GetDescription tries to acquire Module::m_mutex from a > >> > different thread, while the main thread has the mutex already locked and > >> > it's waiting for DIE extraction to end, causing a deadlock. > >> > > >> > If we make Module::GetDescription not lock the problem disappears, so > >> > the diagnosis looks correct, but I don't know what would be the right > >> > way to fix it. Module::GetDescription looks more or less safe to call > >> > without locking: it just prints m_arch, m_file, and m_object_name to a > >> > string, and those look like fields that wouldn't change after the Module > >> > is initialized, so maybe it's okay? But I feel like there must be a > >> > better solution anyway. Any advice? > >> > > >> > Best, > >> > Jorge > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > _______________________________________________ > >> > lldb-dev mailing list > >> > lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org> > >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev > >> > <https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev> > >> > > >> > > _______________________________________________ > > lldb-dev mailing list > > lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev > > <https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev> > _______________________________________________ > lldb-dev mailing list > lldb-dev@lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
_______________________________________________ lldb-dev mailing list lldb-dev@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev