A simpler solution is to remove the lock in Module::GetDescription(...) all of 
the data that is used to dump the module description is set correctly (the 
file, the arch and the object name for BSD archives)) very early on when a 
module is created before anyone should be making symbol or debug info queries 
or parsing.


> On Feb 5, 2021, at 1:24 PM, Jorge Gorbe Moya via lldb-dev 
> <lldb-dev@lists.llvm.org> wrote:
> 
> I just started looking at how to do this (having a separate mutex for the 
> description) and I think I found another bug. Or maybe I'm missing some 
> assumption.
> 
> On one hand, Module::SetArchitecture won't assign the new value if m_arch is 
> already valid, just return m_arch.IsCompatibleWith(new_arch). On the other 
> hand, in Module::MergeArchitecture we have both code that assumes that 
> SetArchitecture replaces the previous value:
> 
>   if (!m_arch 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Core/Module.h;rcl=355613052;l=939>.IsCompatibleMatch
>  
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Utility/ArchSpec.cpp;rcl=355613052;l=934>(arch_spec
>  
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1620?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>))
>  {
>     // The new architecture is different, we just need to replace it.
>     return SetArchitecture 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1544?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>(arch_spec
>  
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1620?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>);
>   }
> 
> and right after it we have code that works around the fact that 
> SetArchitecture doesn't replace the previous value sometimes
> 
>   // Merge bits from arch_spec into "merged_arch" and set our architecture.
>   ArchSpec 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Utility/ArchSpec.h;rcl=355613052;l=33>
>  merged_arch 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;bpv=1;bpt=1;l=1633?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental&gsn=merged_arch&gs=kythe%3A%2F%2Fgoogle3%3Flang%3Dc%252B%252B%3Fpath%3Dthird_party%2Fllvm%2Fllvm-project%2Flldb%2Fsource%2FCore%2FModule.cpp%23f1bDLAwnC-gD__Xt6Bs-4h5T-ePUhWBq8L6dQxa8QEA>(m_arch
>  
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Core/Module.h;rcl=355613052;l=939>);
>   merged_arch 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1633?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>.MergeFrom
>  
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Utility/ArchSpec.cpp;rcl=355613052;l=801>(arch_spec
>  
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1620?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>);
>   // SetArchitecture() is a no-op if m_arch is already valid.
>   m_arch 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Core/Module.h;rcl=355613052;l=939>
>  = 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/include/lldb/Utility/ArchSpec.h;rcl=355613052;l=33>
>  ArchSpec 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Utility/ArchSpec.cpp;rcl=355613052;l=510>();
>   return SetArchitecture 
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1544?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>(merged_arch
>  
> <https://source.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/lldb/source/Core/Module.cpp;l=1633?q=IsCompatibleMatch%20lldb&ct=os&sq=package:piper%20file:%2F%2Fdepot%2Fgoogle3%20-file:google3%2Fexperimental>);
> 
> My guess is that the first of the snippets above has a bug (it intends to 
> replace the architecture but doesn't) and that replacing a module's 
> architecture with an incompatible one is something that doesn't happen often 
> and the bug went unnoticed. Does this make sense?
> 
> On Thu, Feb 4, 2021 at 5:38 PM Raphael Isemann <teempe...@gmail.com 
> <mailto:teempe...@gmail.com>> wrote:
> We could also just give the Module a std::string with the description
> and update it in the few places where we actually update it. The
> m_arch already has a setter in place that just needs to be used in a
> few more places, so the infrastructure is kind of already there (at
> least for m_arch). The description would just have its own mutex.
> 
> 
> Am Fr., 5. Feb. 2021 um 00:39 Uhr schrieb Jorge Gorbe Moya via
> lldb-dev <lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org>>:
> >
> > Wouldn't it be preferable to try_lock in GetDescription (which is the one 
> > currently acquiring the mutex) instead? ReportError doesn't touch any mutex 
> > itself and will happily report the rest of the error if GetDescription 
> > bails out. For the test case I sent it would look like this:
> >
> > error:  {0x0000000b}: invalid abbreviation code 123, please file a bug and 
> > attach the file at the start of this error message
> > error:  {0x0000000b}: invalid abbreviation code 123, please file a bug and 
> > attach the file at the start of this error message
> > error:  {0x0000000b}: invalid abbreviation code 123, please file a bug and 
> > attach the file at the start of this error message
> >
> > which is way better than a deadlock IMO.
> >
> >
> >
> >
> > On Thu, Feb 4, 2021 at 12:16 PM Pavel Labath <pa...@labath.sk 
> > <mailto:pa...@labath.sk>> wrote:
> >>
> >> Please have a look at
> >> <https://lists.llvm.org/pipermail/lldb-dev/2020-October/016479.html 
> >> <https://lists.llvm.org/pipermail/lldb-dev/2020-October/016479.html>>,
> >> which is the last time this came up.
> >>
> >>
> >> One quick'n'dirty solution would be to have `Module::ReportError` _try_
> >> to get the module lock, and if it fails, just bail out. That obviously
> >> means you won't get to see the error message which triggerred the
> >> deadlock (though we could also play around with that and try printing
> >> the error message without the module description or something), but it
> >> will at least get you past that point...
> >>
> >> pl
> >>
> >> On 04/02/2021 21:04, Jorge Gorbe Moya via lldb-dev wrote:
> >> > Hi,
> >> >
> >> > I've found a deadlock in lldb (see attached test case, you can build it
> >> > with just `clang -o test test.s`), but I'm a total newbie and I have no
> >> > idea what's the right way to fix it.
> >> >
> >> > The problem happens when an error is found during DIE extraction when
> >> > preloading symbols. As far as I can tell, it goes like this:
> >> >
> >> > 1. Module::PreloadSymbols locks Module::m_mutex
> >> > 2. A few layers below it, we end up in ManualDWARFIndex::Index, which
> >> > dispatches DIE extractions to a thread pool:
> >> >
> >> > |for (size_t i = 0; i < units_to_index.size(); ++i)
> >> > pool.async(extract_fn, i); pool.wait(); |
> >> >
> >> > 3. extract_fn in the snippet above ends up executing
> >> > DWARFDebugInfoEntry::Extract and when there's an error during
> >> > extraction, Module::GetDescription is called while generating the error
> >> > message.
> >> > 4. Module::GetDescription tries to acquire Module::m_mutex from a
> >> > different thread, while the main thread has the mutex already locked and
> >> > it's waiting for DIE extraction to end, causing a deadlock.
> >> >
> >> > If we make Module::GetDescription not lock the problem disappears, so
> >> > the diagnosis looks correct, but I don't know what would be the right
> >> > way to fix it. Module::GetDescription looks more or less safe to call
> >> > without locking: it just prints m_arch, m_file, and m_object_name to a
> >> > string, and those look like fields that wouldn't change after the Module
> >> > is initialized, so maybe it's okay? But I feel like there must be a
> >> > better solution anyway. Any advice?
> >> >
> >> > Best,
> >> > Jorge
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > _______________________________________________
> >> > lldb-dev mailing list
> >> > lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org>
> >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev 
> >> > <https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev>
> >> >
> >>
> > _______________________________________________
> > lldb-dev mailing list
> > lldb-dev@lists.llvm.org <mailto:lldb-dev@lists.llvm.org>
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev 
> > <https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev>
> _______________________________________________
> lldb-dev mailing list
> lldb-dev@lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

_______________________________________________
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

Reply via email to