I'm observing the following deadlock:

One thread calls Module::PreloadSymbols() which takes m_mutex of this Module. 
Module::PreloadSymbols()
calls ManualDWARFIndex::Index(), which, in turn, creates thread pool and waits 
for all threads completion:

(gdb)
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7f67f176914c) at 
../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f67f17690c8, 
cond=0x7f67f1769120) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=0x7f67f1769120, mutex=0x7f67f17690c8) at 
pthread_cond_wait.c:638
#3  0x00007f67f3974890 in 
std::condition_variable::wait(std::unique_lock<std::mutex>&) () from 
/lib64/libstdc++.so.6
#4  0x00007f67f4440c4b in 
std::condition_variable::wait<llvm::ThreadPool::wait()::<lambda()> > (__p=..., 
__lock=..., this=0x7f67f1769120)
    at /usr/include/c++/10/condition_variable:108
#5  llvm::ThreadPool::wait (this=this@entry=0x7f67f1769060) at 
source/llvm/lib/Support/ThreadPool.cpp:72
#6  0x00007f67fc6fa3a6 in lldb_private::ManualDWARFIndex::Index 
(this=0x7f66fe87e950)
    at source/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp:94
#7  0x00007f67fc6b3825 in SymbolFileDWARF::PreloadSymbols (this=0x7f67de7af6f0) 
at /usr/include/c++/10/bits/unique_ptr.h:421
#8  0x00007f67fc1ee488 in lldb_private::Module::PreloadSymbols 
(this=0x7f67de79b620) at source/lldb/source/Core/Module.cpp:1397
#9  0x00007f67fc397a37 in lldb_private::Target::GetOrCreateModule 
(this=this@entry=0x96c7a0, module_spec=..., notify=notify@entry=true, 
error_ptr=error_ptr@entry=0x0)
    at /usr/include/c++/10/bits/shared_ptr_base.h:1324
...

OTOH one of pool threads makes an attempt to lock Module's mutex:

(gdb) bt
#0  __lll_lock_wait (futex=futex@entry=0x7f67de79b638, private=0) at 
lowlevellock.c:52
#1  0x00007f67fcd907f1 in __GI___pthread_mutex_lock (mutex=0x7f67de79b638) at 
../nptl/pthread_mutex_lock.c:115
#2  0x00007f67fc1ed922 in __gthread_mutex_lock (__mutex=0x7f67de79b638) at 
/usr/include/c++/10/x86_64-redhat-linux/bits/gthr-default.h:749
#3  __gthread_recursive_mutex_lock (__mutex=0x7f67de79b638) at 
/usr/include/c++/10/x86_64-redhat-linux/bits/gthr-default.h:811
#4  std::recursive_mutex::lock (this=0x7f67de79b638) at 
/usr/include/c++/10/mutex:106
#5  std::lock_guard<std::recursive_mutex>::lock_guard (__m=..., this=<synthetic 
pointer>) at /usr/include/c++/10/bits/std_mutex.h:159
#6  lldb_private::Module::GetDescription (this=this@entry=0x7f67de79b620, 
s=..., level=level@entry=lldb::eDescriptionLevelBrief)
    at source/lldb/source/Core/Module.cpp:1083
#7  0x00007f67fc1f2070 in lldb_private::Module::ReportError (this=0x7f67de79b620, 
format=0x7f67fca03660 "DW_FORM_ref* DIE reference 0x%lx is outside of its CU")
    at source/lldb/include/lldb/Utility/Stream.h:358
#8  0x00007f67fc6adfb4 in DWARFFormValue::Reference 
(this=this@entry=0x7f66f5ff29c0) at 
/usr/include/c++/10/bits/shared_ptr_base.h:1324
#9  0x00007f67fc6aaa77 in DWARFDebugInfoEntry::GetAttributes 
(this=this@entry=0x7f662e3580e0, cu=cu@entry=0x7f66ff6ebad0, attributes=...,
    recurse=recurse@entry=DWARFBaseDIE::Recurse::yes, 
curr_depth=curr_depth@entry=0)
    at source/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp:439
#10 0x00007f67fc6f8f8f in DWARFDebugInfoEntry::GetAttributes 
(recurse=DWARFBaseDIE::Recurse::yes, attrs=..., cu=0x7f66ff6ebad0, 
this=0x7f662e3580e0)
    at source/lldb/source/./Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h:54
#11 lldb_private::ManualDWARFIndex::IndexUnitImpl (unit=..., 
cu_language=cu_language@entry=lldb::eLanguageTypeRust, set=...)
    at source/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp:180
#12 0x00007f67fc6f96b7 in lldb_private::ManualDWARFIndex::IndexUnit 
(this=<optimized out>, unit=..., dwp=0x0, set=...)
    at source/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp:126
...

So this is a deadlock because thread pool is created with module lock held, and 
one (or more,
I'm observing two) pool thread(s) might want to grab the same lock to issue an 
error message.

Commenting out the whole body of Module::GetDescription() makes this deadlock 
disappear.

I'm not an expert in this area, but the whole thing looks like the Module 
object should have more
fine-granted locking rather than the only std::recursive_mutex for all 
synchronization purposes.

Dmitry
_______________________________________________
lldb-dev mailing list
lldb-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

Reply via email to