Author: Tom Yang Date: 2025-03-31T13:29:31-07:00 New Revision: a8d2d169c7add4b0106ae76e186cf815c0b84825
URL: https://github.com/llvm/llvm-project/commit/a8d2d169c7add4b0106ae76e186cf815c0b84825 DIFF: https://github.com/llvm/llvm-project/commit/a8d2d169c7add4b0106ae76e186cf815c0b84825.diff LOG: Parallelize module loading in POSIX dyld code (#130912) This patch improves LLDB launch time on Linux machines for **preload scenarios**, particularly for executables with a lot of shared library dependencies (or modules). Specifically: * Launching a binary with `target.preload-symbols = true` * Attaching to a process with `target.preload-symbols = true`. It's completely controlled by a new flag added in the first commit `plugin.dynamic-loader.posix-dyld.parallel-module-load`, which *defaults to false*. This was inspired by similar work on Darwin #110646. Some rough numbers to showcase perf improvement, run on a very beefy machine: * Executable with ~5600 modules: baseline 45s, improvement 15s * Executable with ~3800 modules: baseline 25s, improvement 10s * Executable with ~6650 modules: baseline 67s, improvement 20s * Executable with ~12500 modules: baseline 185s, improvement 85s * Executable with ~14700 modules: baseline 235s, improvement 120s A lot of targets we deal with have a *ton* of modules, and unfortunately we're unable to convince other folks to reduce the number of modules, so performance improvements like this can be very impactful for user experience. This patch achieves the performance improvement by parallelizing `DynamicLoaderPOSIXDYLD::RefreshModules` for the launch scenario, and `DynamicLoaderPOSIXDYLD::LoadAllCurrentModules` for the attach scenario. The commits have some context on their specific changes as well -- hopefully this helps the review. # More context on implementation We discovered the bottlenecks by via `perf record -g -p <lldb's pid>` on a Linux machine. With an executable known to have 1000s of shared library dependencies, I ran ``` (lldb) b main (lldb) r # taking a while ``` and showed the resulting perf trace (snippet shown) ``` Samples: 85K of event 'cycles:P', Event count (approx.): 54615855812 Children Self Command Shared Object Symbol - 93.54% 0.00% intern-state libc.so.6 [.] clone3 clone3 start_thread lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) r std::_Function_handler<void* (), lldb_private::Process::StartPrivateStateThread(bool)::$_0>::_M_invoke(std::_Any_data const&) lldb_private::Process::RunPrivateStateThread(bool) n - lldb_private::Process::HandlePrivateEvent(std::shared_ptr<lldb_private::Event>&) - 93.54% lldb_private::Process::ShouldBroadcastEvent(lldb_private::Event*) - 93.54% lldb_private::ThreadList::ShouldStop(lldb_private::Event*) - lldb_private::Thread::ShouldStop(lldb_private::Event*) * - 93.53% lldb_private::StopInfoBreakpoint::ShouldStopSynchronous(lldb_private::Event*) t - 93.52% lldb_private::BreakpointSite::ShouldStop(lldb_private::StoppointCallbackContext*) i lldb_private::BreakpointLocationCollection::ShouldStop(lldb_private::StoppointCallbackContext*) k lldb_private::BreakpointLocation::ShouldStop(lldb_private::StoppointCallbackContext*) b lldb_private::BreakpointOptions::InvokeCallback(lldb_private::StoppointCallbackContext*, unsigned long, unsigned long) i DynamicLoaderPOSIXDYLD::RendezvousBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long, unsigned lo - DynamicLoaderPOSIXDYLD::RefreshModules() O - 93.42% DynamicLoaderPOSIXDYLD::RefreshModules()::$_0::operator()(DYLDRendezvous::SOEntry const&) const u - 93.40% DynamicLoaderPOSIXDYLD::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, bools - lldb_private::DynamicLoader::LoadModuleAtAddress(lldb_private::FileSpec const&, unsigned long, unsigned long, boos - 83.90% lldb_private::DynamicLoader::FindModuleViaTarget(lldb_private::FileSpec const&) o - 83.01% lldb_private::Target::GetOrCreateModule(lldb_private::ModuleSpec const&, bool, lldb_private::Status* - 77.89% lldb_private::Module::PreloadSymbols() - 44.06% lldb_private::Symtab::PreloadSymbols() - 43.66% lldb_private::Symtab::InitNameIndexes() ... ``` We saw that majority of time was spent in `RefreshModules`, with the main culprit within it `LoadModuleAtAddress` which eventually calls `PreloadSymbols`. At first, `DynamicLoaderPOSIXDYLD::LoadModuleAtAddress` appears fairly independent -- most of it deals with different files and then getting or creating Modules from these files. The portions that aren't independent seem to deal with ModuleLists, which appear concurrency safe. There were members of `DynamicLoaderPOSIXDYLD` I had to synchronize though: namely `m_loaded_modules` which `DynamicLoaderPOSIXDYLD` maintains to map its loaded modules to their link addresses. Without synchronizing this, I ran into SEGFAULTS and other issues when running `check-lldb`. I also locked the assignment and comparison of `m_interpreter_module`, which may be unnecessary. # Alternate implementations When creating this patch, another implementation I considered was directly background-ing the call to `Module::PreloadSymbol` in `Target::GetOrCreateModule`. It would have the added benefit of working across platforms generically, and appeared to be concurrency safe. It was done via `Debugger::GetThreadPool().async` directly. However, there were a ton of concurrency issues, so I abandoned that approach for now. # Testing With the feature active, I tested via `ninja check-lldb` on both Debug and Release builds several times (~5 or 6 altogether?), and didn't spot additional failing or flaky tests. I also tested manually on several different binaries, some with around 14000 modules, but just basic operations: launching, reaching main, setting breakpoint, stepping, showing some backtraces. I've also tested with the flag off just to make sure things behave properly synchronously. Added: Modified: lldb/include/lldb/Target/Target.h lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h lldb/source/Target/Target.cpp lldb/source/Target/TargetProperties.td Removed: ################################################################################ diff --git a/lldb/include/lldb/Target/Target.h b/lldb/include/lldb/Target/Target.h index 3cdbe9221a0bc..29183cc267721 100644 --- a/lldb/include/lldb/Target/Target.h +++ b/lldb/include/lldb/Target/Target.h @@ -118,6 +118,8 @@ class TargetProperties : public Properties { llvm::StringRef GetLaunchWorkingDirectory() const; + bool GetParallelModuleLoad() const; + const char *GetDisassemblyFlavor() const; const char *GetDisassemblyCPU() const; diff --git a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp index 53ba11ac21bd3..326b6910b5267 100644 --- a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp +++ b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.cpp @@ -10,6 +10,7 @@ #include "DynamicLoaderPOSIXDYLD.h" #include "lldb/Breakpoint/BreakpointLocation.h" +#include "lldb/Core/Debugger.h" #include "lldb/Core/Module.h" #include "lldb/Core/ModuleSpec.h" #include "lldb/Core/PluginManager.h" @@ -25,6 +26,7 @@ #include "lldb/Utility/LLDBLog.h" #include "lldb/Utility/Log.h" #include "lldb/Utility/ProcessInfo.h" +#include "llvm/Support/ThreadPool.h" #include <memory> #include <optional> @@ -184,16 +186,37 @@ void DynamicLoaderPOSIXDYLD::DidLaunch() { Status DynamicLoaderPOSIXDYLD::CanLoadImage() { return Status(); } +void DynamicLoaderPOSIXDYLD::SetLoadedModule(const ModuleSP &module_sp, + addr_t link_map_addr) { + llvm::sys::ScopedWriter lock(m_loaded_modules_rw_mutex); + m_loaded_modules[module_sp] = link_map_addr; +} + +void DynamicLoaderPOSIXDYLD::UnloadModule(const ModuleSP &module_sp) { + llvm::sys::ScopedWriter lock(m_loaded_modules_rw_mutex); + m_loaded_modules.erase(module_sp); +} + +std::optional<lldb::addr_t> +DynamicLoaderPOSIXDYLD::GetLoadedModuleLinkAddr(const ModuleSP &module_sp) { + llvm::sys::ScopedReader lock(m_loaded_modules_rw_mutex); + auto it = m_loaded_modules.find(module_sp); + if (it != m_loaded_modules.end()) + return it->second; + return std::nullopt; +} + void DynamicLoaderPOSIXDYLD::UpdateLoadedSections(ModuleSP module, addr_t link_map_addr, addr_t base_addr, bool base_addr_is_offset) { - m_loaded_modules[module] = link_map_addr; + SetLoadedModule(module, link_map_addr); + UpdateLoadedSectionsCommon(module, base_addr, base_addr_is_offset); } void DynamicLoaderPOSIXDYLD::UnloadSections(const ModuleSP module) { - m_loaded_modules.erase(module); + UnloadModule(module); UnloadSectionsCommon(module); } @@ -401,7 +424,7 @@ void DynamicLoaderPOSIXDYLD::RefreshModules() { // The rendezvous class doesn't enumerate the main module, so track that // ourselves here. ModuleSP executable = GetTargetExecutable(); - m_loaded_modules[executable] = m_rendezvous.GetLinkMapAddress(); + SetLoadedModule(executable, m_rendezvous.GetLinkMapAddress()); DYLDRendezvous::iterator I; DYLDRendezvous::iterator E; @@ -423,34 +446,70 @@ void DynamicLoaderPOSIXDYLD::RefreshModules() { E = m_rendezvous.end(); m_initial_modules_added = true; } - for (; I != E; ++I) { - // Don't load a duplicate copy of ld.so if we have already loaded it - // earlier in LoadInterpreterModule. If we instead loaded then unloaded it - // later, the section information for ld.so would be removed. That - // information is required for placing breakpoints on Arm/Thumb systems. - if ((m_interpreter_module.lock() != nullptr) && - (I->base_addr == m_interpreter_base)) - continue; - - ModuleSP module_sp = - LoadModuleAtAddress(I->file_spec, I->link_addr, I->base_addr, true); - if (!module_sp.get()) - continue; - - if (module_sp->GetObjectFile()->GetBaseAddress().GetLoadAddress( - &m_process->GetTarget()) == m_interpreter_base) { - ModuleSP interpreter_sp = m_interpreter_module.lock(); - if (m_interpreter_module.lock() == nullptr) { - m_interpreter_module = module_sp; - } else if (module_sp == interpreter_sp) { - // Module already loaded. - continue; - } - } - loaded_modules.AppendIfNeeded(module_sp); - new_modules.Append(module_sp); + // Synchronize reading and writing of `m_interpreter_module`. + std::mutex interpreter_module_mutex; + // We should be able to take SOEntry as reference since the data + // exists for the duration of this call in `m_rendezvous`. + auto load_module_fn = + [this, &loaded_modules, &new_modules, + &interpreter_module_mutex](const DYLDRendezvous::SOEntry &so_entry) { + // Don't load a duplicate copy of ld.so if we have already loaded it + // earlier in LoadInterpreterModule. If we instead loaded then + // unloaded it later, the section information for ld.so would be + // removed. That information is required for placing breakpoints on + // Arm/Thumb systems. + { + // `m_interpreter_module` may be modified by another thread at the + // same time, so we guard the access here. + std::lock_guard<std::mutex> lock(interpreter_module_mutex); + if ((m_interpreter_module.lock() != nullptr) && + (so_entry.base_addr == m_interpreter_base)) + return; + } + + ModuleSP module_sp = LoadModuleAtAddress( + so_entry.file_spec, so_entry.link_addr, so_entry.base_addr, true); + if (!module_sp.get()) + return; + + { + // `m_interpreter_module` may be modified by another thread at the + // same time, so we guard the access here. + std::lock_guard<std::mutex> lock(interpreter_module_mutex); + // Set the interpreter module, if this is the interpreter. + if (module_sp->GetObjectFile()->GetBaseAddress().GetLoadAddress( + &m_process->GetTarget()) == m_interpreter_base) { + ModuleSP interpreter_sp = m_interpreter_module.lock(); + if (m_interpreter_module.lock() == nullptr) { + m_interpreter_module = module_sp; + } else if (module_sp == interpreter_sp) { + // Module already loaded. + return; + } + } + } + + // Note: in a multi-threaded environment, these module lists may be + // appended to out-of-order. This is fine, since there's no + // expectation for `loaded_modules` or `new_modules` to be in any + // particular order, and appending to each module list is thread-safe. + // Also, `new_modules` is only used for the `ModulesDidLoad` call at + // the end of this function. + loaded_modules.AppendIfNeeded(module_sp); + new_modules.Append(module_sp); + }; + + if (m_process->GetTarget().GetParallelModuleLoad()) { + llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool()); + for (; I != E; ++I) + task_group.async(load_module_fn, *I); + task_group.wait(); + } else { + for (; I != E; ++I) + load_module_fn(*I); } + m_process->GetTarget().ModulesDidLoad(new_modules); } @@ -636,7 +695,7 @@ void DynamicLoaderPOSIXDYLD::LoadAllCurrentModules() { // The rendezvous class doesn't enumerate the main module, so track that // ourselves here. ModuleSP executable = GetTargetExecutable(); - m_loaded_modules[executable] = m_rendezvous.GetLinkMapAddress(); + SetLoadedModule(executable, m_rendezvous.GetLinkMapAddress()); std::vector<FileSpec> module_names; for (I = m_rendezvous.begin(), E = m_rendezvous.end(); I != E; ++I) @@ -644,19 +703,31 @@ void DynamicLoaderPOSIXDYLD::LoadAllCurrentModules() { m_process->PrefetchModuleSpecs( module_names, m_process->GetTarget().GetArchitecture().GetTriple()); - for (I = m_rendezvous.begin(), E = m_rendezvous.end(); I != E; ++I) { - ModuleSP module_sp = - LoadModuleAtAddress(I->file_spec, I->link_addr, I->base_addr, true); + auto load_module_fn = [this, &module_list, + &log](const DYLDRendezvous::SOEntry &so_entry) { + ModuleSP module_sp = LoadModuleAtAddress( + so_entry.file_spec, so_entry.link_addr, so_entry.base_addr, true); if (module_sp.get()) { LLDB_LOG(log, "LoadAllCurrentModules loading module: {0}", - I->file_spec.GetFilename()); + so_entry.file_spec.GetFilename()); module_list.Append(module_sp); } else { Log *log = GetLog(LLDBLog::DynamicLoader); LLDB_LOGF( log, "DynamicLoaderPOSIXDYLD::%s failed loading module %s at 0x%" PRIx64, - __FUNCTION__, I->file_spec.GetPath().c_str(), I->base_addr); + __FUNCTION__, so_entry.file_spec.GetPath().c_str(), + so_entry.base_addr); + } + }; + if (m_process->GetTarget().GetParallelModuleLoad()) { + llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool()); + for (I = m_rendezvous.begin(), E = m_rendezvous.end(); I != E; ++I) + task_group.async(load_module_fn, *I); + task_group.wait(); + } else { + for (I = m_rendezvous.begin(), E = m_rendezvous.end(); I != E; ++I) { + load_module_fn(*I); } } @@ -728,15 +799,15 @@ DynamicLoaderPOSIXDYLD::GetThreadLocalData(const lldb::ModuleSP module_sp, const lldb::ThreadSP thread, lldb::addr_t tls_file_addr) { Log *log = GetLog(LLDBLog::DynamicLoader); - auto it = m_loaded_modules.find(module_sp); - if (it == m_loaded_modules.end()) { + std::optional<addr_t> link_map_addr_opt = GetLoadedModuleLinkAddr(module_sp); + if (!link_map_addr_opt.has_value()) { LLDB_LOGF( log, "GetThreadLocalData error: module(%s) not found in loaded modules", module_sp->GetObjectName().AsCString()); return LLDB_INVALID_ADDRESS; } - addr_t link_map = it->second; + addr_t link_map = link_map_addr_opt.value(); if (link_map == LLDB_INVALID_ADDRESS || link_map == 0) { LLDB_LOGF(log, "GetThreadLocalData error: invalid link map address=0x%" PRIx64, diff --git a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h index bde334aaca40b..6efb92673a13c 100644 --- a/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h +++ b/lldb/source/Plugins/DynamicLoader/POSIX-DYLD/DynamicLoaderPOSIXDYLD.h @@ -93,10 +93,6 @@ class DynamicLoaderPOSIXDYLD : public lldb_private::DynamicLoader { /// Contains the pointer to the interpret module, if loaded. std::weak_ptr<lldb_private::Module> m_interpreter_module; - /// Loaded module list. (link map for each module) - std::map<lldb::ModuleWP, lldb::addr_t, std::owner_less<lldb::ModuleWP>> - m_loaded_modules; - /// Returns true if the process is for a core file. bool IsCoreFile() const; @@ -180,6 +176,19 @@ class DynamicLoaderPOSIXDYLD : public lldb_private::DynamicLoader { DynamicLoaderPOSIXDYLD(const DynamicLoaderPOSIXDYLD &) = delete; const DynamicLoaderPOSIXDYLD & operator=(const DynamicLoaderPOSIXDYLD &) = delete; + + /// Loaded module list. (link map for each module) + /// This may be accessed in a multi-threaded context. Use the accessor methods + /// to access `m_loaded_modules` safely. + std::map<lldb::ModuleWP, lldb::addr_t, std::owner_less<lldb::ModuleWP>> + m_loaded_modules; + llvm::sys::RWMutex m_loaded_modules_rw_mutex; + + void SetLoadedModule(const lldb::ModuleSP &module_sp, + lldb::addr_t link_map_addr); + void UnloadModule(const lldb::ModuleSP &module_sp); + std::optional<lldb::addr_t> + GetLoadedModuleLinkAddr(const lldb::ModuleSP &module_sp); }; #endif // LLDB_SOURCE_PLUGINS_DYNAMICLOADER_POSIX_DYLD_DYNAMICLOADERPOSIXDYLD_H diff --git a/lldb/source/Target/Target.cpp b/lldb/source/Target/Target.cpp index c26bca546891e..09c0c0b8a5db0 100644 --- a/lldb/source/Target/Target.cpp +++ b/lldb/source/Target/Target.cpp @@ -4488,6 +4488,12 @@ llvm::StringRef TargetProperties::GetLaunchWorkingDirectory() const { idx, g_target_properties[idx].default_cstr_value); } +bool TargetProperties::GetParallelModuleLoad() const { + const uint32_t idx = ePropertyParallelModuleLoad; + return GetPropertyAtIndexAs<bool>( + idx, g_target_properties[idx].default_uint_value != 0); +} + const char *TargetProperties::GetDisassemblyFlavor() const { const uint32_t idx = ePropertyDisassemblyFlavor; const char *return_value; diff --git a/lldb/source/Target/TargetProperties.td b/lldb/source/Target/TargetProperties.td index 38a345dfd8849..3940ac00a2bd9 100644 --- a/lldb/source/Target/TargetProperties.td +++ b/lldb/source/Target/TargetProperties.td @@ -217,6 +217,9 @@ let Definition = "target" in { "launched. If you change this setting, the new value will only apply to " "subsequent launches. Commands that take an explicit working directory " "will override this setting.">; + def ParallelModuleLoad: Property<"parallel-module-load", "Boolean">, + DefaultTrue, + Desc<"Enable loading of modules in parallel for the dynamic loader.">; } let Definition = "process_experimental" in { _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits