clayborg added a comment. So I know the mach-o symbol table parsing code is a mess already, but it seems like this patch can be simpler if we make a std::set<lldb:addr_t> at the top of ObjectFileMachO::ParseSymtab() and every time we add a symbol that has a valid address value, add the file addr to this set. This will avoid the need to do any sorting. This std::set can be used to not add LC_FUNCTION_START entries that already have a symbol at the address, and it can be used to only add TrieEntry values that have symbols. Thoughts?
================ Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1924 + // symbol table, and this entry should not be added. + void SymbolAlreadyPresent() { flags = LLDB_INVALID_ADDRESS; } + bool IsSymbolAlreadyPresent() { ---------------- The only bits that are used in this field are: ``` enum { EXPORT_SYMBOL_FLAGS_KIND_MASK = 0x03u, EXPORT_SYMBOL_FLAGS_WEAK_DEFINITION = 0x04u, EXPORT_SYMBOL_FLAGS_REEXPORT = 0x08u, EXPORT_SYMBOL_FLAGS_STUB_AND_RESOLVER = 0x10u }; enum ExportSymbolKind { EXPORT_SYMBOL_FLAGS_KIND_REGULAR = 0x00u, EXPORT_SYMBOL_FLAGS_KIND_THREAD_LOCAL = 0x01u, EXPORT_SYMBOL_FLAGS_KIND_ABSOLUTE = 0x02u }; ``` So why not just set the highest bit and avoid clobbering all of the other flags?: ``` #define EXPORT_SYMBOL_FLAGS_LLDB_ALREADY_PRESENT 1ull << 63 void SymbolAlreadyPresent() { flags |= EXPORT_SYMBOL_FLAGS_LLDB_ALREADY_PRESENT; } ``` ================ Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1926 + bool IsSymbolAlreadyPresent() { + if (flags == LLDB_INVALID_ADDRESS) + return true; ---------------- jingham wrote: > JDevlieghere wrote: > > `return flags == LLDB_INVALID_ADDRESS` > This sort of change has the downside that you can't break when flags == > LLDB_INVALID_ADDRESS without adding a condition. That seems in this case > like a condition you are likely to want to break on, and given this looks > like a function that gets called a lot, it's probably good to pay the cost of > a condition. > > I'm not sure I'm in favor of this sort of rewrite. It just saves one line, > and isn't particularly easier to read. Does it have some benefit I'm > missing? So is "flags" initially used just after parsing, but before we mark TrieEntry values as already present? These flags mean something when we first parse them. I would rather add an extra bool entry to this structure, since they don't stay around for long. ================ Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1926 + bool IsSymbolAlreadyPresent() { + if (flags == LLDB_INVALID_ADDRESS) + return true; ---------------- clayborg wrote: > jingham wrote: > > JDevlieghere wrote: > > > `return flags == LLDB_INVALID_ADDRESS` > > This sort of change has the downside that you can't break when flags == > > LLDB_INVALID_ADDRESS without adding a condition. That seems in this case > > like a condition you are likely to want to break on, and given this looks > > like a function that gets called a lot, it's probably good to pay the cost > > of a condition. > > > > I'm not sure I'm in favor of this sort of rewrite. It just saves one line, > > and isn't particularly easier to read. Does it have some benefit I'm > > missing? > So is "flags" initially used just after parsing, but before we mark TrieEntry > values as already present? These flags mean something when we first parse > them. I would rather add an extra bool entry to this structure, since they > don't stay around for long. ``` return (flags & EXPORT_SYMBOL_FLAGS_LLDB_ALREADY_PRESENT) != 0; ``` ================ Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:1965 + // Terminal node -- end of a branch, possibly add this to + // the symbol table or resoler table. const uint64_t terminalSize = data.GetULEB128(&offset); ---------------- s/resoler/resolver/ ================ Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:2503-2516 ConstString text_segment_name("__TEXT"); SectionSP text_segment_sp = GetSectionList()->FindSectionByName(text_segment_name); if (text_segment_sp) { const lldb::addr_t text_segment_file_addr = text_segment_sp->GetFileAddress(); if (text_segment_file_addr != LLDB_INVALID_ADDRESS) { ---------------- Maybe we should run this before ParseTrieEntries and pass text_segment_file_addr in as an argument and add this to the offset as we parse the TrieEntry objects? ================ Comment at: lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:4490 + // Look through the current symbol table, if any symbols match + // an entry in the external_sym_trie_entries vector, mark it ---------------- Might be easier to create a std::set<lldb::addr_t> at the top of the ObjectFileMachO::ParseSymtab() function. Anyone who adds a symbol, would add the file address to that set. Then we wouldn't have to sort these entries, we would just iterator through them. We must do something like this for LC_FUNCTION_STARTS already right? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76758/new/ https://reviews.llvm.org/D76758 _______________________________________________ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits