Please do submit this for review, yes. > On May 11, 2016, at 1:27 PM, Philippe Lavoie <philippe.lav...@octasic.com> > wrote: > > Thanks for the explanation and the patch. > > I tried it and the exception is gone. > The debug info also seems complete. As complete as can be for an LTO build... > > I attached a modified patch with the calls to ExtractDIEsIfNeeded() on > separate threads. > > Should I submit it for review ? > > -Philippe > > From: Greg Clayton [gclay...@apple.com] > Sent: Tuesday, May 03, 2016 5:02 PM > To: Philippe Lavoie > Cc: lldb-dev@lists.llvm.org > Subject: Re: [lldb-dev] LTO debug info > > > > On May 3, 2016, at 6:35 AM, Philippe Lavoie <philippe.lav...@octasic.com> > > wrote: > > > > The code accessing another CU while indexing is in DWARFCompileUnit::GetDIE. > > If the DIE is not in the current CU, it finds the CU that has it and calls > > its ::GetDIE method, accessing its m_die_array data... > > > > So the question is: is it a producer (CU should be self-contained) or a > > consumer (the indexing) bug ? > > The indexing needs to be fixed. We probably need to start checking the > DW_FORM of an attribute before we try to get the DIE for it. So the problem > seems to be in DWARFCompileUnit::IndexPrivate() where when the tag is a > DW_TAG_subprogram it eventually does: > > if (specification_die_form.IsValid()) > { > DWARFDIE specification_die = > dwarf_cu->GetSymbolFileDWARF()->DebugInfo()->GetDIE > (DIERef(specification_die_form)); > if (specification_die.GetParent().IsStructOrClass()) > is_method = true; > } > > This is the only call to GetDIE in the entire function. It is getting the DIE > so it can look to see who the parent it to see if the parent of the function > is a struct/class, which means it is a method, or anything else and the > function is just the basename of a raw function. So it seems like this should > be OK to access other DIEs as it isn't adding any information from the other > DWARF file, it is just using it to check who the parent of function > (DW_TAG_subprogram) is. > > So the real problem is that "ClearDIEs()" is being called too soon before > everyone is done using the DIEs. The reason this was being done is that you > might have 10000 compile units and we want to partially parse the the DWARF > and only use what we need, but if we index everything we want to load the > DIEs for a compile unit and clear them after indexing if they weren't loaded > before we did the index so we don't take up loads of memory. > > So what we need to do to fix this is: > > 1 - Run through all of the compile units first and see if each CU has the > DIEs parsed and remember this > 2 - Index all compile units > 3 - Clear any DIEs in any compile units that didn't have their DIEs parsed > _after_ all compile units have been indexed > > A proposed patch is attached: > > > > the main issue is now all DWARF is parsed on the current thread. So this > patch should be updated to run all of the ExtractDIEsIfNeeded() functions on > separate threads since this can be time consuming: > > > //---------------------------------------------------------------------- > // First figure out which compile units didn't have their DIEs already > // parsed and remember this. If no DIEs were parsed prior to this > index > // function call, we are going to want to clear the CU dies after we > // are done indexing to make sure we don't pull in all DWARF dies, but > // we need to wait until all compile units have been indexed in case > // a DIE in one compile unit refers to another and the indexes > accesses > // those DIEs. > > //---------------------------------------------------------------------- > for (uint32_t cu_idx = 0; cu_idx < num_compile_units; ++cu_idx) > { > DWARFCompileUnit* dwarf_cu = > debug_info->GetCompileUnitAtIndex(cu_idx); > if (dwarf_cu) > { > // dwarf_cu->ExtractDIEsIfNeeded(false) will return zero if > the > // DIEs for a compile unit have already been parsed. > clear_cu_dies[cu_idx] = dwarf_cu->ExtractDIEsIfNeeded(false) > > 1; > } > } > > > > > //---------------------------------------------------------------------- > > // GetDIE() > > // > > // Get the DIE (Debug Information Entry) with the specified offset by > > // first checking if the DIE is contained within this compile unit and > > // grabbing the DIE from this compile unit. Otherwise we grab the DIE > > // from the DWARF file. > > //---------------------------------------------------------------------- > > DWARFDIE > > DWARFCompileUnit::GetDIE (dw_offset_t die_offset) > > { > > if (die_offset != DW_INVALID_OFFSET) > > { > > if (m_dwo_symbol_file) > > return m_dwo_symbol_file->GetCompileUnit()->GetDIE(die_offset); > > > > if (ContainsDIEOffset(die_offset)) > > { > > ExtractDIEsIfNeeded (false); > > DWARFDebugInfoEntry::iterator end = m_die_array.end(); > > DWARFDebugInfoEntry::iterator pos = > > lower_bound(m_die_array.begin(), end, die_offset, CompareDIEOffset); > > if (pos != end) > > { > > if (die_offset == (*pos).GetOffset()) > > return DWARFDIE(this, &(*pos)); > > } > > } > > else > > { > > // Don't specify the compile unit offset as we don't know it > > because the DIE belongs to > > // a different compile unit in the same symbol file. > > return m_dwarf2Data->DebugInfo()->GetDIEForDIEOffset(die_offset); > > } > > } > > return DWARFDIE(); // Not found > > } > > > > ________________________________________ > > From: lldb-dev [lldb-dev-boun...@lists.llvm.org] on behalf of Philippe > > Lavoie via lldb-dev [lldb-dev@lists.llvm.org] > > Sent: Friday, April 29, 2016 3:27 PM > > To: Greg Clayton > > Cc: lldb-dev@lists.llvm.org > > Subject: Re: [lldb-dev] LTO debug info > > > >> So the main question is why is anything that is indexing the current CU > >> only, accessing anything from another CU? It can't and shouldn't. That is > >> the bug. > > > > Indexing is accessing another CU because there is a reference (offset) to > > another CU. > > > > For example, in the 2nd compile unit below, the DW_AT_type of > > DW_TAG_formal_parameter at offset 0x1772 is referencing the 1st compile > > unit (offset 0x1741). > > > > Compilation Unit @ offset 0x16f9: > > Length: 0x4c (32-bit) > > Version: 4 > > Abbrev Offset: 0x763 > > Pointer Size: 4 > > <0><1704>: Abbrev Number: 1 (DW_TAG_compile_unit) > > <1705> DW_AT_producer : (indirect string, offset: 0x0): clang > > version 3.9.0 (trunk) > > <1709> DW_AT_language : 12 (ANSI C99) > > <170b> DW_AT_name : (indirect string, offset: 0x4be): main.c > > <170f> DW_AT_stmt_list : 0x1329 > > <1713> DW_AT_comp_dir : (indirect string, offset: 0x4c5): > > C:\Users\phlav\Documents\opus studio 2013\Projects\OpusProject9 > > <1717> Unknown AT value: 3fe1: 1 > > <1717> DW_AT_low_pc : 0x400000 > > <171b> DW_AT_high_pc : 0x8 > > <1><171f>: Abbrev Number: 2 (DW_TAG_subprogram) > > <1720> DW_AT_low_pc : 0x400000 > > <1724> DW_AT_high_pc : 0x8 > > <1728> Unknown AT value: 3fe7: 1 > > <1728> DW_AT_frame_base : 1 byte block: 51 (DW_OP_reg1 (r1)) > > <172a> DW_AT_name : (indirect string, offset: 0xab): main > > <172e> DW_AT_decl_file : 1 > > <172f> DW_AT_decl_line : 4 > > <1730> DW_AT_type : <0x1741> > > <1734> DW_AT_external : 1 > > <1734> Unknown AT value: 3fe1: 1 > > <2><1734>: Abbrev Number: 3 (DW_TAG_variable) > > <1735> DW_AT_const_value : 55 > > <1736> DW_AT_name : (indirect string, offset: 0xd9): x > > <173a> DW_AT_decl_file : 1 > > <173b> DW_AT_decl_line : 7 > > <173c> DW_AT_type : <0x1741> > > <1><1741>: Abbrev Number: 4 (DW_TAG_base_type) > > <1742> DW_AT_name : (indirect string, offset: 0x8f): int > > <1746> DW_AT_encoding : 5 (signed) > > <1747> DW_AT_byte_size : 4 > > > > Compilation Unit @ offset 0x1749: > > Length: 0x32 (32-bit) > > Version: 4 > > Abbrev Offset: 0x763 > > Pointer Size: 4 > > <0><1754>: Abbrev Number: 5 (DW_TAG_compile_unit) > > <1755> DW_AT_producer : (indirect string, offset: 0x0): clang > > version 3.9.0 (trunk) > > <1759> DW_AT_language : 12 (ANSI C99) > > <175b> DW_AT_name : (indirect string, offset: 0x505): Source1.c > > <175f> DW_AT_stmt_list : 0x1362 > > <1763> DW_AT_comp_dir : (indirect string, offset: 0x50f): > > C:\Users\phlav\Documents\opus studio 2013\Projects\OpusLibrary > > <1767> Unknown AT value: 3fe1: 1 > > <1><1767>: Abbrev Number: 6 (DW_TAG_subprogram) > > <1768> DW_AT_name : (indirect string, offset: 0x54e): lib > > <176c> DW_AT_decl_file : 1 > > <176d> DW_AT_decl_line : 3 > > <176e> DW_AT_prototyped : 1 > > <176e> DW_AT_type : <0x1741> > > <1772> DW_AT_external : 1 > > <1772> Unknown AT value: 3fe1: 1 > > <2><1772>: Abbrev Number: 7 (DW_TAG_formal_parameter) > > <1773> DW_AT_name : (indirect string, offset: 0xd9): x > > <1777> DW_AT_decl_file : 1 > > <1778> DW_AT_decl_line : 3 > > <1779> DW_AT_type : <0x1741> > > > > > > > > ________________________________________ > > From: Greg Clayton [gclay...@apple.com] > > Sent: Friday, April 29, 2016 2:02 PM > > To: Philippe Lavoie > > Cc: lldb-dev@lists.llvm.org > > Subject: Re: [lldb-dev] LTO debug info > > > >> On Apr 28, 2016, at 8:11 AM, Philippe Lavoie <philippe.lav...@octasic.com> > >> wrote: > >> > >> > >> What made me suspect a data corruption issue were asserts triggered in the > >> VC++ vector implementation when we use an LTO binary with a DEBUG lldb > >> build on Windows. > >> > >> For example, > >> at source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp:624, > >> > >> File: C:\Program Files (x86)\Microsoft Visual Studio > >> 12.0\VC\INCLUDE\vector > >> Line: 240 > >> Expression: vector iterators incompatible > >> > >> Digging deeper, the assertion is triggered by accesses to > >> DWARFCompileUnit::m_die_array > >> > >> In the 'parser_fn' lambda in SymbolFileDWARF::Index() that indexes each > >> compile unit in parallel, DWARFCompileUnit::ExtractDIEsIfNeeded(..) , > >> DWARFCompileUnit::Index(..) and DWARFCompileUnit::ClearDIEs() are called > >> in sequence. > >> > >> 'ExtractDIEsIfNeeded' and 'ClearDIEs' respectively populates and clears > >> DWARFCompileUnit::m_die_array. > >> > >> But when DWARFCompileUnit::Index(..) looks up a DIE in another CU and that > >> CU is running 'ExtractDIEsIfNeeded' or 'ClearDIEs', it will access a stale > >> or invalid DWARFCompileUnit::m_die_array ... > > > > So the main question is why is anything that is indexing the current CU > > only, accessing anything from another CU? It can't and shouldn't. That is > > the bug. > > > >> -Philippe > >> > >> ________________________________________ > >> From: Greg Clayton [gclay...@apple.com] > >> Sent: Monday, April 25, 2016 7:59 PM > >> To: Philippe Lavoie > >> Cc: lldb-dev@lists.llvm.org > >> Subject: Re: [lldb-dev] LTO debug info > >> > >>> On Apr 25, 2016, at 1:37 PM, Philippe Lavoie via lldb-dev > >>> <lldb-dev@lists.llvm.org> wrote: > >>> > >>> Hello, > >>> > >>> We noticed an issue with dwarf parsing when debugging a large application > >>> compiled with LTO. > >>> > >>> Since an LTO application's debug info can have inter-compile-unit > >>> references, (for example: a type defined in one compile-unit and used in > >>> another) we noticed that the "m_type_index" vector in SymbolFileDWARF can > >>> get corrupted. This is because many index structures in SymbolFileDWARF > >>> are generated in parallel. > >>> > >>> Is is a bug or is it expected to be the compiler's/linker's job to > >>> generate self-contained DWARF data without inter-CU references? > >>> > >>> -Philippe > >> > >> TEach SymbolFileDWARF should generate its own accelerator tables in a self > >> contained way. No SymbolFileDWARF should be adding references to their > >> accelerator maps that actually refer to DIEs from another SymbolFileDWARF. > >> We might need to look at the indexing function that make sure if they see > >> and DW_FORM_ref_addr attributes, that they don't add any of them to their > >> index. That seems like the bug? > >> > >> Greg > >> > > > > _______________________________________________ > > lldb-dev mailing list > > lldb-dev@lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev > > <dwarf.patch>
_______________________________________________ lldb-dev mailing list lldb-dev@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev