Please do submit this for review, yes.

> On May 11, 2016, at 1:27 PM, Philippe Lavoie <philippe.lav...@octasic.com> 
> wrote:
> 
> Thanks for the explanation and the patch.
> 
> I tried it and the exception is gone.
> The debug info also seems complete. As complete as can be for an LTO build...
> 
> I attached a modified patch with the calls to ExtractDIEsIfNeeded() on 
> separate threads.
> 
> Should I submit it for review ?
> 
> -Philippe
> 
> From: Greg Clayton [gclay...@apple.com]
> Sent: Tuesday, May 03, 2016 5:02 PM
> To: Philippe Lavoie
> Cc: lldb-dev@lists.llvm.org
> Subject: Re: [lldb-dev] LTO debug info
> 
> 
> > On May 3, 2016, at 6:35 AM, Philippe Lavoie <philippe.lav...@octasic.com> 
> > wrote:
> > 
> > The code accessing another CU while indexing is in DWARFCompileUnit::GetDIE.
> > If the DIE is not in the current CU, it finds the CU that has it and calls 
> > its ::GetDIE method, accessing its m_die_array data... 
> > 
> > So the question is: is it a producer (CU should be self-contained) or a 
> > consumer (the indexing) bug ?
> 
> The indexing needs to be fixed. We probably need to start checking the 
> DW_FORM of an attribute before we try to get the DIE for it. So the problem 
> seems to be in DWARFCompileUnit::IndexPrivate() where when the tag is a 
> DW_TAG_subprogram it eventually does:
> 
> if (specification_die_form.IsValid())
> {
>     DWARFDIE specification_die = 
> dwarf_cu->GetSymbolFileDWARF()->DebugInfo()->GetDIE 
> (DIERef(specification_die_form));
>     if (specification_die.GetParent().IsStructOrClass())
>         is_method = true;
> }
> 
> This is the only call to GetDIE in the entire function. It is getting the DIE 
> so it can look to see who the parent it to see if the parent of the function 
> is a struct/class, which means it is a method, or anything else and the 
> function is just the basename of a raw function. So it seems like this should 
> be OK to access other DIEs as it isn't adding any information from the other 
> DWARF file, it is just using it to check who the parent of function 
> (DW_TAG_subprogram) is.
> 
> So the real problem is that "ClearDIEs()" is being called too soon before 
> everyone is done using the DIEs. The reason this was being done is that you 
> might have 10000 compile units and we want to partially parse the the DWARF 
> and only use what we need, but if we index everything we want to load the 
> DIEs for a compile unit and clear them after indexing if they weren't loaded 
> before we did the index so we don't take up loads of memory. 
> 
> So what we need to do to fix this is:
> 
> 1 - Run through all of the compile units first and see if each CU has the 
> DIEs parsed and remember this
> 2 - Index all compile units
> 3 - Clear any DIEs in any compile units that didn't have their DIEs parsed 
> _after_ all compile units have been indexed
> 
> A proposed patch is attached:
> 
> 
> 
> the main issue is now all DWARF is parsed on the current thread. So this 
> patch should be updated to run all of the ExtractDIEsIfNeeded() functions on 
> separate threads since this can be time consuming:
> 
>         
> //----------------------------------------------------------------------
>         // First figure out which compile units didn't have their DIEs already
>         // parsed and remember this.  If no DIEs were parsed prior to this 
> index
>         // function call, we are going to want to clear the CU dies after we
>         // are done indexing to make sure we don't pull in all DWARF dies, but
>         // we need to wait until all compile units have been indexed in case
>         // a DIE in one compile unit refers to another and the indexes 
> accesses
>         // those DIEs.
>         
> //----------------------------------------------------------------------
>         for (uint32_t cu_idx = 0; cu_idx < num_compile_units; ++cu_idx)
>         {
>             DWARFCompileUnit* dwarf_cu = 
> debug_info->GetCompileUnitAtIndex(cu_idx);
>             if (dwarf_cu)
>             {
>                 // dwarf_cu->ExtractDIEsIfNeeded(false) will return zero if 
> the
>                 // DIEs for a compile unit have already been parsed.
>                 clear_cu_dies[cu_idx] = dwarf_cu->ExtractDIEsIfNeeded(false) 
> > 1;
>             }
>         }
> 
> > 
> > //----------------------------------------------------------------------
> > // GetDIE()
> > //
> > // Get the DIE (Debug Information Entry) with the specified offset by
> > // first checking if the DIE is contained within this compile unit and
> > // grabbing the DIE from this compile unit. Otherwise we grab the DIE
> > // from the DWARF file.
> > //----------------------------------------------------------------------
> > DWARFDIE
> > DWARFCompileUnit::GetDIE (dw_offset_t die_offset)
> > {
> >    if (die_offset != DW_INVALID_OFFSET)
> >    {
> >        if (m_dwo_symbol_file)
> >            return m_dwo_symbol_file->GetCompileUnit()->GetDIE(die_offset);
> > 
> >        if (ContainsDIEOffset(die_offset))
> >        {
> >            ExtractDIEsIfNeeded (false);
> >            DWARFDebugInfoEntry::iterator end = m_die_array.end();
> >            DWARFDebugInfoEntry::iterator pos = 
> > lower_bound(m_die_array.begin(), end, die_offset, CompareDIEOffset);
> >            if (pos != end)
> >            {
> >                if (die_offset == (*pos).GetOffset())
> >                    return DWARFDIE(this, &(*pos));
> >            }
> >        }
> >        else
> >        {
> >            // Don't specify the compile unit offset as we don't know it 
> > because the DIE belongs to
> >            // a different compile unit in the same symbol file.
> >            return m_dwarf2Data->DebugInfo()->GetDIEForDIEOffset(die_offset);
> >        }
> >    }
> >    return DWARFDIE(); // Not found
> > }
> > 
> > ________________________________________
> > From: lldb-dev [lldb-dev-boun...@lists.llvm.org] on behalf of Philippe 
> > Lavoie via lldb-dev [lldb-dev@lists.llvm.org]
> > Sent: Friday, April 29, 2016 3:27 PM
> > To: Greg Clayton
> > Cc: lldb-dev@lists.llvm.org
> > Subject: Re: [lldb-dev] LTO debug info
> > 
> >> So the main question is why is anything that is indexing the current CU 
> >> only, accessing anything from another CU? It can't and shouldn't. That is 
> >> the bug.
> > 
> > Indexing is accessing another CU because there is a reference (offset) to 
> > another CU.
> > 
> > For example, in the 2nd compile unit below, the DW_AT_type of 
> > DW_TAG_formal_parameter at offset 0x1772 is referencing the 1st compile 
> > unit (offset 0x1741).
> > 
> > Compilation Unit @ offset 0x16f9:
> >   Length:        0x4c (32-bit)
> >   Version:       4
> >   Abbrev Offset: 0x763
> >   Pointer Size:  4
> > <0><1704>: Abbrev Number: 1 (DW_TAG_compile_unit)
> >    <1705>   DW_AT_producer    : (indirect string, offset: 0x0): clang 
> > version 3.9.0 (trunk)
> >    <1709>   DW_AT_language    : 12     (ANSI C99)
> >    <170b>   DW_AT_name        : (indirect string, offset: 0x4be): main.c
> >    <170f>   DW_AT_stmt_list   : 0x1329
> >    <1713>   DW_AT_comp_dir    : (indirect string, offset: 0x4c5): 
> > C:\Users\phlav\Documents\opus studio 2013\Projects\OpusProject9
> >    <1717>   Unknown AT value: 3fe1: 1
> >    <1717>   DW_AT_low_pc      : 0x400000
> >    <171b>   DW_AT_high_pc     : 0x8
> > <1><171f>: Abbrev Number: 2 (DW_TAG_subprogram)
> >    <1720>   DW_AT_low_pc      : 0x400000
> >    <1724>   DW_AT_high_pc     : 0x8
> >    <1728>   Unknown AT value: 3fe7: 1
> >    <1728>   DW_AT_frame_base  : 1 byte block: 51       (DW_OP_reg1 (r1))
> >    <172a>   DW_AT_name        : (indirect string, offset: 0xab): main
> >    <172e>   DW_AT_decl_file   : 1
> >    <172f>   DW_AT_decl_line   : 4
> >    <1730>   DW_AT_type        : <0x1741>
> >    <1734>   DW_AT_external    : 1
> >    <1734>   Unknown AT value: 3fe1: 1
> > <2><1734>: Abbrev Number: 3 (DW_TAG_variable)
> >    <1735>   DW_AT_const_value : 55
> >    <1736>   DW_AT_name        : (indirect string, offset: 0xd9): x
> >    <173a>   DW_AT_decl_file   : 1
> >    <173b>   DW_AT_decl_line   : 7
> >    <173c>   DW_AT_type        : <0x1741>
> > <1><1741>: Abbrev Number: 4 (DW_TAG_base_type)
> >    <1742>   DW_AT_name        : (indirect string, offset: 0x8f): int
> >    <1746>   DW_AT_encoding    : 5      (signed)
> >    <1747>   DW_AT_byte_size   : 4
> > 
> >  Compilation Unit @ offset 0x1749:
> >   Length:        0x32 (32-bit)
> >   Version:       4
> >   Abbrev Offset: 0x763
> >   Pointer Size:  4
> > <0><1754>: Abbrev Number: 5 (DW_TAG_compile_unit)
> >    <1755>   DW_AT_producer    : (indirect string, offset: 0x0): clang 
> > version 3.9.0 (trunk)
> >    <1759>   DW_AT_language    : 12     (ANSI C99)
> >    <175b>   DW_AT_name        : (indirect string, offset: 0x505): Source1.c
> >    <175f>   DW_AT_stmt_list   : 0x1362
> >    <1763>   DW_AT_comp_dir    : (indirect string, offset: 0x50f): 
> > C:\Users\phlav\Documents\opus studio 2013\Projects\OpusLibrary
> >    <1767>   Unknown AT value: 3fe1: 1
> > <1><1767>: Abbrev Number: 6 (DW_TAG_subprogram)
> >    <1768>   DW_AT_name        : (indirect string, offset: 0x54e): lib
> >    <176c>   DW_AT_decl_file   : 1
> >    <176d>   DW_AT_decl_line   : 3
> >    <176e>   DW_AT_prototyped  : 1
> >    <176e>   DW_AT_type        : <0x1741>
> >    <1772>   DW_AT_external    : 1
> >    <1772>   Unknown AT value: 3fe1: 1
> > <2><1772>: Abbrev Number: 7 (DW_TAG_formal_parameter)
> >    <1773>   DW_AT_name        : (indirect string, offset: 0xd9): x
> >    <1777>   DW_AT_decl_file   : 1
> >    <1778>   DW_AT_decl_line   : 3
> >    <1779>   DW_AT_type        : <0x1741>
> > 
> > 
> > 
> > ________________________________________
> > From: Greg Clayton [gclay...@apple.com]
> > Sent: Friday, April 29, 2016 2:02 PM
> > To: Philippe Lavoie
> > Cc: lldb-dev@lists.llvm.org
> > Subject: Re: [lldb-dev] LTO debug info
> > 
> >> On Apr 28, 2016, at 8:11 AM, Philippe Lavoie <philippe.lav...@octasic.com> 
> >> wrote:
> >> 
> >> 
> >> What made me suspect a data corruption issue were asserts triggered in the 
> >> VC++ vector implementation when we use an LTO binary with a DEBUG lldb 
> >> build on Windows.
> >> 
> >>  For example,
> >>  at source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp:624,
> >> 
> >>  File: C:\Program Files (x86)\Microsoft Visual Studio 
> >> 12.0\VC\INCLUDE\vector
> >>  Line: 240
> >>  Expression: vector iterators incompatible
> >> 
> >> Digging deeper, the assertion is triggered by accesses to 
> >> DWARFCompileUnit::m_die_array
> >> 
> >> In the 'parser_fn' lambda in SymbolFileDWARF::Index() that indexes each 
> >> compile unit in parallel, DWARFCompileUnit::ExtractDIEsIfNeeded(..) , 
> >> DWARFCompileUnit::Index(..) and DWARFCompileUnit::ClearDIEs() are called 
> >> in sequence.
> >> 
> >> 'ExtractDIEsIfNeeded' and 'ClearDIEs' respectively populates and clears 
> >> DWARFCompileUnit::m_die_array.
> >> 
> >> But when DWARFCompileUnit::Index(..) looks up a DIE in another CU and that 
> >> CU is running 'ExtractDIEsIfNeeded' or 'ClearDIEs', it will access a stale 
> >> or invalid DWARFCompileUnit::m_die_array ...
> > 
> > So the main question is why is anything that is indexing the current CU 
> > only, accessing anything from another CU? It can't and shouldn't. That is 
> > the bug.
> > 
> >> -Philippe
> >> 
> >> ________________________________________
> >> From: Greg Clayton [gclay...@apple.com]
> >> Sent: Monday, April 25, 2016 7:59 PM
> >> To: Philippe Lavoie
> >> Cc: lldb-dev@lists.llvm.org
> >> Subject: Re: [lldb-dev] LTO debug info
> >> 
> >>> On Apr 25, 2016, at 1:37 PM, Philippe Lavoie via lldb-dev 
> >>> <lldb-dev@lists.llvm.org> wrote:
> >>> 
> >>> Hello,
> >>> 
> >>> We noticed an issue with dwarf parsing when debugging a large application 
> >>> compiled with LTO.
> >>> 
> >>> Since an LTO application's debug info can have inter-compile-unit 
> >>> references, (for example: a type defined in one compile-unit and used in 
> >>> another) we noticed that the "m_type_index" vector in SymbolFileDWARF can 
> >>> get corrupted. This is because many index structures in SymbolFileDWARF 
> >>> are generated in parallel.
> >>> 
> >>> Is is a bug or is it expected to be the compiler's/linker's job to 
> >>> generate self-contained DWARF data without inter-CU references?
> >>> 
> >>> -Philippe
> >> 
> >> TEach SymbolFileDWARF should generate its own accelerator tables in a self 
> >> contained way. No SymbolFileDWARF should be adding references to their 
> >> accelerator maps that actually refer to DIEs from another SymbolFileDWARF. 
> >> We might need to look at the indexing function that make sure if they see 
> >> and DW_FORM_ref_addr attributes, that they don't add any of them to their 
> >> index. That seems like the bug?
> >> 
> >> Greg
> >> 
> > 
> > _______________________________________________
> > lldb-dev mailing list
> > lldb-dev@lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> 
> <dwarf.patch>

_______________________________________________
lldb-dev mailing list
lldb-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev

Reply via email to