jankratochvil added inline comments.
================
Comment at: source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp:595
+ uint64_t debug_info_size = get_debug_info_data().GetByteSize();
+ data_segment.m_data.OffsetData(debug_info_size);
+ }
----------------
jankratochvil wrote:
> clayborg wrote:
> > jankratochvil wrote:
> > > I do not like this `DWARFDataExtractor::m_start` modification; it sort of
> > > corrupts the `DataExtractor`, and various operations, such as
> > > `DWARFDataExtractor::GetByteSize()`, stop working. The DWZ patch turns the
> > > current `dw_offset_t` into a virtual (remapped) offset and introduces a new
> > > physical file section offset that is looked up for data extraction. The
> > > file offset is represented as `DWARFFileOffset` in D40474; instead of
> > > `bool m_is_dwz;` there could be some `enum { DEBUG_INFO, DEBUG_TYPES,
> > > DWZ_DEBUG_INFO } m_where;`.
> > This means that this diff doesn't affect all of the other DWARF code.
> > Nothing in .debug_types will refer to anything else (not DW_FORM_ref_addr,
> > or any external references). So this trick allows us to just treat
> > .debug_info as if .debug_types was appended to the end since nothing in
> > .debug_types refers to any DIE outside of its type unit. This also mirrors
> > what will actually happen with DWARF5 when all of the data is contained
> > inside of the .debug_info section. This allows each DIE to have a unique
> > "ID". Any other change requires a lot of changes to the DWARF parser and
> > logic. So I actually like this feature. We can fix the GetByteSize() if
> > needed. Basically every object in DWARF right now must be able to produce a
> > unique 64-bit unsigned integer ID that lets us get back to that info and
> > partially parse more. These are handed out as lldb::user_id_t values for
> > types, functions and more. Each flavor of DWARF encodes what it wants into
> > this ID. For normal DWARF it is just the absolute offset within
> > .debug_info. With .debug_types we just add the size of the .debug_info
> > to the ID. For DWARF in .o files on Darwin, we encode the compile unit
> > index into the top 32 bits and the DIE offset into the lower; DWO does
> > something similar, just as DWZ will need to. DWARFFileOffset doesn't mean much if
> > there are multiple files. We have many competing type uniquing/debug info
> > size reduction strategies being employed here. I can't believe we have DWO,
> > DWZ, and debug types... But we have to make them all work. We can't just
> > use the absolute file offset because DWO used external files where the file
> > offsets could be the same in the external .o files... Not sure how this
> > works with DWZ or what the best option is. I will read up on DWZ so I can
> > propose some viable options. But each new flavor of the day that gets added
> > to the DWARF parser adds a bunch of logic and edge cases. If two
> > technologies (DWZ + DWO, DWZ + debug_types, etc) are used together, we need
> > to ensure they can.
> > Any other change requires a lot of changes to the DWARF parser and logic.
> > So I actually like this feature.
>
> I agree it is a fine quick-and-dirty hack. But if my DWZ support gets accepted
> later anyway, this `.debug_types` feature could be implemented cleanly on top
> of its framework (as a regular DIE remapping, which DWZ requires anyway).
>
> > If two technologies (DWZ + DWO, DWZ + debug_types, etc) are used together,
> > we need to ensure they can.
>
> `DWZ + DWO` does not make sense to me. I haven't tried to use DWZ with DWO, but
> DWZ typically finds common DWARF subtrees across CUs, so it would not find
> much.
>
> `DWZ + debug_types` is explicitly supported by the DWZ tool, although that is
> IMO for compatibility only; DWZ can make the common type references slightly
> smaller than debug_types can. I will surely need to implement debug_types
> support in my DWZ-for-LLDB patchset later.
>
> For DWARF in .o files on Darwin, we encode the compile unit index into the
> top 32 bits and the DIE offset into the lower
BTW that prevents implementing DWARF64, so I wanted to avoid using the 32/32
split. Fedora/RHEL already contain files with a `.debug_info` size near the
32-bit limit:
```
http://ftp.muni.cz/pub/linux/fedora/linux/development/rawhide/Everything/x86_64/debug/tree/Packages/q/qt5-qtwebkit-debuginfo-5.212.0-0.20.alpha2.fc29.x86_64.rpm
/usr/lib/debug/usr/lib64/libQt5WebKit.so.5.212.0-5.212.0-0.20.alpha2.fc29.x86_64.debug
.debug_info size = 0x9bff1b08 = 2.4GiB
```
Although that is because neither DWZ (as it would run out of memory) nor
`.debug_types` (as DWZ is expected to handle the duplicates) is applied, so
all CUs are separate, and even if it were >4GB there would still be no need
for 64-bit `DW_FORM_ref_addr`. Still, we are already close to the 32-bit
limit; also, shifting the split to, for example, 40/24 would put a limit on
the number of CUs etc.
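
To make the trade-off concrete, here is a minimal sketch of packing a CU index
and a DIE offset into one 64-bit ID with a configurable split. `IdSplit` and
its members are hypothetical names for illustration only, not LLDB's actual
API, and I am assuming "40/24" means 40 bits of DIE offset and 24 bits of CU
index.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLDB code): the low `offset_bits` bits hold the
// DIE offset, the remaining high bits hold the CU index.
// Assumes 1 <= offset_bits <= 63 so that no shift reaches 64 bits.
struct IdSplit {
  unsigned offset_bits;

  // Largest DIE offset representable with this split.
  uint64_t max_offset() const { return (uint64_t{1} << offset_bits) - 1; }

  // Largest CU index representable with this split.
  uint64_t max_cu_index() const {
    return (uint64_t{1} << (64 - offset_bits)) - 1;
  }

  // True if both values can be stored without losing bits.
  bool fits(uint64_t cu_index, uint64_t die_offset) const {
    return cu_index <= max_cu_index() && die_offset <= max_offset();
  }

  // Combine the two fields into a single 64-bit ID.
  uint64_t pack(uint64_t cu_index, uint64_t die_offset) const {
    return (cu_index << offset_bits) | die_offset;
  }

  // Recover the individual fields from a packed ID.
  uint64_t cu_index(uint64_t id) const { return id >> offset_bits; }
  uint64_t die_offset(uint64_t id) const { return id & max_offset(); }
};
```

With `IdSplit{32}` the 0x9bff1b08 offset above still fits, but any offset of
0x100000000 or more (i.e. DWARF64 territory) does not. With `IdSplit{40}`
such offsets fit, but the CU index is capped at 0xFFFFFF.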
https://reviews.llvm.org/D32167