clayborg added a comment.
Is there a link to some documentation that explains what DWZ does? I found a
bunch of blurbs and man pages, but nothing useful. Nothing is found when I look
for "DWZ" in either the DWARF 4 or DWARF 5 spec.
================
Comment at: source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp:595
+ uint64_t debug_info_size = get_debug_info_data().GetByteSize();
+ data_segment.m_data.OffsetData(debug_info_size);
+ }
----------------
jankratochvil wrote:
> I do not like this `DWARFDataExtractor::m_start` modification, it sort of
> corrupts the `DataExtractor` and various operations stop working then - such
> as `DWARFDataExtractor::GetByteSize()`. DWZ patch makes from current
> `dw_offset_t` a virtual (remapped) offset and introduces new physical file
> section offset which is looked up for data extraction. The file offset is
> represented as `DWARFFileOffset` in D40474, instead of `bool m_is_dwz;` there
> could be some `enum { DEBUG_INFO, DEBUG_TYPES, DWZ_DEBUG_INFO } m_where;`
> instead.
Doing it this way means this diff doesn't have to touch all of the other DWARF
code. Nothing in .debug_types will refer to anything else (no DW_FORM_ref_addr
or any other external references). So this trick allows us to just treat .debug_info as if
.debug_types was appended to the end since nothing in .debug_types refers to
any DIE outside of its type unit. This also mirrors what will actually happen
with DWARF5 when all of the data is contained inside of the .debug_info
section. This allows each DIE to have a unique "ID". Any other change requires
a lot of changes to the DWARF parser and logic. So I actually like this
feature. We can fix GetByteSize() if needed.

Basically every object in DWARF right now must be able to make a unique 64-bit
unsigned integer ID in some way, so that we can get back to that info and
partially parse more. These are handed out as lldb::user_id_t values for types,
functions and more. Each flavor of DWARF encodes what it wants into this ID.
For normal DWARF it is just the absolute offset within .debug_info. With
.debug_types we just add the size of .debug_info to the ID. For DWARF in .o
files on Darwin, we encode the compile unit index into the top 32 bits and the
DIE offset into the lower 32 bits. DWO does something similar, and DWZ will
need to as well. DWARFFileOffset doesn't mean much if
there are multiple files. We have many competing type uniquing/debug info size
reduction strategies being employed here. I can't believe we have DWO, DWZ, and
debug types... But we have to make them all work. We can't just use the
absolute file offset because DWO uses external files, and the file offsets
could be the same in different external .o files... Not sure how this works
with DWZ or what the best option is. I will read up on DWZ so I can propose
some viable options. But each new flavor of the day that gets added to the
DWARF parser adds a bunch of logic and edge cases. If two technologies (DWZ +
DWO, DWZ + debug_types, etc.) are used together, we need to ensure they work
together.
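
To make the ID schemes described above concrete, here is a rough sketch of two
of them: the linear scheme where .debug_types is treated as if appended to
.debug_info, and the Darwin .o scheme with the compile unit index in the top 32
bits. All names here (DieLocation, MakeLinearID, MakeDarwinObjectID, and so on)
are made up for illustration and are not LLDB's actual API; 32-bit DWARF
offsets are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; not the real LLDB definitions.
using user_id_t = std::uint64_t;   // plays the role of lldb::user_id_t
using dw_offset_t = std::uint32_t; // assumes 32-bit DWARF offsets

enum class Section { DebugInfo, DebugTypes };

struct DieLocation {
  Section section;
  dw_offset_t offset; // offset of the DIE within its own section
};

// Linear scheme: .debug_info offsets are used as-is, and .debug_types
// offsets are shifted up by the size of .debug_info.
inline user_id_t MakeLinearID(DieLocation loc, std::uint64_t debug_info_size) {
  if (loc.section == Section::DebugInfo) {
    assert(loc.offset < debug_info_size);
    return loc.offset;
  }
  return debug_info_size + loc.offset;
}

// Inverse of MakeLinearID: recover the section and section-relative offset.
inline DieLocation DecodeLinearID(user_id_t id, std::uint64_t debug_info_size) {
  if (id < debug_info_size)
    return {Section::DebugInfo, static_cast<dw_offset_t>(id)};
  return {Section::DebugTypes, static_cast<dw_offset_t>(id - debug_info_size)};
}

// Darwin .o scheme: compile unit index in the top 32 bits, DIE offset in
// the lower 32 bits.
inline user_id_t MakeDarwinObjectID(std::uint32_t cu_index,
                                    dw_offset_t die_offset) {
  return (static_cast<user_id_t>(cu_index) << 32) | die_offset;
}

inline std::uint32_t GetCUIndex(user_id_t id) {
  return static_cast<std::uint32_t>(id >> 32);
}

inline dw_offset_t GetDIEOffset(user_id_t id) {
  return static_cast<dw_offset_t>(id & 0xffffffffu);
}
```

The linear scheme only works because nothing in .debug_types refers outside
its own type unit, so the two ranges never need to cross-reference by raw
offset. Whatever DWZ ends up using needs the same property: the ID must decode
back to exactly one DIE in exactly one file.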
https://reviews.llvm.org/D32167
_______________________________________________
lldb-commits mailing list
http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits