On 2020-11-12 7:11 p.m., Mark Wielaard wrote:
> Hi Simon,
>
> On Thu, Nov 05, 2020 at 11:11:43PM -0500, Simon Marchi wrote:
>> I'm currently squashing some bugs related to .debug_rnglists in GDB, and
>> I happened to notice that clang and gcc do different things when
>> generating rnglists with split DWARF. I'd like to know if the two
>> behaviors are acceptable, and therefore if we need to make GDB accept
>> both. Or maybe one of them is not doing things correctly and would need
>> to be fixed.
>>
>> clang generates a .debug_rnglists.dwo section in the .dwo file. Any
>> DW_FORM_rnglistx attribute in the DWO refers to that section. That
>> section is not shared with any other object, so DW_AT_rnglists_base is
>> never involved for these attributes. Note that there might still be a
>> DW_AT_rnglists_base on the DW_TAG_skeleton_unit, in the linked file,
>> used if the skeleton itself has an attribute of form DW_FORM_rnglistx.
>> This rnglist would be found in a .debug_rnglists section in the linked
>> file, shared with the other units of the linked file.
>>
>> gcc generates a single .debug_rnglists in the linked file and no
>> .debug_rnglists.dwo in the DWO files. So when an attribute has form
>> DW_FORM_rnglistx in a DWO file, I presume we need to do the lookup in
>> the .debug_rnglists section in the linked file, using the
>> DW_AT_rnglists_base attribute found in the corresponding skeleton unit.
>> This looks vaguely similar to how it was done pre-DWARF 5, with
>> DW_AT_GNU_ranges base.
>>
>> So, is gcc wrong here? I don't see anything in the DWARF 5 spec
>> prohibiting to do it like gcc does, but clang's way of doing it sounds
>> more in-line with the intent of what's described in the DWARF 5 spec.
>> So I wonder if it's maybe an oversight or a misunderstanding between the
>> two compilers.
>
> I think I would have asked the question the other way around :) The
> spec explicitly describes rnglists_base (and loclists_base) as a way
> to reference ranges (loclists) through the index table, so that the
> only relocation you need is in the (skeleton) DIE.

I presume you are referring to this non-normative text in section 2.17.3?

    This range list representation, the rnglist class, and the related
    DW_AT_rnglists_base attribute are new in DWARF Version 5. Together
    they eliminate most or all of the object language relocations
    previously needed for range lists.

What I understand from this is that the rnglist class and the
DW_AT_rnglists_base attribute help reduce the number of relocations in
the non-split case (they remove the need for relocations from
DW_AT_ranges attribute values in .debug_info to .debug_rnglists). I
don't read it as saying anything about where to put the rnglist data in
the split-unit case.

> But the rnglists
> (loclists) themselves can still use relocations. A large part of them
> is non-shared addresses, so using indexes (into the .debug_addr
> addr_base) would simply be extra overhead. The relocations will
> disappear once linked, but the index tables won't.
>
> As an alternative, if you like to minimize the amount of debug data in
> the main object file, the spec also describes how to put a whole
> .debug_rnglists.dwo (or .debug_loclists.dwo) in the split dwarf
> file. Then you cannot use all entry encodings and do need to use an
> .debug_addr index to refer to any addresses in that case. So the
> relocations are still there, you just refer to them through an extra
> index indirection.
>
> So I believe both encodings are valid according to the spec. It just
> depends on what you are optimizing for, small main object file size or
> smallest encoding with least number of indirections.

So, if I understand correctly, gcc's way of doing things (putting all
the rnglists in a common .debug_rnglists section) reduces the overall
size of debug info, since the rnglists can use the direct-addressing
rnglist entries (e.g. DW_RLE_start_end) rather than the indirect ones
(e.g. DW_RLE_startx_endx). But this comes at the expense of a lot of
relocations in the rnglists themselves, since they refer to addresses
directly.
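
To make that trade-off concrete, here is a rough sketch in Python (my
own illustration, not taken from gcc, clang or GDB). It encodes one
range both ways and compares the sizes. The 8-byte address size, the
example addresses and the .debug_addr indices 3 and 4 are assumptions
made up for the example; the two DW_RLE_* opcode values are the ones
from the DWARF 5 encoding table.

```python
# Illustrative sketch only: compares the encoded size of one range entry
# in the two styles discussed above. Assumes 8-byte target addresses.

DW_RLE_startx_endx = 0x02  # operands: two ULEB128 indices into .debug_addr
DW_RLE_start_end = 0x06    # operands: two full target addresses

ADDR_SIZE = 8  # assumption: 64-bit target


def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_start_end(start, end):
    """Direct form: both addresses are stored in the rnglist itself,
    so each one needs a relocation in the object file."""
    body = start.to_bytes(ADDR_SIZE, "little") + end.to_bytes(ADDR_SIZE, "little")
    return bytes([DW_RLE_start_end]) + body


def encode_startx_endx(start_index, end_index):
    """Indexed form: no relocation in the rnglist; the addresses (and
    their relocations) live in .debug_addr instead."""
    return bytes([DW_RLE_startx_endx]) + uleb128(start_index) + uleb128(end_index)


direct = encode_start_end(0x401000, 0x401040)   # hypothetical addresses
indexed = encode_startx_endx(3, 4)              # hypothetical .debug_addr slots

# Relocations needed inside the rnglist entry itself.
direct_relocs, indexed_relocs = 2, 0
# .debug_addr bytes the indexed form relies on (these slots may be
# shared with other attributes that use the same addresses).
indexed_addr_bytes = 2 * ADDR_SIZE

print(len(direct), direct_relocs)                        # 17 2
print(len(indexed), indexed_relocs, indexed_addr_bytes)  # 3 0 16
```

So the indexed entry is smaller and relocation-free in the rnglist, but
it only works because the two addresses sit (relocated) in .debug_addr,
which is the extra indirection Mark mentions above.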

I thought that the main point of split-units was to reduce the number of
relocations processed by the linker and data moved around by the linker,
to reduce link time and provide a better edit-build-debug cycle. Is
that the case?

Anyway, regardless of the intent, the spec should ideally be clear about
that so we don't have to guess.

> P.S. I am really interested in these interpretations of DWARF, but I
> don't really follow the gdb implementation details very much. Could we
> maybe move discussions like these from the -patches list to the main
> gdb (or gcc) mailinglist?

Sure, I added gdb@ and gcc@. I also left gdb-patches@ so that it's
possible to follow the discussion there.

Simon