labath wrote:

> I think this relates to:
> 
> > 2.17 Code Addresses, Ranges and Base Addresses
> > <...>
> > The base address of the scope for any of the debugging information entries
> > listed above is given by either the DW_AT_low_pc attribute or the first
> > address in the first range entry in the list of ranges given by the
> > DW_AT_ranges attribute. If there is no such attribute, the base address is
> > undefined.
> 
> (https://dwarfstd.org/doc/DWARF5.pdf)
> 
> I'm just guessing here that the functions you refer to have both. It doesn't 
> say which should win, but we can take "or" to mean "whichever actually makes 
> sense", and in these cases you'd assume that low_pc is maybe the entry point 
> of the function not the actual lowest PC value some block of it sits at.

It is related to that, and I think this is the correct reading of that 
paragraph, but the situation is simpler than that.

> 
> Is that roughly the logic of this change? Sounds like the check itself was 
> not 100% correct to begin with, perhaps it was for some previous standard 
> version.

The function just has a DW_AT_ranges attribute like this:
```
0x00085f0f:   DW_TAG_subprogram
                DW_AT_name      ("_dl_start")
                DW_AT_decl_file ("./elf/rtld.c")
                DW_AT_decl_line (518)
                DW_AT_decl_column       (1)
                DW_AT_prototyped        (true)
                DW_AT_type      (0x0007d5fb "Elf64_Addr")
                DW_AT_ranges    (0x00004586
                   [0x000000000001cfb0, 0x000000000001d680)
                   [0x0000000000001a0f, 0x0000000000001b07))
```

which (according to the paragraph you quote) means that the function's entry 
point is 0x1cfb0, which is *not* the lowest address in the function.
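
To make the distinction concrete, here is a minimal Python sketch (not LLDB 
code; the helper names are made up for illustration) that computes both values 
from the ranges in the dump above. It assumes the base address is the first 
address of the first range entry, as the quoted paragraph describes.

```python
# Ranges of _dl_start, copied from the DW_AT_ranges dump above,
# as half-open (start, end) pairs in list order.
DL_START_RANGES = [
    (0x1CFB0, 0x1D680),
    (0x1A0F, 0x1B07),
]


def base_address(ranges):
    """First address of the first range entry (hypothetical helper)."""
    return ranges[0][0]


def lowest_address(ranges):
    """Numerically lowest address covered by any range (hypothetical helper)."""
    return min(start for start, _end in ranges)


print(hex(base_address(DL_START_RANGES)))    # 0x1cfb0
print(hex(lowest_address(DL_START_RANGES)))  # 0x1a0f
```

The two values differ only because the ranges are not listed in ascending 
address order.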

Now this part is still fine. The problem starts when we start parsing nested 
blocks:
```
0x00085f3b:     DW_TAG_lexical_block
                  DW_AT_ranges  (0x0000462b
                     [0x000000000001d1f8, 0x000000000001d488)
                     [0x000000000001d4e0, 0x000000000001d550)
                     [0x000000000001d608, 0x000000000001d62e)
                     [0x0000000000001ae8, 0x0000000000001b07))
```

This block contains a range (the last one) which is below the function's entry 
point (which is called that because it *can* be, and often is, set by 
DW_AT_low_pc -- I guess we should rename that), even though it's still within 
the bounds of the function. This code was correct in a world where functions 
are always contiguous and start at their first instruction, but that's not the 
case now (and strictly speaking, it never was). I'm not entirely sure what 
prompted this check to be added in the first place, but the two candidates I 
can think of are:
- bad ranges which would cause the internal block representation to overflow 
(because the blocks are stored as offsets from the entry point). This is no 
longer a problem because blocks now use signed numbers for offsets.
- to catch ranges which have been eliminated by the linker (in this case, their 
address is usually set to zero). I don't know if the linker can eliminate a 
part of the function, but if it can, this would now be caught by the 
m_first_code_address check (which is a relatively new invention).
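
As a rough illustration of why the check misfires, here is a Python sketch 
(again not LLDB's actual code; the function names are invented, and the "old" 
check is a simplified guess at its shape). It compares a "start must not be 
below the entry point" test with a containment test against the function's own 
ranges, and computes each block range's start as a signed offset from the entry 
point.

```python
# Ranges copied from the two dumps above, as half-open (start, end) pairs.
FUNC_RANGES = [(0x1CFB0, 0x1D680), (0x1A0F, 0x1B07)]
BLOCK_RANGES = [
    (0x1D1F8, 0x1D488),
    (0x1D4E0, 0x1D550),
    (0x1D608, 0x1D62E),
    (0x1AE8, 0x1B07),
]
# Entry point taken as the first address of the first function range.
ENTRY_POINT = FUNC_RANGES[0][0]


def below_entry_point(rng, entry_point):
    """Simplified stand-in for the old check (hypothetical)."""
    return rng[0] < entry_point


def contained_in(rng, func_ranges):
    """True if rng lies entirely inside one of the function's ranges."""
    start, end = rng
    return any(lo <= start and end <= hi for lo, hi in func_ranges)


flagged = [r for r in BLOCK_RANGES if below_entry_point(r, ENTRY_POINT)]
outside = [r for r in BLOCK_RANGES if not contained_in(r, FUNC_RANGES)]
# Signed offsets of each block range start relative to the entry point.
offsets = [start - ENTRY_POINT for start, _end in BLOCK_RANGES]

print([hex(start) for start, _end in flagged])  # ['0x1ae8']
print(outside)                                  # []
print(offsets[-1] < 0)                          # True
```

The last block range is flagged by the entry-point comparison even though it 
sits inside the function's second range; its offset is simply negative, which 
a signed representation can store.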


(BTW, this warning does not show up on older versions of LLDB, from before I 
started implementing support for these kinds of functions, because back then 
LLDB just picked the lowest address as the function entry point, and so this 
check would not fire.)

https://github.com/llvm/llvm-project/pull/132395