[Bug debuginfod/27913] extend webapi for gdb-index queries

2024-02-21 Thread fche at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=27913

Frank Ch. Eigler  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Frank Ch. Eigler  ---
previously completed client/server side work:

commit fb833750c395ce268d16fd44e4decafbb41208a8
Author: Aaron Merey 
Date:   Mon Oct 31 23:44:23 2022 -0400

debuginfod: Support queries for ELF/DWARF sections

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[PATCH] libebl: ebl_object_note print 32bit annobin address ranges correctly

2024-02-21 Thread Mark Wielaard
Annobin address ranges were always printed as if they were 64bit wide
because addr_size was set to twice the size. This was done because the
note description size should contain two addresses. Fix this by setting
the address size to just one address and then check that descsz is
twice that.

* libebl/eblobjnote.c (ebl_object_note): Set addr_size to one
ELF_T_ADDR. Check descsz equals two times addr_size.

Signed-off-by: Mark Wielaard 
---
 libebl/eblobjnote.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libebl/eblobjnote.c b/libebl/eblobjnote.c
index 5a7c5c62..1ba5d8b3 100644
--- a/libebl/eblobjnote.c
+++ b/libebl/eblobjnote.c
@@ -155,8 +155,8 @@ ebl_object_note (Ebl *ebl, uint32_t namesz, const char 
*name, uint32_t type,
  } addrs;
 
  size_t addr_size = gelf_fsize (ebl->elf, ELF_T_ADDR,
-2, EV_CURRENT);
- if (descsz != addr_size)
+1, EV_CURRENT);
+ if (descsz != addr_size * 2)
printf ("\n");
  else
{
-- 
2.39.3



[PATCH] readelf: Use unsigned loop variables in handle_verneed and handle_verdef

2024-02-21 Thread Mark Wielaard
Prevent signed underflow by changing loop variables to unsigned and
doing count checks before decrementing. This isn't really a bug, but
prevents UB detected by ubsan on fuzzed input. The bad (fuzzed) input
data does get detected anyway.

* src/readelf.c (handle_verneed): Use unsigned cnt, cnt2.
(handle_verdef): Likewise.

Signed-off-by: Mark Wielaard 
---
 src/readelf.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/readelf.c b/src/readelf.c
index 802f8ede..0e931184 100644
--- a/src/readelf.c
+++ b/src/readelf.c
@@ -3159,7 +3159,7 @@ handle_verneed (Ebl *ebl, Elf_Scn *scn, GElf_Shdr *shdr)
  elf_strptr (ebl->elf, shstrndx, glink->sh_name));
 
   unsigned int offset = 0;
-  for (int cnt = shdr->sh_info; --cnt >= 0; )
+  for (unsigned int cnt = shdr->sh_info; cnt > 0; cnt--)
 {
   /* Get the data at the next offset.  */
   GElf_Verneed needmem;
@@ -3173,7 +3173,7 @@ handle_verneed (Ebl *ebl, Elf_Scn *scn, GElf_Shdr *shdr)
  (unsigned short int) need->vn_cnt);
 
   unsigned int auxoffset = offset + need->vn_aux;
-  for (int cnt2 = need->vn_cnt; --cnt2 >= 0; )
+  for (unsigned int cnt2 = need->vn_cnt; cnt2 > 0; cnt2--)
{
  GElf_Vernaux auxmem;
  GElf_Vernaux *aux = gelf_getvernaux (data, auxoffset, &auxmem);
@@ -3236,7 +3236,7 @@ handle_verdef (Ebl *ebl, Elf_Scn *scn, GElf_Shdr *shdr)
  elf_strptr (ebl->elf, shstrndx, glink->sh_name));
 
   unsigned int offset = 0;
-  for (int cnt = shdr->sh_info; --cnt >= 0; )
+  for (unsigned int cnt = shdr->sh_info; cnt > 0; cnt--)
 {
   /* Get the data at the next offset.  */
   GElf_Verdef defmem;
@@ -3259,7 +3259,7 @@ handle_verdef (Ebl *ebl, Elf_Scn *scn, GElf_Shdr *shdr)
  elf_strptr (ebl->elf, shdr->sh_link, aux->vda_name));
 
   auxoffset += aux->vda_next;
-  for (int cnt2 = 1; cnt2 < def->vd_cnt; ++cnt2)
+  for (unsigned int cnt2 = 1; cnt2 < def->vd_cnt; ++cnt2)
{
  aux = gelf_getverdaux (data, auxoffset, &auxmem);
  if (unlikely (aux == NULL))
-- 
2.39.3



Re: [PATCH v2] dwarf_getaranges: Build aranges list from CUs instead of .debug_aranges

2024-02-21 Thread Aaron Merey
Hi Mark,

On Tue, Feb 20, 2024 at 5:23 PM Mark Wielaard  wrote:
>
> > As for the number of aranges found, there is a difference for libxul.so:
> > 250435 with the patch compared to 254832 without.  So 4397 fewer aranges
> > are found when using the new CU iteration method.  I'll dig into this and
> > see if there is a problem or if it's just due to some redundancy in
> > libxul's .debug_aranges.  FWIW there was no change to the aranges counts
> > for the other modules searched during this eu-stack firefox corefile test.
>
> A quick way to see where the differences are is using
> eu-readelf --debug-dump=decodedaranges before/after your patch.
>
> This is opposite to what I expected. I had expected there to be more,
> instead of less ranges. The difference is less than 2%. But still
> interesting to know what/why.
>
> Were there any differences in the backtraces? If not, then those
> ranges might not actually have been mapping to code.

The backtraces were identical.  I took a closer look at this and the
difference is due to clang including DW_AT_location addresses in
.debug_aranges.  This patch just looks for DW_AT_{high,low}_pc and
DW_AT_ranges (via dwarf_ranges) when generating the aranges list,
so these DW_AT_location addresses aren't included.

According to David Blaikie [1], "GCC does not include globals in
debug_aranges, so they probably can't be relied upon unfortunately
(unfortunate that Clang pays the cost when it emits them, but consumers
can't rely on them)".

Since consumers already can't assume that .debug_aranges includes
DW_AT_location addresses, I think it's reasonable for us to not include
them when dynamically generating aranges.  Especially since there could
be a noticeable performance cost to doing so.

A separate issue I saw is that some entries in .debug_aranges with
a decoded start address of "00" wouldn't always
have matching entries in .debug_ranges.  This resulted in some dynamic
aranges not matching their corresponding .debug_aranges entry.

For example, the following decoded arange is in libxul's .debug_aranges:

start: 00, length:13, CU DIE offset: 95189496

In the .debug_info for this CU, there is a corresponding subprogram with
low_pc 0 and high_pc 13.

However in the .debug_ranges entry for this CU, there is no range starting
at address 0 with size of at least 13.  The closest range list entry is:

range 1, 1
+0x0001 <__ehdr_start+0x1>..
+00 

When we dynamically generate aranges this results in the following
decoded arange:

start: 0x0001, length: 0, CU DIE offset: 95189496

So the start address is off by 1 and the length doesn't match the
.debug_aranges entry.  In some similar cases there wouldn't even be
a corresponding "range 1, 1" entry in .debug_ranges.  I'm not yet sure
what's going on here.

>
> > > Might it be an idea to leave dwarf_getaranges as it is and introduce a
> > > new (internal) function to get "dynamic" ranges? It looks like what
> > > programs (like eu-stack and eu-addr2line) really use is dwarf_addrdie
> > > and dwfl_module_addrdie. These are currently build on dwarf_getaranges,
> > > but could maybe use a new interface?
> >
> > IMO this depends on what users expect from dwarf_getaranges.  Do they
> > want the exact contents of .debug_aranges (whether or not it's complete)
> > or should dwarf_getaranges go beyond .debug_aranges to ensure the most
> > complete results?
> >
> > The comment for dwarf_getaranges in libdw.h simply reads "Return list
> > address ranges".  Since there's no mention of .debug_aranges specifically,
> > I think it's fair if dwarf_getaranges does whatever it can to ensure
> > comprehensive results.  In which case dwarf_getaranges should probably
> > dynamically generate aranges.
>
> You might be right that no user really cares. But as seen in the
> eu-readelf code, it might also be that people expected it to map to
> the ranges from .debug_aranges.
>
> So I would be happier if we just kept the dwarf_getaranges code as
> is. And just change the code in dwarf_addrdie and dwfl_module_addrdie.
>
> We could then also introduce a new public function, dwarf_getdieranges
> (?) that does the new thing. But it doesn't have to be public on the
> first try as long as dwarf_addrdie and dwfl_module_addrdie work. (We
> might want to change the interface of dwarf_getdieranges so it can be
> "lazy" for example.)

Ok this approach seems like the most flexible.  Users can have both
.debug_aranges and CU-based aranges plus we don't have to change the
semantics of dwarf_getaranges.  I'll submit a revised patch for this.

Aaron

[1] https://reviews.llvm.org/D123538#inline-1188522