Missing .gnu_debugdata section on ARM platform when libdw is used by systemd-coredump

2021-05-04 Thread Tino Mettler via Elfutils-devel
Hi,

I have a system running on 2 different architectures: AMD64 and ARM. When a 
coredump happens, I want systemd-coredump to generate a stack trace of the 
crashing application. Systemd depends on elfutils for this feature. I use 
binaries with minidebuginfo (the LZMA compressed symbol table in the 
.gnu_debugdata section) on both architectures.

I get a stack trace on AMD64, but not on ARM. While debugging this, I saw that 
find_aux_sym() from dwfl_module_getdwarf.c does not find a .gnu_debugdata 
section when iterating through the ELF using elf_nextscn().

However, it finds a .gnu_debuglink section. I inspected the executable and 
verified that it contains a .gnu_debugdata section and it looks fine. 
Interestingly, there is no .gnu_debuglink section in the executable despite 
elf_nextscn() seems to find one.

It looks like libdw does not process the actual executable, but a modified 
variant, and the .gnu_debugdata section gets lost at some point on my ARM 
device. Can you give me a hint where I need to look at? Both devices run a 
different kernel with a different configuration. Could that be related?

I also tried gdb using the coredump file from systemd-coredump and the same 
executable, and a stack trace worked as expected.

Regards,
Tino



Re: Storing package metadata in ELF objects

2021-05-04 Thread Luca Boccassi
On Fri, 2021-04-30 at 19:57 +0200, Mark Wielaard wrote:
> Hi,
> 
> On Sat, 2021-04-10 at 18:44 +, Zbigniew Jędrzejewski-Szmek wrote:
> > [I'm forwarding the mail from Luca who is not subscribed to fedora-
> > devel]
> > On Sat, Apr 10, 2021 at 01:38:31PM +0100, Luca Boccassi wrote:
> > Cross-posting to the mailing lists of a few relevant projects.
> 
> Note that in this version of the email the [N] references in your email
> don't seem to point anywhere. I found an older variant of the same
> email which contained:
> 
> [0] https://github.com/systemd/systemd/issues/18433
> [1] https://systemd.io/COREDUMP_PACKAGE_METADATA/
> [2] https://fedoraproject.org/wiki/Changes/Package_information_on_ELF_objects
> [3] https://github.com/systemd/package-notes

Sorry about that! Must have messed up the copy&paste.

> > After an initial discussion [0], recently we have been working on a new
> > specification [0] to encode rich package-level metadata inside ELF
> > objects, so that it can be included automatically in generated coredump
> > files. The prototype to parse this in systemd-coredump and store the
> > information in systemd-journal is ready for testing and merged
> > upstream. We are now seeking further comments/opinions/suggestions, as
> > we have a few months before the next release and thus there's plenty of
> > time to make incompatible changes to the format and implementation, if
> > required.
> > 
> > A proposal to use this by default for all packages built in Fedora 35
> > has been submitted [1].
> > 
> > The Fedora Wiki and the systemd.io document have more details, but to
> > make a long story short, a new .notes.package section with a JSON
> > payload will be included in ELF objects, encoding various package-
> > build-time information like distro name&version, package
> > name&version,
> > etc.
> 
> Is there a list of default keys (and their canonical spelling, upper-
> lower-Camel_Case, etc.)? If there is, could we have a "debuginfod" key
> with as value an URL pointing to the debuginfod server URL where the
> embedded build-id executable, debuginfo and sources can be found?
> https://sourceware.org/elfutils/Debuginfod.html

The "Implementation" section of the spec lists the "main" fields:

https://systemd.io/COREDUMP_PACKAGE_METADATA/

(source for that is 
https://github.com/systemd/systemd/blob/main/docs/COREDUMP_PACKAGE_METADATA.md )

Would you like to send a PR to update it and add that field?

> > To summarize from the discussion, the main reasons why we believe this
> > is useful are as following:
> > 
> > 1) minimal containers: the rpm database is not installed in the
> > containers. The information about build-ids needs to be stored
> > externally, so package name information is not available immediately,
> > but only after offline processing. The new note doesn't depend on the
> > rpm db in any way.
> > 
> > 2) handling of a core from a container, where the container and host
> > have different distros
> > 
> > 3) self-built and external packages: unless a lot of care is taken to
> > keep access to the debuginfo packages, this information may be lost.
> > The new note is available even if the repository metadata gets lost.
> > Users can easily provide equivalent information in a format that makes
> > sense in their own environment. It should work even when rpms and debs
> > and other formats are mixed, e.g. during container image creation.
> > 
> > Other than in Fedora, we are already making the required code changes
> > at Microsoft to use the same format&specification for internally-built
> > binaries, and for tools that parse core files and logs.
> > 
> > Tools for RPM and DEB (debhelper) integration are also available [3].
> > 
> > > -- 
> > > Kind regards,
> > > Luca Boccassi

-- 
Kind regards,
Luca Boccassi


signature.asc
Description: This is a digitally signed message part


Windows build of libelf (or more)?

2021-05-04 Thread Stephan Tobies via Elfutils-devel
To refresh an existing library based on the old libelf code (mr511), I am 
looking for a way to compile (or cross-compile) the libelf of the elfutils for 
Windows. If have been searching the mailing list archive but have not found any 
pointers in that direction.

Is anybody aware of a way to compile libelf into a standalone Windows DLL?

Any pointers would be appreciated!

Thanks and best regards

Stephan Tobies

Dr. Stephan Tobies | Principal Architect | Verification Group
Tel: +49 (241) 479 - 67116 | Cell: +49 (151) 550-66974
Email: tob...@synopsys.com| Web: 
www.Synopsys.com
Sitz der Gesellschaft: Aschheim, Landkreis München - Rechtsform: GmbH - 
Amtsgericht München HRB 90949  - VAT/ USt-ld Nr. DE129470370
Geschäftsführung:  Gregor Wiethaler, Prof. Dr. Andreas Hoffmann, Orla Anne 
Murphy, Luisa Diogo



[PATCH] debuginfod: debuginfod client would cache negative results.

2021-05-04 Thread Alice Zhang via Elfutils-devel
add debuginfod_config_cache for reading and writing to cache
configuration files, make use of the function within
debuginfod_clean_cache and debuginfod_query_server.

in debuginfod_query_server, create 000-permission file on failed
queries. Before querying each BUILDID, if corresponding 000 file
detected, compare its stat mtime with parameter from
.cache/cache_miss_s. if mtime is fresher, then return ENOENT and
exit; otherwise unlink the 000 file and proceed to a new query.

tests: add test in run-debuginfod-find.sh

test if the 000 file is created on failed query; if querying the
same failed BUILDID, whether the query should proceed without
going through server; set the cache_miss_s to 0 and query the same
buildid, and this time should go through the server.

Signed-off-by: Alice Zhang 
---
 ChangeLog  |   4 ++
 NEWS   |   5 ++
 debuginfod/ChangeLog   |   4 ++
 debuginfod/debuginfod-client.c | 118 ++---
 tests/ChangeLog|   4 ++
 tests/run-debuginfod-find.sh   |  37 +++
 6 files changed, 134 insertions(+), 38 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index e18746fb..3c6f5cc0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2021-04-28  Alice Zhang  
+* debuginfod/debuginfod-client.c: Make client able to cache negative
+   results.
+   * tests/run-debuginfod-find.sh: Added tests for the feature.
 2021-03-30  Frank Ch. Eigler  
 
* configure.ac: Look for pthread_setname_np.
diff --git a/NEWS b/NEWS
index 038c6019..91c65160 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ Version 0.184
 
 debuginfod: Use libarchive's bsdtar as the .deb-family file unpacker.
 
+debuginfod-client: Now client would cache negative results. If a query
+   for a file failed with 404, an empty 000 permission 
   
+   file is created, and this would save time by allowing
+   skipping queries that are likely to fail.
+
 Version 0.183
 
 debuginfod: New thread-busy metric and more detailed error metrics.
diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index ed2f77cf..520e66f9 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,7 @@
+2021-05-04  Alice Zhang 
+
+   * debuginfod-client.c: Make client able to cache negative results.
+
 2021-04-15  Frank Ch. Eigler 
 
* debuginfod.cxx (groom): Only update database stats once.
diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index d5e7bbdf..882e0386 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -131,6 +131,11 @@ struct debuginfod_client
 static const char *cache_clean_interval_filename = "cache_clean_interval_s";
 static const time_t cache_clean_default_interval_s = 86400; /* 1 day */
 
+/* The cache_miss_default_s within the debuginfod cache specifies how 
frequently
+   should the 000-permision file should be released.*/
+static const time_t cache_miss_default_s = 600; /* 10 min */
+static const char *cache_miss_filename = "cache_miss_s";
+
 /* The cache_max_unused_age_s file within the debuginfod cache specifies the
the maximum time since last access that a file will remain in the cache.  */
 static const char *cache_max_unused_age_filename = "max_unused_age_s";
@@ -206,6 +211,38 @@ debuginfod_write_callback (char *ptr, size_t size, size_t 
nmemb, void *data)
   return (size_t) write(d->fd, (void*)ptr, count);
 }
 
+/* handle config file read and write */
+static int
+debuginfod_config_cache(char *config_path,
+   long cache_config_default_s,
+  struct stat *st)
+{
+  int fd;
+  /* if the config file doesn't exist, create one with DEFFILEMODE*/
+  if(stat(config_path, st) == -1)
+{
+  fd = open(config_path, O_CREAT | O_RDWR, DEFFILEMODE);
+  if (fd < 0)
+return -errno;
+
+  if (dprintf(fd, "%ld", cache_config_default_s) < 0)
+return -errno;
+}
+
+  long cache_config;
+  FILE *config_file = fopen(config_path, "r");
+  if (config_file)
+{
+  if (fscanf(config_file, "%ld", &cache_config) != 1)
+cache_config = cache_config_default_s;
+  fclose(config_file);
+}
+  else
+cache_config = cache_config_default_s;
+
+  return cache_config;
+}
+
 /* Create the cache and interval file if they do not already exist.
Return 0 if cache and config file are initialized, otherwise return
the appropriate error code.  */
@@ -251,52 +288,26 @@ debuginfod_clean_cache(debuginfod_client *c,
   char *cache_path, char *interval_path,
   char *max_unused_path)
 {
+  time_t clean_interval, max_unused_age;
+  int rc = -1;
   struct stat st;
-  FILE *interval_file;
-  FILE *max_unused_file;
-
-  if (stat(interval_path, &st) == -1)
-{
-  /* Create new interval file.  */
-  interval_file = fopen(interval_path, "w");
 
-  if (interval_file == NULL)
- 

Re: [PATCH] libdw: handle DW_FORM_indirect when reading attributes

2021-05-04 Thread Omar Sandoval
On Sat, May 01, 2021 at 05:59:31PM +0200, Mark Wielaard wrote:
> Hi Omar,
> 
> On Fri, 2021-04-23 at 16:36 -0700, Omar Sandoval wrote:
> > Whenever we encounter an attribute with DW_FORM_indirect, we need to
> > read its true form from the DIE data. Then, we can continue normally.
> > This adds support to the most obvious places: __libdw_find_attr() and
> > dwarf_getattrs().
> 
> Very nice. I wasn't aware anybody actually used DW_FORM_indirect.
> 
> In your patch you only handle one indirection. I think that in theory a
> DW_FORM_indirect could be an DW_FORM_indirect itself (but now in the
> DIE instead of the Abbrev). But this is probably silly, so not
> supporting that seems correct. Same for an indirect
> DW_FORM_implicit_const. It doesn't really make sense to do that because
> if you wanted to put the value in the DIE instead of an Abbrev you
> would have used an actual FORM that did that.
> 
> > There may be more places that need to be updated.
> 
> There is __libdw_form_val_compute_len which already handles
> DW_FORM_indirect:
> 
> case DW_FORM_indirect:
>   get_uleb128 (u128, valp, endp);
>   // XXX Is this really correct?
>   result = __libdw_form_val_len (cu, u128, valp);
>   if (result != (size_t) -1)
> result += valp - startp;
>   else
> return (size_t) -1;
>   break;
> 
> I believe the XXX question can be answered with: Yes, the result is the
> size of the actual FORM data plus the size of the uleb128 encoding that
> FORM (which is valp - startp). And it probably should check like your
> code does that valp != DW_FORM_indirect && valp !=
> DW_FORM_implicit_const. I'll sent a patch to do that.
> 
> But, now that dwarf_child and dwarf_getattrs handle DW_FORM_indirectly
> directly (haha, pun intended) I think __libdw_form_val_compute_len is
> never called for DW_FORM_indirect anymore.
> 
> There is dwarf_hasattr, but that doesn't care about the value, just the
> attribute name, so it doesn't have to follow any DW_FORM_indirect.
> 
> And there are the "raw" dwarf_getabbrev functions, but those aren't
> associated with any DIE/.debug_info data, so leaving DW_FORM_indirect
> as is in the Dwarf_Abbrev for those seems correct.
> 
> > I encountered this when inspecting a file that was processed by our BOLT
> > tool: https://github.com/facebookincubator/BOLT. This also adds a couple
> > of test cases using a file generated by that tool.
> 
> Thanks for that test case. It really helps seeing an actual example of
> indirection. I note that all Abbrevs/DIEs using DW_FORM_indirect are
> for DW_AT_low_pc and that the indirect DW_AT_addr are all the zero
> address. Is that intended?

>From manually inspecting the debug information, I believe this is
correct. Looking into it more, BOLT uses DW_FORM_indirect for silly
reasons. Something to do with using it to pad out values in place so
that .debug_abbrev and .debug_info don't need to be rewritten entirely
in the patched binary.

> Pushed you patch.

Thank you!


Re: [PATCH] libdw: handle DW_FORM_indirect when reading attributes

2021-05-04 Thread Omar Sandoval
On Sat, May 01, 2021 at 06:03:37PM +0200, Mark Wielaard wrote:
> Hi,
> 
> On Sat, 2021-05-01 at 17:59 +0200, Mark Wielaard wrote:
> > There is __libdw_form_val_compute_len which already handles
> > DW_FORM_indirect:
> > 
> > case DW_FORM_indirect:
> >   get_uleb128 (u128, valp, endp);
> >   // XXX Is this really correct?
> >   result = __libdw_form_val_len (cu, u128, valp);
> >   if (result != (size_t) -1)
> > result += valp - startp;
> >   else
> > return (size_t) -1;
> >   break;
> > 
> > I believe the XXX question can be answered with: Yes, the result is the
> > size of the actual FORM data plus the size of the uleb128 encoding that
> > FORM (which is valp - startp). And it probably should check like your
> > code does that valp != DW_FORM_indirect && valp !=
> > DW_FORM_implicit_const. I'll sent a patch to do that.
> 
> Patch attached.

The patch looks reasonable to me, Mark, thanks.