buildbot status

2021-09-10 Thread Mark Wielaard
Hi,

Since the splitting of the mega debuginfod tests into separate
testcases we have seen some instability of the buildbot. I believe that
is mostly solved now.

I do sometimes see one of the run-debuginfod-federation-
{link,metric,sqlite}.sh tests fail when they are testing propagating of
a client request between servers when the request is for some symlinked
debuginfo. But when I showed that to Frank he immediately said that the
shell logic for that test was bogus, so I am hoping that when he cleans
that up that also gets rid of the false positives. At least we should
now get better error reports when something fails (previously a runtest
that failed, but not a normal shell command, would simply cleanup
instead of spitting out the metrics and server logs).

There also seems to be a spurious fail of the native core backtrace
test on centos-x86_64. I looked at it, but cannot really figure it out.
When it fails we believe to be in __clone inside libc.so.6 but at a pc
that doesn't have cfi, so we cannot unwind. Maybe this is simply a bug
in either the core file writing under centos7 or glibc really is
missing some cfi for clone. But that doesn't explain why it is non-
deterministic.

Finally things took a really long time because most buildbot workers
were doing a full distcheck, which involves a normal build, a normal
make check, generating a dist tar ball, unpacking it again, valgrind,
the gcc undefined   sanitizer, a srcdir == builddir, a srcdir !=
builddir, an install check and another build check. On some builders
this took more than an hour. And for some reason the s390x builder was
so slow that it took 6 hours (!) for each commit.

So I have reconfigured the builders a bit:

- Only elfutils-fedora-x86_64 does a full distcheck
- elfutils-centos-x86_64, elfutils-debian-amd64, elfutils-debian-i386,
  elfutils-fedora-ppc64le and elfutils-fedora-s390x
  configure --enable-valgrind --enable-sanitize-undefined and do a
  make && make check
- elfutils-debian-arm64
  configure --enable-valgrind && make && make check
- elfutils-fedora-ppc64 and elfutils-debian-armhf
  configure without valgrind and sanitizer and simply
  make && make check

Hopefully that gives us buildbot results a bit quicker.

Cheers,

Mark


[PATCH] PR27940 - The /* pc=0x... */ is no longer printed by "stap -v -L 'kernel.function("*")'

2021-09-10 Thread Di Chen via Elfutils-devel
>From a98718a1b97357e53cef966ed9826c69e159f46e Mon Sep 17 00:00:00 2001
From: Di Chen 
Date: Thu, 2 Sep 2021 12:52:47 +0800
Subject: [PATCH] The /* pc=0x... */ is no longer printed by "stap -v -L
 'kernel.function("*")'

The disappeared /* pc=0x... */ resulted from the missing implementation
of the function "dwarf_derived_probe::printsig_nonest".
Which makes "p->printsig_nonest(sig)" in main.cxx end up calling
"derived_probe::printsig_nonest", and the type of "p" is

(gdb) ptype /m p
type = /* real type = dwarf_derived_probe * */

This patch added "dwarf_derived_probe::printsig_nonest" for PC value
print.

https://sourceware.org/bugzilla/show_bug.cgi?id=27940

Signed-off-by: Di Chen 
---
 elaborate.h |  2 +-
 tapsets.cxx | 11 +++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/elaborate.h b/elaborate.h
index 9817172a7..baa29e67c 100644
--- a/elaborate.h
+++ b/elaborate.h
@@ -204,7 +204,7 @@ struct derived_probe: public probe
   virtual probe_point* sole_location () const;
   virtual probe_point* script_location () const;
   virtual void printsig (std::ostream &o) const cxx_override;
-  void printsig_nonest (std::ostream &o) const;
+  virtual void printsig_nonest (std::ostream &o) const;
   // return arguments of probe if there
   virtual void getargs (std::list &) const {}
   void printsig_nested (std::ostream &o) const;
diff --git a/tapsets.cxx b/tapsets.cxx
index 68b75610e..27d8995ce 100644
--- a/tapsets.cxx
+++ b/tapsets.cxx
@@ -560,6 +560,7 @@ struct dwarf_derived_probe: public
generic_kprobe_derived_probe
   bool access_vars;

   void printsig (std::ostream &o) const;
+  void printsig_nonest (std::ostream &o) const;
   virtual void join_group (systemtap_session& s);
   void emit_probe_local_init(systemtap_session& s, translator_output * o);
   void getargs(std::list &arg_set) const;
@@ -5386,6 +5387,16 @@ dwarf_derived_probe::printsig (ostream& o) const
 }


+void
+dwarf_derived_probe::printsig_nonest (ostream& o) const
+{
+  sole_location()->print (o);
+  if (symbol_name != "")
+o << " /* pc=<" << symbol_name << "+" << offset << "> */";
+  else
+o << " /* pc=" << section << "+0x" << hex << addr << dec << " */";
+}
+

 void
 dwarf_derived_probe::join_group (systemtap_session& s)
-- 
2.31.1


[PATCH] read inlining info in an NVIDIA extended line map (was: Extension ...)

2021-09-10 Thread John Mellor-Crummey via Elfutils-devel
My previous patch submission seems to have been overlooked as buildbot issues 
consumed several days this week. However, discussion in the mailing list now 
seems to have moved on beyond my submission and I would like my patch 
considered. Here, I echo my previous submission, except I improved my 
submission by including the prefix [PATCH] in the subject and I included the 
patch in the message body rather than as an attachment.

Regarding the DWARF proposal by Cary Coutant for two-level linemaps: I now 
believe that NVIDIA’s implementation is consistent with that proposal although 
NVIDIA uses a DWARF extended opcode for inlined calls whereas Cary’s proposal 
uses DW_LNS_inlined_call (a standard opcode), NVIDIA’s implementation uses 
DW_LNE_inlined_call (an extended opcode).

A note about the source code for the test case reading the extended linemap 
entries using libdw: this code was copied from another test that used 
dwarf_next_lines and extended with code that reads the new context and 
functionname fields of a line table entry.
--
John Mellor-Crummey Professor
Dept of Computer ScienceRice University
email: joh...@rice.edu  phone: 713-348-5179

> On Sep 5, 2021, at 7:07 PM, John Mellor-Crummey  wrote:
> 
> As of CUDA 11.2, NVIDIA added extensions to the line map section
> of CUDA binaries to represent inlined functions. These extensions
> include
> 
> - two new fields in a line table row to represent inline 
>   information: context, and functionname,
> 
> - two new DWARF extended opcodes: DW_LNE_inlined_call, 
>   DW_LNE_set_function_name,
> 
> - an additional word in the line table header that indicates 
>   the offset in the .debug_str function where the function 
>   names for this line table begin, and
> 
> - two new functions in the libdw API: dwarf_linecontext and 
>   dwarf_linefunctionname, which return the new line table fields.
> 
> A line table row for an inlined function contains a non-zero
> "context" value. The “context” field indicates the index of the
> line table row that serves as the call site for an inlined
> context.
> 
> The "functionname" field in a line table row is only meaningful
> if the "context" field of the row is non-zero. A meaningful
> "functionname" field contains an index into the .debug_str
> section relative to the base offset established in the line table
> header; the position in the .debug_str section indicates the name
> of the inlined function.
> 
> These extensions resemble the proposed DWARF extensions
> (http://dwarfstd.org/ShowIssue.php?issue=140906.1) by Cary
> Coutant, but are not identical.
> 
> This patch adds integrates support for handling NVIDIA's extended
> line maps into elfutil's libdw library and the readelf command
> line utility.
> 
> Since this support is a non-standard extension to DWARF, all code
> that implements the extensions is implemented between markers  
> /* Begin NVIDIA_LINEMAP_INLINING_EXTENSIONS */ and 
> /* End NVIDIA_LINEMAP_INLINING_EXTENSIONS */.
> 
> The definition below
> 
> #define NVIDIA_LINEMAP_INLINING_EXTENSIONS 1
> 
> is added to elfutils/version.h, which enables a client of elfutils 
> to test whether the NVIDIA line map extensions are present. 
> 
> Note: The support for NVIDIA extended line maps adds two integer
> fields (context and functionname) to struct Dwarf_Line_s, which
> makes the structure about 30% larger.
> 
> The patch includes a binary testfile_nvidia_linemap.bz2 that
> contains an NVIDIA extended linemap along with two tests that
> read the line map.
> 
> - A test script run-nvidia-extended-linemap-readelf.sh 
>   checks the output of readelf on the new test binary to 
>   validate its dump of the line op codes containing context 
>   and functionname entries.
> 
> - A test program tests/nvidia_extended_linemap_libdw.c reads 
>   the extended line map with dwarf_next_lines and dumps the 
>   context and functionname fields of the line map where they 
>   are relevant, i.e. the value of context is non-zero. A test 
>   script run-nvidia-extended-linemap-libdw.sh runs this test 
>   and validates its output.
> 
> A patch with the new functionality described above is attached.
> --
> John Mellor-Crummey   Professor
> Dept of Computer Science  Rice University
> email: joh...@rice.eduphone: 713-348-5179

From 32e6b3530677bcbb595ba9010d49feef03314bd4 Mon Sep 17 00:00:00 2001
From: John M Mellor-Crummey 
Date: Fri, 3 Sep 2021 16:13:40 -0500
Subject: [PATCH] Read inlining info in NVIDIA extended line map

Signed-off-by: John M Mellor-Crummey 
---
 ChangeLog|   4 +
 config/version.h.in  |   4 +
 libdw/Makefile.am|   1 +
 libdw/dwarf.h|   4 +
 libdw/dwarf_getsrclines.c|  52 ++-
 libdw/dwarf_linecontext.c|  47 ++
 libdw/dwarf_linefunctionname.c   |  47 ++
 libdw/libdw.h 

[Bug debuginfod/28242] extend server http-response metrics with result code

2021-09-10 Thread dichen at redhat dot com via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=28242

Di Chen  changed:

   What|Removed |Added

 CC||dichen at redhat dot com
   Assignee|unassigned at sourceware dot org   |dichen at redhat dot com

-- 
You are receiving this mail because:
You are on the CC list for the bug.

Re: [PATCH] read inlining info in an NVIDIA extended line map (was: Extension ...)

2021-09-10 Thread Mark Wielaard
Hi John,

On Fri, 2021-09-10 at 10:49 -0500, John Mellor-Crummey via Elfutils-
devel wrote:
> My previous patch submission seems to have been overlooked as
> buildbot issues consumed several days this week. However, discussion
> in the mailing list now seems to have moved on beyond my submission
> and I would like my patch considered. Here, I echo my previous
> submission, except I improved my submission by including the prefix
> [PATCH] in the subject and I included the patch in the message body
> rather than as an attachment.

Sorry for the buildbot noise, that was indeed a little painful. But a
buildbot that randomly fails is not much fun, so it took highest
priority to get it back to green.

Your submission is really nice, having extensive documentation, all
features implemented, a testcase. Well done.

There are however some concerns outside your control. It is somewhat
disappointing NVIDIA didn't document this themselves, or tried to
standardize this. You seems to have a good grasp of what was intended
though. We have to be careful not to extend the api in a way that makes
a better standard/implementation impossible. And the way we implemented
Dwarf_Lines isn't ideal, so this extension shows we should maybe change
it to be more efficient/compact. But maybe we can do that after adding
the extension, we should however have a plan.

> Regarding the DWARF proposal by Cary Coutant for two-level linemaps:
> I now believe that NVIDIA’s implementation is consistent with that
> proposal although NVIDIA uses a DWARF extended opcode for inlined
> calls whereas Cary’s proposal uses DW_LNS_inlined_call (a standard
> opcode), NVIDIA’s implementation uses DW_LNE_inlined_call (an
> extended opcode).

The naming is one of the concerns. It would be better to use a name
like DW_LNE_NVIDIA_inlined_call and DW_LNE_NVIDIA_set_function_name to
show they are vendor extensions and don't clash with possible future
standard opcode names.

That it mimics the two-level linemaps proposal is a good thing. But
lets make sure that the new accessor functions, dwarf_linecontext and
dwarf_linefunctionname, are generic enough that they can hopefully be
reused when two-level linemaps or a similar proposal becomes a
standard.

> A note about the source code for the test case reading the extended
> linemap entries using libdw: this code was copied from another test
> that used dwarf_next_lines and extended with code that reads the new
> context and functionname fields of a line table entry.

Thanks for the test case, it makes clear how the new functionality can
be used. How was the test binary, testfile_nvidia_linemap, generated?
That should probably be documented inside the testcase.

I won't be able to review all the code right now, but here are some
high-level comments, so you know what I am thinking.

On Sep 5, 2021, at 7:07 PM, John Mellor-Crummey 
> > wrote:
> > 
> > As of CUDA 11.2, NVIDIA added extensions to the line map section
> > of CUDA binaries to represent inlined functions. These extensions
> > include
> > 
> > - two new fields in a line table row to represent inline 
> >   information: context, and functionname,

We didn't design the Dwarf_Line_s very well/compact. We already have
the op_index, isa and discriminator fields which are almost never used.
This adds two more. I wonder if we can split the struct up so that the
extra fields are only used when actually used.

Maybe this can be done with a flag in Dwarf_Lines_s indicating whether
the array contains only the basic line attributes or also some extended
values. Of course this makes dwarf_getsrclines even more complex
because it then has to expand the line state struct whenever it first
sees an extended attribute. But I really like to see if we can use less
memory here. If we agree on some way to make that possible we can
implement it afterwards.

> > - two new DWARF extended opcodes: DW_LNE_inlined_call, 
> >   DW_LNE_set_function_name,

Like I said above I think these names should contain the "vendor" name
DW_LNE_NVIDIA_...

> > - an additional word in the line table header that indicates 
> >   the offset in the .debug_str function where the function 
> >   names for this line table begin, and

This is the only part I think is somewhat tricky. I believe you
implement it cleverly by checking the header_length. And your
implementation should work, but it is also the part that no other tool
understands, which means any such binary cannot really be handled by
any other tool. And I don't really understand how this works when
linking objects with such a offset into .debug_str. Normally the
.debug_str section contains strings that can be merged, but that would
disrupt the order, so I don't fully understand how the function_name
indexes are kept correct. This would have been nicer as a DWARF5
extension, where there is already a .debug_str_offsets section (but
also confusing because there is also a special .debug_line_str section
where these strings should probably point at).

[PATCH] lib: Make error.c more like error(3)

2021-09-10 Thread Colin Cross via Elfutils-devel
Fix some issues with the error reimplementation to make it match
the specification for error(3).

Flush stdout before printing to stderr.  Also flush stderr afterwards,
which is not specified in the man page for error(3), but is what
bionic does.

error(3) prints strerror(errnum) if and only if errnum is nonzero,
but verr prints strerror(errno) unconditionaly.  When errnum is nonzero
copy it to errno and use verr, and when it is not set use verrx that
doesn't print errno.

error(3) only exits if status is nonzero, but verr exits uncondtionally.
Use vwarn/vwarnx when status is zero, which don't exit.

Signed-off-by: Colin Cross 
---
 lib/error.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/lib/error.c b/lib/error.c
index 75e964fd..dba046ef 100644
--- a/lib/error.c
+++ b/lib/error.c
@@ -29,7 +29,9 @@
 #include 
 
 #if !defined(HAVE_ERROR_H) && defined(HAVE_ERR_H)
+#include 
 #include 
+#include 
 #include 
 #include 
 
@@ -37,13 +39,32 @@ unsigned int error_message_count = 0;
 
 void error(int status, int errnum, const char *format, ...) {
   va_list argp;
+  int saved_errno = errno;
+
+  fflush(stdout);
 
   va_start(argp, format);
-  verr(status, format, argp);
+  if (status) {
+if (errnum) {
+  errno = errnum;
+  verr(status, format, argp);
+} else {
+  verrx(status, format, argp);
+}
+  } else {
+if (errnum) {
+  errno = errnum;
+  vwarn(format, argp);
+} else {
+  vwarnx(format, argp);
+}
+  }
   va_end(argp);
 
-  if (status)
-exit(status);
+  fflush(stderr);
+
   ++error_message_count;
+
+  errno = saved_errno;
 }
 #endif
-- 
2.33.0.309.g3052b89438-goog



Re: [Bug debuginfod/27277] Describe retrieved files when verbose

2021-09-10 Thread Noah Sanci via Elfutils-devel
Hello,

The updated patch is attached.

On Wed, Sep 8, 2021 at 4:56 PM Mark Wielaard  wrote:
> On Fri, Aug 27, 2021 at 02:38:27PM -0400, Noah Sanci via Elfutils-devel wrote:
> > From ed7638571f188e346dd466c195b9ebda028d1c65 Mon Sep 17 00:00:00 2001
> > From: Noah Sanci 
> > Date: Wed, 28 Jul 2021 14:46:05 -0400
> > Subject: [PATCH] debuginfod: PR27277 - Describe retrieved files when verbose
> >
> > There appear to exist use cases that intend to simply check for the
> > existence of content in a debuginfod server, without actually
> > downloading it.  In HTTP land, the HEAD operation is the natural
> > expression of this.
> > Instead of implementing a HEAD/describe option, allow users, with enough
> > verbosity, to print the HTTP response headers upon retrieving a file.
>
> Same comment as earlier, but now please also state the debuginfod
> changes itself (the extra headers added).
Should be good.

> > E.g output:
> >
> > HTTP/1.1 200 OK
> > Connection: Keep-Alive
> > Content-Length: 2428240
> > Cache-Control: public
> > Last-Modified: Sat, 15 May 2021 20:49:51 GMT
> > X-FILE: "file name"
> > X-FILE-SIZE: "byte length of file"
>
> I think an actual output example is better than including "place
> holder text".
Fixed

> > diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx
> > index 6e182a84..e9b7e76c 100644
> > --- a/debuginfod/debuginfod.cxx
> > +++ b/debuginfod/debuginfod.cxx
> > @@ -1068,6 +1068,9 @@ handle_buildid_f_match (bool internal_req_t,
> >else
> >  {
> >MHD_add_response_header (r, "Content-Type", 
> > "application/octet-stream");
> > +  std::string file = b_source0.substr(b_source0.find_last_of("/")+1, 
> > b_source0.length());
> > +  MHD_add_response_header (r, "X-FILE-SIZE", 
> > to_string(s.st_size).c_str() );
> > +  MHD_add_response_header (r, "X-FILE", file.c_str() );
> >add_mhd_last_modified (r, s.st_mtime);
> >if (verbose > 1)
> >  obatched(clog) << "serving file " << b_source0 << endl;
> > @@ -1537,6 +1540,9 @@ handle_buildid_r_match (bool internal_req_p,
> >inc_metric ("http_responses_total","result","archive fdcache");
> >
> >MHD_add_response_header (r, "Content-Type", 
> > "application/octet-stream");
> > +  MHD_add_response_header (r, "X-FILE-SIZE", 
> > to_string(fs.st_size).c_str());
> > +  MHD_add_response_header (r, "X-ARCHIVE", b_source0.c_str());
> > +  MHD_add_response_header (r, "X-FILE", b_source1.c_str());
> >add_mhd_last_modified (r, fs.st_mtime);
> >if (verbose > 1)
> >  obatched(clog) << "serving fdcache archive " << b_source0 << " 
> > file " << b_source1 << endl;
> > @@ -1678,6 +1684,11 @@ handle_buildid_r_match (bool internal_req_p,
> >else
> >  {
> >MHD_add_response_header (r, "Content-Type", 
> > "application/octet-stream");
> > +  std::string file = 
> > b_source1.substr(b_source1.find_last_of("/")+1, b_source1.length());
> > +  MHD_add_response_header (r, "X-FILE-SIZE", 
> > to_string(fs.st_size).c_str());
> > +  MHD_add_response_header (r, "X-ARCHIVE", b_source0.c_str());
> > +  MHD_add_response_header (r, "X-FILE", file.c_str());
> > +
> >add_mhd_last_modified (r, archive_entry_mtime(e));
> >if (verbose > 1)
> >  obatched(clog) << "serving archive " << b_source0 << " file " 
> > << b_source1 << endl;
>
> This looks good. But I wonder if, since these are headers specific to
> debuginfod, they shouldn't be named X-DEBUGINFOD-... Just to make sure
> they don't clash with any others a proxy might insert.
Fixed

> > +2021-08-04  Noah Sanci  
> > +
> > + PR27277
> > + * debuginfod-find.1: Increasing verbosity describes the downloaded
> > + file.
> > + * debuginfod.8: Describe X-FILE, X-FILE-SIZE, and X-ARCHIVE.
> > +
>
> > diff --git a/doc/debuginfod.8 b/doc/debuginfod.8
> > index 5b0d793c..e9c58fbb 100644
> > --- a/doc/debuginfod.8
> > +++ b/doc/debuginfod.8
> > @@ -256,6 +256,13 @@ Unknown buildid / request combinations result in HTTP 
> > error codes.
> >  This file service resemblance is intentional, so that an installation
> >  can take advantage of standard HTTP management infrastructure.
> >
> > +Upon finding a file in an archive or simply in the database, some
> > +custom http headers are added to the response. For files in the
> > +database X-FILE and X-FILE-SIZE are added. X-FILE is simply the
> > +filename and X-FILE-SIZE is the size of the file. For files found
> > +in archives, in addition to X-FILE and X-FILE-SIZE, X-ARCHIVE is added.
> > +X-ARCHIVE is the name of the the file was found in.
>
> Probably should mention that the file name is unescaped (not url
> encoded).
Fixed
> The last sentence has too many "the". I think a word is
> missing between the last two.
Not sure if what I did is considered a fix

Thanks,
Noah Sanci
From 979f19eb4fd7a35ace4ddafed103922559b93120 Mon Sep 17 00:00:00 2001
From: Noah Sanci 
Date: Wed, 28