[Bug libelf/28101] New: elf_strptr slow with address sanitizer, passes entire section range to memrchr.

2021-07-19 Thread jan.smets at nokia dot com via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=28101

Bug ID: 28101
   Summary: elf_strptr slow with address sanitizer, passes entire
section range to memrchr.
   Product: elfutils
   Version: unspecified
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P2
 Component: libelf
  Assignee: unassigned at sourceware dot org
  Reporter: jan.smets at nokia dot com
CC: elfutils-devel at sourceware dot org
  Target Milestone: ---

This code block calls out to validate_str which calls to memrchr.
The to= is the size of the entire section.

I believe Address Sanitizer does validate the entire memory range. 
Passing the entire range to the memrchr null char check makes it very costly.


191   else if (likely (strscn->data_list_rear == NULL))
192 {
193   // XXX The above is currently correct since elf_newdata will
194   // make sure to convert the rawdata into the datalist if
195   // necessary. But it would be more efficient to keep the rawdata
196   // unconverted and only then iterate over the rest of the (newly
197   // added data) list.  Note that when the ELF file is mmapped
198   // rawdata_base can be set while rawdata.d hasn't been
199   // initialized yet (when data_read is zero). So we cannot just
200   // look at the rawdata.d.d_size.
201
202   /* Make sure the string is NUL terminated.  Start from the end,
203  which very likely is a NUL char.  */
204   if (likely (validate_str (strscn->rawdata_base, offset, sh_size)))
205 result = &strscn->rawdata_base[offset];
206   else
207 __libelf_seterrno (ELF_E_INVALID_INDEX);
208 }


* libelf compiled with HAVE_DECL_MEMRCHR (default)
* recent GCC toolchain (GCC6 and up)
* files themselves don't even need to be compiled with asan, just enabling it
at link time so the runtime wrapping/intercepting of libc et all fires.

Testcase:

test.c is attached.
Generate a source file with dummy symbols to populate the symbol table.


for i in {A..Z}{A..Z}{A..Z}{A..Z}; do echo "int sym$i;"; done > symbols.c
gcc -c symbols.c -o symbols.o
gcc -c test.c -o test.o -I /tmp/jan-test/elfutils/libelf/ #-fsanitize=address

gcc test.o symbols.o -o test  -l:libelf.a -L/tmp/jan-test/elfutils/libelf/  -lz
-fsanitize=address
echo -n "default asan:"
time ./test test

echo -n "asan disabled"
gcc test.o symbols.o -o test  -l:libelf.a -L/tmp/jan-test/elfutils/libelf/  -lz
time ./test test

I don't know of any ASAN_OPTIONS parameter that would change this behavior and
make it less strict.

On my machine the asan testcase takes 5+ sec, whereas the non-asan 0.04s.

Can this code please be optimized to reduce the range passed to memrchr?
Is this NUL check even required?

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[Bug libelf/28101] elf_strptr slow with address sanitizer, passes entire section range to memrchr.

2021-07-19 Thread mark at klomp dot org via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=28101

Mark Wielaard  changed:

   What|Removed |Added

 CC||mark at klomp dot org

--- Comment #1 from Mark Wielaard  ---
I think it really is a bug/performance issue in asan. But "optimizing" it in
libelf by first checking the last char is zero, before calling memrchr wouldn't
hurt (and should normally prevent a function call). Does the following help?

diff --git a/libelf/elf_strptr.c b/libelf/elf_strptr.c
index 76f2caf1..dc9b76c0 100644
--- a/libelf/elf_strptr.c
+++ b/libelf/elf_strptr.c
@@ -56,7 +56,9 @@ get_zdata (Elf_Scn *strscn)
 static bool validate_str (const char *str, size_t from, size_t to)
 {
 #if HAVE_DECL_MEMRCHR
-  return memrchr (&str[from], '\0', to - from) != NULL;
+  // Check end first, which is likely a zero terminator, to prevent function
call
+  return (str[to - 1]  == '\0'
+ || (to - from > 0 && memrchr (&str[from], '\0', to - from - 1) !=
NULL));
 #else
   do {
 if (to <= from)

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[Bug libelf/28101] elf_strptr slow with address sanitizer, passes entire section range to memrchr.

2021-07-19 Thread jan.smets at nokia dot com via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=28101

--- Comment #2 from Jan Smets  ---
The patch optimizes perfectly and avoids the expensive call.
Thanks!

-- 
You are receiving this mail because:
You are on the CC list for the bug.

Re: [Bug debuginfod/27983] ignore duplicate urls

2021-07-19 Thread Noah Sanci via Elfutils-devel
Hello,


On Wed, Jul 14, 2021 at 12:36 PM Mark Wielaard  wrote:

> You deduplicate the full URLs after they are fully constructed.  Would
> it make sense to do the deduplication on server_url, maybe even as
> part of the Count number of URLs code? That might make the code
> simpler. And you can change num_urls upfront.
Deduplication before fully building the URL would work well, however I
was concerned about the slashbuildid situation. I would need to alter
all urls to either have a trailing '/' or no trailing '/' to ensure
comparison between 'http://127.0.0.1:8000/' and 'http://127.0.0.1:8000'
is considered equal. This is possible, but I ultimately decided to
wait until full construction as those issues would have been handled.
I would be glad to make the change if you want.
>
> > +  num_urls = unduplicated_urls;
> > +  data = reallocarray( (void *) data, num_urls, sizeof(struct 
> > handle_data));
>
> Maybe this reallocarray is unnecessary. Yes, it might save a little
> bit of memory, but you do have to handle reallocarray failure.
Good to know. I removed it.

Thanks

Noah
From be4e07a8732dd688595b99f92ba275ef5060a34d Mon Sep 17 00:00:00 2001
From: Noah Sanci 
Date: Fri, 9 Jul 2021 14:53:10 -0400
Subject: [PATCH] debuginfod: PR27983 - ignore duplicate urls

Gazing at server logs, one sees a minority of clients who appear to have
duplicate query traffic coming in: the same URL, milliseconds apart.
Chances are the user accidentally doubled her $DEBUGINFOD_URLS somehow,
and the client library is dutifully asking the servers TWICE. Bug #27863
reduces the pain on the servers' CPU, but dupe network traffic is still
being paid.  We should reject sending outright duplicate concurrent
traffic.

https://sourceware.org/bugzilla/show_bug.cgi?id=27983

Signed-off-by: Noah Sanci 
---
 debuginfod/ChangeLog   |  7 +
 debuginfod/debuginfod-client.c | 55 +-
 tests/ChangeLog|  5 
 tests/run-debuginfod-find.sh   | 13 
 4 files changed, 66 insertions(+), 14 deletions(-)

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index d9d11737..24ccb8ef 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,10 @@
+2021-07-09  Noah Sanci  
+
+	* debuginfod-client.c (debuginfod_query_server): As full-length
+	urls are generated with standardized formats, ignore duplicates.
+	Also update the number of urls to the unduplicated number of
+	urls.
+
 2021-06-18  Mark Wielaard  
 
 	* debuginfod-client.c (debuginfod_begin): Don't use client if
diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index f417b40a..a9447cae 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -795,6 +795,7 @@ debuginfod_query_server (debuginfod_client *c,
 
   char *strtok_saveptr;
   char *server_url = strtok_r(server_urls, url_delim, &strtok_saveptr);
+  int unduplicated_urls = 0;
 
   /* Initialize each handle.  */
   for (int i = 0; i < num_urls && server_url != NULL; i++)
@@ -802,34 +803,59 @@ debuginfod_query_server (debuginfod_client *c,
   if (vfd >= 0)
 	dprintf (vfd, "init server %d %s\n", i, server_url);
 
-  data[i].fd = fd;
-  data[i].target_handle = &target_handle;
-  data[i].handle = curl_easy_init();
-  if (data[i].handle == NULL)
-{
-  rc = -ENETUNREACH;
-  goto out1;
-}
-  data[i].client = c;
-
-  /* Build handle url. Tolerate both  http://foo:999  and
- http://foo:999/  forms */
   char *slashbuildid;
   if (strlen(server_url) > 1 && server_url[strlen(server_url)-1] == '/')
 slashbuildid = "buildid";
   else
 slashbuildid = "/buildid";
 
+  char *tmp_url = calloc(PATH_MAX, sizeof(char));
   if (filename) /* must start with / */
-snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s%s", server_url,
+snprintf(tmp_url, PATH_MAX, "%s%s/%s/%s%s", server_url,
  slashbuildid, build_id_bytes, type, filename);
   else
-snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s", server_url,
+snprintf(tmp_url, PATH_MAX, "%s%s/%s/%s", server_url,
  slashbuildid, build_id_bytes, type);
 
+  /* PR 27983: If the url is already set to be used use, skip it */
+  int url_index = -1;
+  for (url_index = 0; url_index < i; ++url_index)
+{
+  if(strcmp(tmp_url, data[url_index].url) == 0)
+{
+  url_index = -1;
+  break;
+}
+}
+  if (url_index == -1)
+{
+  if (vfd >= 0)
+dprintf(vfd, "duplicate url: %s, skipping\n", tmp_url);
+  free(tmp_url);
+  // Ensure that next iteration doesn't skip over an index mid-array
+  i--;
+  server_url = strtok_r(NULL, url_delim, &strtok_saveptr);
+  continue;
+}
+  else
+{
+  unduplicated_urls++;
+  strncpy(data[i].url, tmp_url, PATH_MAX);
+  fr

[Bug libelf/28101] elf_strptr slow with address sanitizer, passes entire section range to memrchr.

2021-07-19 Thread mark at klomp dot org via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=28101

Mark Wielaard  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Mark Wielaard  ---
(In reply to Jan Smets from comment #2)
> The patch optimizes perfectly and avoids the expensive call.

Thanks for testing, I pushed a variant of the fix as:

commit 0aed4315b2f6c54f4efcf8a8d22e59a36e6eb30d
Author: Mark Wielaard 
Date:   Mon Jul 19 15:52:51 2021 +0200

libelf: Optimize elf_strptr.c validate_str by checking last char first

In most cases the last char of the sectio will be zero. Check that
first before calling memrchr. This is a minor optimization in normal
cases. But it helps asan a lot by removing the memrchr call in most
cases.

https://sourceware.org/bugzilla/show_bug.cgi?id=28101

Signed-off-by: Mark Wielaard 

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[Bug debuginfod/28034] %-escape url characters

2021-07-19 Thread Noah Sanci via Elfutils-devel
Hello,

Please find the patch for bug 28034 attached.

Noah
From 1e2340a9732bdc8f9a7a207a870e6815c770c23c Mon Sep 17 00:00:00 2001
From: Noah Sanci 
Date: Fri, 16 Jul 2021 15:16:20 -0400
Subject: [PATCH] debuginfod: PR28034 - %-escape url characters

When requesting some source files, some URL-inconvenient chars
sometimes pop up.  Example from f33 libstdc++:
/buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/
gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/
libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/
condition_variable.cc
As this URL is passed into debuginfod's handler_cb, it appears that the
+ signs are helpfully unescaped to spaces by libmicrohttpd, which
'course breaks everything.  We need to suppress this HTTP URL
processing step.  Also worth checking that %HEX decoding is also
suppressed.

https://sourceware.org/bugzilla/show_bug.cgi?id=28034

Signed-off-by: Noah Sanci 
---
 debuginfod/ChangeLog   |  6 ++
 debuginfod/debuginfod-client.c | 12 ---
 doc/debuginfod.8   |  1 +
 tests/ChangeLog|  6 ++
 tests/run-debuginfod-find.sh   | 37 ++
 5 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index e71436ca..aad35a4e 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,3 +1,9 @@
+2021-07-16  Noah Sanci  
+
+	PR28034
+	* debuginfod-client.c (debuginfod_query_server): % escape filename
+	so the completed url is processed properly.
+
 2021-07-06  Alice Zhang  
 
 PR27531
diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index f12f609c..e30f73eb 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -831,9 +831,15 @@ debuginfod_query_server (debuginfod_client *c,
   else
 slashbuildid = "/buildid";
 
-  if (filename) /* must start with / */
-snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s%s", server_url,
- slashbuildid, build_id_bytes, type, filename);
+  if (filename)/* must start with / */
+{
+  /* PR28034 escape characters in completed url to %hh format. */
+  char *escaped_string;
+  escaped_string = curl_easy_escape(data[i].handle, filename, 0);
+  snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s%s", server_url,
+   slashbuildid, build_id_bytes, type, escaped_string);
+  curl_free(escaped_string);
+}
   else
 snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s", server_url,
  slashbuildid, build_id_bytes, type);
diff --git a/doc/debuginfod.8 b/doc/debuginfod.8
index 1adf703a..90cdcf94 100644
--- a/doc/debuginfod.8
+++ b/doc/debuginfod.8
@@ -299,6 +299,7 @@ l l.
 \../bar/foo.c AT_comp_dir=/zoo/	/buildid/BUILDID/source/zoo//../bar/foo.c
 .TE
 
+Note: the client should %-escape characters in /SOURCE/FILE that are not shown as "unreserved" in section 2.3 of RFC3986.
 .SS /metrics
 
 This endpoint returns a Prometheus formatted text/plain dump of a
diff --git a/tests/ChangeLog b/tests/ChangeLog
index 1460b422..cfa0dee4 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,3 +1,9 @@
+2021-07-16  Noah Sanci  
+
+	PR28034
+	* run-debuginfod-find.sh: Added a test case ensuring files with %
+	escapable characters in their paths are accessible.
+
 2021-07-06  Alice Zhang  
 
 PR27531
diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
index 73bbe65b..11c1a1a1 100755
--- a/tests/run-debuginfod-find.sh
+++ b/tests/run-debuginfod-find.sh
@@ -44,6 +44,7 @@ cleanup()
   if [ $PID2 -ne 0 ]; then kill $PID2; wait $PID2; fi
   if [ $PID3 -ne 0 ]; then kill $PID3; wait $PID3; fi
   if [ $PID4 -ne 0 ]; then kill $PID4; wait $PID4; fi
+  sleep 5; # Give time to ensure debuginfod cleans and closes sqlite databases.
   rm -rf F R D L Z ${PWD}/foobar ${PWD}/mocktree ${PWD}/.client_cache* ${PWD}/tmp*
   exit_cleanup
 }
@@ -54,7 +55,7 @@ trap cleanup 0 1 2 3 5 9 15
 errfiles_list=
 err() {
 echo ERROR REPORTS
-for ports in $PORT1 $PORT2
+for ports in $PORT1 $PORT2 $PORT3
 do
 echo ERROR REPORT $port metrics
 curl -s http://127.0.0.1:$port/metrics
@@ -149,18 +150,18 @@ ps -q $PID1 -e -L -o '%p %c %a' | grep traverse
 # Compile a simple program, strip its debuginfo and save the build-id.
 # Also move the debuginfo into another directory so that elfutils
 # cannot find it without debuginfod.
-echo "int main() { return 0; }" > ${PWD}/prog.c
-tempfiles prog.c
+echo "int main() { return 0; }" > ${PWD}/p+r%o\$g.c
+tempfiles p+r%o\$g.c
 # Create a subdirectory to confound source path names
 mkdir foobar
-gcc -Wl,--build-id -g -o prog ${PWD}/foobar///./../prog.c
-testrun ${abs_top_builddir}/src/strip -g -f prog.debug ${PWD}/prog
+gcc -Wl,--build-id -g -o p+r%o\$g ${PWD}/foobar///./../p+r%o\$g.c
+testrun ${abs_top_builddir}/src/strip -g -f p+r%o\$g.debug ${PWD}/p+r%o\$g
 BUILDID=`en

[Bug debuginfod/27982] debuginfod client maximum-transfer-size and -time parameters

2021-07-19 Thread nsanci at redhat dot com via Elfutils-devel
https://sourceware.org/bugzilla/show_bug.cgi?id=27982

Noah Sanci  changed:

   What|Removed |Added

 CC||nsanci at redhat dot com
   Assignee|unassigned at sourceware dot org   |nsanci at redhat dot com

-- 
You are receiving this mail because:
You are on the CC list for the bug.

Buildbot failure in Wildebeest Builder on whole buildset

2021-07-19 Thread buildbot
The Buildbot has detected a new failure on builder elfutils-centos-x86_64 while 
building elfutils.
Full details are available at:
https://builder.wildebeest.org/buildbot/#builders/1/builds/790

Buildbot URL: https://builder.wildebeest.org/buildbot/

Worker for this Build: centos-x86_64

Build Reason: 
Blamelist: Mark Wielaard 

BUILD FAILED: failed test (failure)

Sincerely,
 -The Buildbot