[Bug libelf/28101] New: elf_strptr slow with address sanitizer, passes entire section range to memrchr.
https://sourceware.org/bugzilla/show_bug.cgi?id=28101 Bug ID: 28101 Summary: elf_strptr slow with address sanitizer, passes entire section range to memrchr. Product: elfutils Version: unspecified Status: UNCONFIRMED Severity: enhancement Priority: P2 Component: libelf Assignee: unassigned at sourceware dot org Reporter: jan.smets at nokia dot com CC: elfutils-devel at sourceware dot org Target Milestone: --- This code block calls out to validate_str which calls to memrchr. The to= is the size of the entire section. I believe Address Sanitizer does validate the entire memory range. Passing the entire range to the memrchr null char check makes it very costly. 191 else if (likely (strscn->data_list_rear == NULL)) 192 { 193 // XXX The above is currently correct since elf_newdata will 194 // make sure to convert the rawdata into the datalist if 195 // necessary. But it would be more efficient to keep the rawdata 196 // unconverted and only then iterate over the rest of the (newly 197 // added data) list. Note that when the ELF file is mmapped 198 // rawdata_base can be set while rawdata.d hasn't been 199 // initialized yet (when data_read is zero). So we cannot just 200 // look at the rawdata.d.d_size. 201 202 /* Make sure the string is NUL terminated. Start from the end, 203 which very likely is a NUL char. */ 204 if (likely (validate_str (strscn->rawdata_base, offset, sh_size))) 205 result = &strscn->rawdata_base[offset]; 206 else 207 __libelf_seterrno (ELF_E_INVALID_INDEX); 208 } * libelf compiled with HAVE_DECL_MEMRCHR (default) * recent GCC toolchain (GCC6 and up) * files themselves don't even need to be compiled with asan, just enabling it at link time so the runtime wrapping/intercepting of libc et all fires. Testcase: test.c is attached. Generate a source file with dummy symbols to populate the symbol table. for i in {A..Z}{A..Z}{A..Z}{A..Z}; do echo "int sym$i;"; done > symbols.c gcc -c symbols.c -o symbols.o gcc -c test.c -o test.o -I /tmp/jan-test/elfutils/libelf/ #-fsanitize=address gcc test.o symbols.o -o test -l:libelf.a -L/tmp/jan-test/elfutils/libelf/ -lz -fsanitize=address echo -n "default asan:" time ./test test echo -n "asan disabled" gcc test.o symbols.o -o test -l:libelf.a -L/tmp/jan-test/elfutils/libelf/ -lz time ./test test I don't know of any ASAN_OPTIONS parameter that would change this behavior and make it less strict. On my machine the asan testcase takes 5+ sec, whereas the non-asan 0.04s. Can this code please be optimized to reduce the range passed to memrchr? Is this NUL check even required? -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libelf/28101] elf_strptr slow with address sanitizer, passes entire section range to memrchr.
https://sourceware.org/bugzilla/show_bug.cgi?id=28101 Mark Wielaard changed: What|Removed |Added CC||mark at klomp dot org --- Comment #1 from Mark Wielaard --- I think it really is a bug/performance issue in asan. But "optimizing" it in libelf by first checking the last char is zero, before calling memrchr wouldn't hurt (and should normally prevent a function call). Does the following help? diff --git a/libelf/elf_strptr.c b/libelf/elf_strptr.c index 76f2caf1..dc9b76c0 100644 --- a/libelf/elf_strptr.c +++ b/libelf/elf_strptr.c @@ -56,7 +56,9 @@ get_zdata (Elf_Scn *strscn) static bool validate_str (const char *str, size_t from, size_t to) { #if HAVE_DECL_MEMRCHR - return memrchr (&str[from], '\0', to - from) != NULL; + // Check end first, which is likely a zero terminator, to prevent function call + return (str[to - 1] == '\0' + || (to - from > 0 && memrchr (&str[from], '\0', to - from - 1) != NULL)); #else do { if (to <= from) -- You are receiving this mail because: You are on the CC list for the bug.
[Bug libelf/28101] elf_strptr slow with address sanitizer, passes entire section range to memrchr.
https://sourceware.org/bugzilla/show_bug.cgi?id=28101 --- Comment #2 from Jan Smets --- The patch optimizes perfectly and avoids the expensive call. Thanks! -- You are receiving this mail because: You are on the CC list for the bug.
Re: [Bug debuginfod/27983] ignore duplicate urls
Hello, On Wed, Jul 14, 2021 at 12:36 PM Mark Wielaard wrote: > You deduplicate the full URLs after they are fully constructed. Would > it make sense to do the deduplication on server_url, maybe even as > part of the Count number of URLs code? That might make the code > simpler. And you can change num_urls upfront. Deduplication before fully building the URL would work well, however I was concerned about the slashbuildid situation. I would need to alter all urls to either have a trailing '/' or no trailing '/' to ensure comparison between 'http://127.0.0.1:8000/' and 'http://127.0.0.1:8000' is considered equal. This is possible, but I ultimately decided to wait until full construction as those issues would have been handled. I would be glad to make the change if you want. > > > + num_urls = unduplicated_urls; > > + data = reallocarray( (void *) data, num_urls, sizeof(struct > > handle_data)); > > Maybe this reallocarray is unnecessary. Yes, it might save a little > bit of memory, but you do have to handle reallocarray failure. Good to know. I removed it. Thanks Noah From be4e07a8732dd688595b99f92ba275ef5060a34d Mon Sep 17 00:00:00 2001 From: Noah Sanci Date: Fri, 9 Jul 2021 14:53:10 -0400 Subject: [PATCH] debuginfod: PR27983 - ignore duplicate urls Gazing at server logs, one sees a minority of clients who appear to have duplicate query traffic coming in: the same URL, milliseconds apart. Chances are the user accidentally doubled her $DEBUGINFOD_URLS somehow, and the client library is dutifully asking the servers TWICE. Bug #27863 reduces the pain on the servers' CPU, but dupe network traffic is still being paid. We should reject sending outright duplicate concurrent traffic. https://sourceware.org/bugzilla/show_bug.cgi?id=27983 Signed-off-by: Noah Sanci --- debuginfod/ChangeLog | 7 + debuginfod/debuginfod-client.c | 55 +- tests/ChangeLog| 5 tests/run-debuginfod-find.sh | 13 4 files changed, 66 insertions(+), 14 deletions(-) diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog index d9d11737..24ccb8ef 100644 --- a/debuginfod/ChangeLog +++ b/debuginfod/ChangeLog @@ -1,3 +1,10 @@ +2021-07-09 Noah Sanci + + * debuginfod-client.c (debuginfod_query_server): As full-length + urls are generated with standardized formats, ignore duplicates. + Also update the number of urls to the unduplicated number of + urls. + 2021-06-18 Mark Wielaard * debuginfod-client.c (debuginfod_begin): Don't use client if diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index f417b40a..a9447cae 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -795,6 +795,7 @@ debuginfod_query_server (debuginfod_client *c, char *strtok_saveptr; char *server_url = strtok_r(server_urls, url_delim, &strtok_saveptr); + int unduplicated_urls = 0; /* Initialize each handle. */ for (int i = 0; i < num_urls && server_url != NULL; i++) @@ -802,34 +803,59 @@ debuginfod_query_server (debuginfod_client *c, if (vfd >= 0) dprintf (vfd, "init server %d %s\n", i, server_url); - data[i].fd = fd; - data[i].target_handle = &target_handle; - data[i].handle = curl_easy_init(); - if (data[i].handle == NULL) -{ - rc = -ENETUNREACH; - goto out1; -} - data[i].client = c; - - /* Build handle url. Tolerate both http://foo:999 and - http://foo:999/ forms */ char *slashbuildid; if (strlen(server_url) > 1 && server_url[strlen(server_url)-1] == '/') slashbuildid = "buildid"; else slashbuildid = "/buildid"; + char *tmp_url = calloc(PATH_MAX, sizeof(char)); if (filename) /* must start with / */ -snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s%s", server_url, +snprintf(tmp_url, PATH_MAX, "%s%s/%s/%s%s", server_url, slashbuildid, build_id_bytes, type, filename); else -snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s", server_url, +snprintf(tmp_url, PATH_MAX, "%s%s/%s/%s", server_url, slashbuildid, build_id_bytes, type); + /* PR 27983: If the url is already set to be used use, skip it */ + int url_index = -1; + for (url_index = 0; url_index < i; ++url_index) +{ + if(strcmp(tmp_url, data[url_index].url) == 0) +{ + url_index = -1; + break; +} +} + if (url_index == -1) +{ + if (vfd >= 0) +dprintf(vfd, "duplicate url: %s, skipping\n", tmp_url); + free(tmp_url); + // Ensure that next iteration doesn't skip over an index mid-array + i--; + server_url = strtok_r(NULL, url_delim, &strtok_saveptr); + continue; +} + else +{ + unduplicated_urls++; + strncpy(data[i].url, tmp_url, PATH_MAX); + fr
[Bug libelf/28101] elf_strptr slow with address sanitizer, passes entire section range to memrchr.
https://sourceware.org/bugzilla/show_bug.cgi?id=28101 Mark Wielaard changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Mark Wielaard --- (In reply to Jan Smets from comment #2) > The patch optimizes perfectly and avoids the expensive call. Thanks for testing, I pushed a variant of the fix as: commit 0aed4315b2f6c54f4efcf8a8d22e59a36e6eb30d Author: Mark Wielaard Date: Mon Jul 19 15:52:51 2021 +0200 libelf: Optimize elf_strptr.c validate_str by checking last char first In most cases the last char of the sectio will be zero. Check that first before calling memrchr. This is a minor optimization in normal cases. But it helps asan a lot by removing the memrchr call in most cases. https://sourceware.org/bugzilla/show_bug.cgi?id=28101 Signed-off-by: Mark Wielaard -- You are receiving this mail because: You are on the CC list for the bug.
[Bug debuginfod/28034] %-escape url characters
Hello, Please find the patch for bug 28034 attached. Noah From 1e2340a9732bdc8f9a7a207a870e6815c770c23c Mon Sep 17 00:00:00 2001 From: Noah Sanci Date: Fri, 16 Jul 2021 15:16:20 -0400 Subject: [PATCH] debuginfod: PR28034 - %-escape url characters When requesting some source files, some URL-inconvenient chars sometimes pop up. Example from f33 libstdc++: /buildid/44d8485cb75512c2ca5c8f70afbd475cae30af4f/source/usr/src/debug/ gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/ libstdc++-v3/src/c++11/../../../../../libstdc++-v3/src/c++11/ condition_variable.cc As this URL is passed into debuginfod's handler_cb, it appears that the + signs are helpfully unescaped to spaces by libmicrohttpd, which 'course breaks everything. We need to suppress this HTTP URL processing step. Also worth checking that %HEX decoding is also suppressed. https://sourceware.org/bugzilla/show_bug.cgi?id=28034 Signed-off-by: Noah Sanci --- debuginfod/ChangeLog | 6 ++ debuginfod/debuginfod-client.c | 12 --- doc/debuginfod.8 | 1 + tests/ChangeLog| 6 ++ tests/run-debuginfod-find.sh | 37 ++ 5 files changed, 42 insertions(+), 20 deletions(-) diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog index e71436ca..aad35a4e 100644 --- a/debuginfod/ChangeLog +++ b/debuginfod/ChangeLog @@ -1,3 +1,9 @@ +2021-07-16 Noah Sanci + + PR28034 + * debuginfod-client.c (debuginfod_query_server): % escape filename + so the completed url is processed properly. + 2021-07-06 Alice Zhang PR27531 diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c index f12f609c..e30f73eb 100644 --- a/debuginfod/debuginfod-client.c +++ b/debuginfod/debuginfod-client.c @@ -831,9 +831,15 @@ debuginfod_query_server (debuginfod_client *c, else slashbuildid = "/buildid"; - if (filename) /* must start with / */ -snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s%s", server_url, - slashbuildid, build_id_bytes, type, filename); + if (filename)/* must start with / */ +{ + /* PR28034 escape characters in completed url to %hh format. */ + char *escaped_string; + escaped_string = curl_easy_escape(data[i].handle, filename, 0); + snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s%s", server_url, + slashbuildid, build_id_bytes, type, escaped_string); + curl_free(escaped_string); +} else snprintf(data[i].url, PATH_MAX, "%s%s/%s/%s", server_url, slashbuildid, build_id_bytes, type); diff --git a/doc/debuginfod.8 b/doc/debuginfod.8 index 1adf703a..90cdcf94 100644 --- a/doc/debuginfod.8 +++ b/doc/debuginfod.8 @@ -299,6 +299,7 @@ l l. \../bar/foo.c AT_comp_dir=/zoo/ /buildid/BUILDID/source/zoo//../bar/foo.c .TE +Note: the client should %-escape characters in /SOURCE/FILE that are not shown as "unreserved" in section 2.3 of RFC3986. .SS /metrics This endpoint returns a Prometheus formatted text/plain dump of a diff --git a/tests/ChangeLog b/tests/ChangeLog index 1460b422..cfa0dee4 100644 --- a/tests/ChangeLog +++ b/tests/ChangeLog @@ -1,3 +1,9 @@ +2021-07-16 Noah Sanci + + PR28034 + * run-debuginfod-find.sh: Added a test case ensuring files with % + escapable characters in their paths are accessible. + 2021-07-06 Alice Zhang PR27531 diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh index 73bbe65b..11c1a1a1 100755 --- a/tests/run-debuginfod-find.sh +++ b/tests/run-debuginfod-find.sh @@ -44,6 +44,7 @@ cleanup() if [ $PID2 -ne 0 ]; then kill $PID2; wait $PID2; fi if [ $PID3 -ne 0 ]; then kill $PID3; wait $PID3; fi if [ $PID4 -ne 0 ]; then kill $PID4; wait $PID4; fi + sleep 5; # Give time to ensure debuginfod cleans and closes sqlite databases. rm -rf F R D L Z ${PWD}/foobar ${PWD}/mocktree ${PWD}/.client_cache* ${PWD}/tmp* exit_cleanup } @@ -54,7 +55,7 @@ trap cleanup 0 1 2 3 5 9 15 errfiles_list= err() { echo ERROR REPORTS -for ports in $PORT1 $PORT2 +for ports in $PORT1 $PORT2 $PORT3 do echo ERROR REPORT $port metrics curl -s http://127.0.0.1:$port/metrics @@ -149,18 +150,18 @@ ps -q $PID1 -e -L -o '%p %c %a' | grep traverse # Compile a simple program, strip its debuginfo and save the build-id. # Also move the debuginfo into another directory so that elfutils # cannot find it without debuginfod. -echo "int main() { return 0; }" > ${PWD}/prog.c -tempfiles prog.c +echo "int main() { return 0; }" > ${PWD}/p+r%o\$g.c +tempfiles p+r%o\$g.c # Create a subdirectory to confound source path names mkdir foobar -gcc -Wl,--build-id -g -o prog ${PWD}/foobar///./../prog.c -testrun ${abs_top_builddir}/src/strip -g -f prog.debug ${PWD}/prog +gcc -Wl,--build-id -g -o p+r%o\$g ${PWD}/foobar///./../p+r%o\$g.c +testrun ${abs_top_builddir}/src/strip -g -f p+r%o\$g.debug ${PWD}/p+r%o\$g BUILDID=`en
[Bug debuginfod/27982] debuginfod client maximum-transfer-size and -time parameters
https://sourceware.org/bugzilla/show_bug.cgi?id=27982 Noah Sanci changed: What|Removed |Added CC||nsanci at redhat dot com Assignee|unassigned at sourceware dot org |nsanci at redhat dot com -- You are receiving this mail because: You are on the CC list for the bug.
Buildbot failure in Wildebeest Builder on whole buildset
The Buildbot has detected a new failure on builder elfutils-centos-x86_64 while building elfutils. Full details are available at: https://builder.wildebeest.org/buildbot/#builders/1/builds/790 Buildbot URL: https://builder.wildebeest.org/buildbot/ Worker for this Build: centos-x86_64 Build Reason: Blamelist: Mark Wielaard BUILD FAILED: failed test (failure) Sincerely, -The Buildbot