Hi Romain, On Tue, 2023-06-20 at 22:05 +0000, Romain GEISSLER wrote: > > Our real use case happens on a Openshift 4.13 node, so the OS is Red Hat Core > OS 9 (which I assume shares a lot of foundations with RHEL 9). > > On our side Francois also told me this afternoon that he didn’t really > reproduce the same thing with my reproducer posted here and the real > systemd-coredump issue he witnessed live, and also noticed that with > DEBUGINFOD_URLS unset/set to the empty string my reproducer has no problem > anymore. What he witnessed on the real case (using perf/gdb) was that > apparently lots of time was spend in elf_getdata_rawchunk and often in this > kind of stack: > > Samples: 65K of event 'cpu-clock:pppH', Event count (approx.): 16468500000 > > > Overhead Command Shared Object Symbol > > > 98.24% (sd-parse-elf) libelf-0.188.so [.] elf_getdata_rawchunk > 0.48% (sd-parse-elf) libelf-0.188.so [.] 0x00000000000048a3 > 0.27% (sd-parse-elf) libelf-0.188.so [.] gelf_getphdr > 0.11% (sd-parse-elf) libc.so.6 [.] _int_malloc > 0.10% (sd-parse-elf) libelf-0.188.so [.] gelf_getnote > 0.06% (sd-parse-elf) libc.so.6 [.] __libc_calloc > 0.05% (sd-parse-elf) [kernel.kallsyms] [k] > __softirqentry_text_start > 0.05% (sd-parse-elf) libc.so.6 [.] _int_free > > > (gdb) bt > #0 0x00007f0ba8a88194 in elf_getdata_rawchunk () from > target:/lib64/libelf.so.1 > #1 0x00007f0ba98e5013 in module_callback.lto_priv () from > target:/usr/lib64/systemd/libsystemd-shared-252.so > #2 0x00007f0ba8ae7291 in dwfl_getmodules () from target:/lib64/libdw.so.1 > #3 0x00007f0ba98e6dc0 in parse_elf_object () from > target:/usr/lib64/systemd/libsystemd-shared-252.so > #4 0x0000562c474f2d5e in submit_coredump () > #5 0x0000562c474f57d1 in process_socket.constprop () > #6 0x0000562c474efbf8 in main () > > My reproducer actually doesn’t fully re-implement what systemd implements > (the parsing of the package metadata is clearly omitted), so I thought I had > reproduced the same problem while apparently I didn’t, sorry for that. We > will also have to double check if really just using 2000 dummy libraries is > enough or if this also needs to have a more complex binary like we have in > our real case. > > Tomorrow on our side we will have to play a bit with a local build of > systemd-coredump and try to run it manually to better understand what’s going > wrong. >
Seeing those performance results I understand why you were suspecting the linked list data structure used in elf_getdata_rawchunk. Would you be able to test a rebuild libelf with the attached patch, which replaces that datastructure with a binary search tree? It didn't really show much speedup locally (in the noise, maybe 0.01 sec faster on ~0.25 sec run). But if there are more than 2000 calls to elf_getdata_rawchunk it should make things faster. Thanks, Mark
From 3aca5b5f1f1617db2220022d9061dcaf129e54c4 Mon Sep 17 00:00:00 2001 From: Mark Wielaard <m...@klomp.org> Date: Wed, 21 Jun 2023 18:05:12 +0200 Subject: [PATCH] libelf: Replace list of elf_getdata_rawchunk results with a tree elf_getdata_rawchunks did a linear search to see if a chunk was already fetched. Replace this list with a binary search tree to make lookup faster when a lot of Elf_Data_Chunk were created. * libelf/libelfP.h (Elf_Data_Chunk): Remove next field. (struct Elf): Change the rawchunks type from Elf_Data_Chunk * to void *. * elf_getdata_rawchunk.c (chunk_compare): New static function. (elf_getdata_rawchunk): Use tsearch instead of a manual linked list. * elf_end.c (free_chunk): New static function. (elf_end): Call tdestroy instead of walking linked list. Signed-off-by: Mark Wielaard <m...@klomp.org> --- libelf/elf_end.c | 22 +++++++++------- libelf/elf_getdata_rawchunk.c | 47 +++++++++++++++++++++++++---------- libelf/libelfP.h | 13 ++++------ 3 files changed, 52 insertions(+), 30 deletions(-) diff --git a/libelf/elf_end.c b/libelf/elf_end.c index 5c451f36..3e5d4c86 100644 --- a/libelf/elf_end.c +++ b/libelf/elf_end.c @@ -1,5 +1,6 @@ /* Free resources associated with Elf descriptor. Copyright (C) 1998,1999,2000,2001,2002,2004,2005,2007,2015,2016 Red Hat, Inc. + Copyright (C) 2023 Mark J. Wielaard <m...@klomp.org> This file is part of elfutils. Written by Ulrich Drepper <drep...@redhat.com>, 1998. @@ -32,12 +33,22 @@ #endif #include <assert.h> +#include <search.h> #include <stddef.h> #include <stdlib.h> #include "libelfP.h" +static void +free_chunk (void *n) +{ + Elf_Data_Chunk *rawchunk = (Elf_Data_Chunk *)n; + if (rawchunk->dummy_scn.flags & ELF_F_MALLOCED) + free (rawchunk->data.d.d_buf); + free (rawchunk); +} + int elf_end (Elf *elf) { @@ -112,20 +123,13 @@ elf_end (Elf *elf) case ELF_K_ELF: { - Elf_Data_Chunk *rawchunks + void *rawchunks = (elf->class == ELFCLASS32 || (offsetof (struct Elf, state.elf32.rawchunks) == offsetof (struct Elf, state.elf64.rawchunks)) ? elf->state.elf32.rawchunks : elf->state.elf64.rawchunks); - while (rawchunks != NULL) - { - Elf_Data_Chunk *next = rawchunks->next; - if (rawchunks->dummy_scn.flags & ELF_F_MALLOCED) - free (rawchunks->data.d.d_buf); - free (rawchunks); - rawchunks = next; - } + tdestroy (rawchunks, free_chunk); Elf_ScnList *list = (elf->class == ELFCLASS32 || (offsetof (struct Elf, state.elf32.scns) diff --git a/libelf/elf_getdata_rawchunk.c b/libelf/elf_getdata_rawchunk.c index 5a35ccdc..cfd40396 100644 --- a/libelf/elf_getdata_rawchunk.c +++ b/libelf/elf_getdata_rawchunk.c @@ -1,6 +1,6 @@ /* Return converted data from raw chunk of ELF file. Copyright (C) 2007, 2014, 2015 Red Hat, Inc. - Copyright (C) 2022 Mark J. Wielaard <m...@klomp.org> + Copyright (C) 2022, 2023 Mark J. Wielaard <m...@klomp.org> This file is part of elfutils. This file is free software; you can redistribute it and/or modify @@ -33,12 +33,28 @@ #include <assert.h> #include <errno.h> +#include <search.h> #include <stdlib.h> #include <string.h> #include "libelfP.h" #include "common.h" +static int +chunk_compare (const void *a, const void *b) +{ + Elf_Data_Chunk *da = (Elf_Data_Chunk *)a; + Elf_Data_Chunk *db = (Elf_Data_Chunk *)b; + + if (da->offset != db->offset) + return da->offset - db->offset; + + if (da->data.d.d_size != db->data.d.d_size) + return da->data.d.d_size - db->data.d.d_size; + + return da->data.d.d_type - db->data.d.d_type; +} + Elf_Data * elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type) { @@ -75,19 +91,25 @@ elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type) rwlock_rdlock (elf->lock); /* Maybe we already got this chunk? */ - Elf_Data_Chunk *rawchunks = elf->state.elf.rawchunks; - while (rawchunks != NULL) + Elf_Data_Chunk key; + key.offset = offset; + key.data.d.d_size = size; + key.data.d.d_type = type; + Elf_Data_Chunk **found = tsearch (&key, &elf->state.elf.rawchunks, + &chunk_compare); + if (found == NULL) + goto nomem; + + /* Existing entry. */ + if (*found != &key && *found != NULL) { - if ((rawchunks->offset == offset || size == 0) - && rawchunks->data.d.d_size == size - && rawchunks->data.d.d_type == type) - { - result = &rawchunks->data.d; - goto out; - } - rawchunks = rawchunks->next; + result = &(*found)->data.d; + goto out; } + /* New entry. */ + *found = NULL; + size_t align = __libelf_type_align (elf->class, type); if (elf->map_address != NULL) { @@ -189,8 +211,7 @@ elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type) rwlock_unlock (elf->lock); rwlock_wrlock (elf->lock); - chunk->next = elf->state.elf.rawchunks; - elf->state.elf.rawchunks = chunk; + *found = chunk; result = &chunk->data.d; out: diff --git a/libelf/libelfP.h b/libelf/libelfP.h index 6624f38a..d3c241e5 100644 --- a/libelf/libelfP.h +++ b/libelf/libelfP.h @@ -1,5 +1,6 @@ /* Internal interfaces for libelf. Copyright (C) 1998-2010, 2015, 2016 Red Hat, Inc. + Copyright (C) 2023 Mark J. Wielaard <m...@klomp.org> This file is part of elfutils. Contributed by Ulrich Drepper <drep...@redhat.com>, 1998. @@ -262,11 +263,7 @@ typedef struct Elf_ScnList typedef struct Elf_Data_Chunk { Elf_Data_Scn data; - union - { - Elf_Scn dummy_scn; - struct Elf_Data_Chunk *next; - }; + Elf_Scn dummy_scn; int64_t offset; /* The original raw offset in the Elf image. */ } Elf_Data_Chunk; @@ -324,7 +321,7 @@ struct Elf Elf_ScnList *scns_last; /* Last element in the section list. If NULL the data has not yet been read from the file. */ - Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results. */ + void *rawchunks; /* Tree of elf_getdata_rawchunk results. */ unsigned int scnincr; /* Number of sections allocate the last time. */ int ehdr_flags; /* Flags (dirty) for ELF header. */ @@ -343,7 +340,7 @@ struct Elf Elf_ScnList *scns_last; /* Last element in the section list. If NULL the data has not yet been read from the file. */ - Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results. */ + void *rawchunks; /* Tree of elf_getdata_rawchunk results. */ unsigned int scnincr; /* Number of sections allocate the last time. */ int ehdr_flags; /* Flags (dirty) for ELF header. */ @@ -368,7 +365,7 @@ struct Elf Elf_ScnList *scns_last; /* Last element in the section list. If NULL the data has not yet been read from the file. */ - Elf_Data_Chunk *rawchunks; /* List of elf_getdata_rawchunk results. */ + void *rawchunks; /* Tree of elf_getdata_rawchunk results. */ unsigned int scnincr; /* Number of sections allocate the last time. */ int ehdr_flags; /* Flags (dirty) for ELF header. */ -- 2.40.1