Re: [PATCH v2 06/12] libdwfl [6/12]: Elf* caching via dwfl_process_tracker

2025-04-15 Thread Serhei Makarov



On Tue, Apr 15, 2025, at 2:15 PM, Christian Hergert wrote:
> On 4/15/25 09:12, Serhei Makarov wrote:
>> Option 2: expose one function, treat file_name and fd as optional input 
>> parameters.
>> 
>> // normal use case
>> module_name = "/usr/lib/whatever.so"; /* name of module */
>> file_name = "/proc/PID/root/usr/lib/whatever.so"; /* actual location of 
>> module */
>> dwfl_process_tracker_find_cached_elf (tracker, module_name, &file_name, 
>> &elf, &fd);
>> /* Since file_name is specified, the dwfl_process_tracker
>> will access the module at file_name, and cache by
>> module_name + dev + ino. */
>
> What allocator owns `file_name`? Can they be const in both directions?
This probably shows my suggested API is unclear
since the input value of file_name is separate from the output value.
(The input value is a const string, owned by the caller, not retained by the 
callee.
The output value is a newly allocated string.
Could be made to work, but will be too confusing to document.)

Would be better to add a separate arg. Thus, Option 3:

dwfl_process_tracker_find_cached_elf (tracker,
const char *module_name, /* -> /usr/lib/whatever.so */
const char *module_location, /* -> /proc/PID/root/usr/lib/whatever.so,
or could be NULL if module_name is the valid path */
Dwarf_Addr base,
char **file_name, /* OUTPUT, newly allocated string */
Elf **elfp, /* OUTPUT, elf struct, owned by elfutils */
int *fdp /* OUTPUT, fd, owned by elfutils */);
 
/* and the corresponding API to save a newly created Elf */
dwfl_process_tracker_cache_elf (tracker,
const char *module_name,
const char *file_name, /* -> /proc/PID/root/usr/lib/whatever.so */
Elf *elf, int fd);

> Is ownership of FD transferred to `dup()`d?
Not sure about the exact meaning of the question.
The relevant code is inside the sysprof_live_process_find_elf callback,
and the FD is returned to the caller of that callback, within Elfutils.
Elfutils handles the FD ownership, so
it should not necessary to duplicate the FD on the Sysprof side,
unless Sysprof wants to use the FD for its own purposes independently
of what Elfutils is doing with it.

-- 
All the best,
Serhei
http://serhei.io


Re: [PATCH] src/readelf.c: Access symbol and version data only if available

2025-04-15 Thread Mark Wielaard
Hi Aaron, Hi Constantine,

On Thu, 2025-04-10 at 10:52 -0400, Aaron Merey wrote:
> handle_dynamic_symtab can attempt to read symbol and version data from
> file offset 0 if the associated DT_ tags aren't found.
> 
> Fix this by only reading symbol and version data when non-zero file
> offsets have been found.
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=32864

So I think the basic observation that elf_getdata_rawchunk happily
returns a chunk of the ELF file starting at offset zero (which is
basically never what you want) is correct. Given that symdata,
symstrdata, versym_data, verdef_data and verneed_data are initialized
to NULL, checking for the offs index to not be zero before trying to
call elf_getdata_rawchunk makes sense.

It is a good update.

But note that the code itself isn't very robust against bad input. Both
verneed_data->d_buf and verdef_data->d_buf are used without checking
whether verneed_data or verdef_data is NULL. At least against bad input
data. It looks to me as if addrs[i_verdefnum] (DT_VERDEFNUM) and
addrs[i_verneednum] (DT_VERNEEDNUM) could be set without
offs[i_verneed] or offs[iverdef] being set. So maybe explicitly also
check that (or file a bug report to check that in the future)?

Thanks,

Mark

> Suggested-by: Constantine Bytensky 
> Signed-off-by: Aaron Merey 
> ---
>  src/readelf.c | 50 +++---
>  1 file changed, 31 insertions(+), 19 deletions(-)
> 
> diff --git a/src/readelf.c b/src/readelf.c
> index 5b0e7b47..0452977f 100644
> --- a/src/readelf.c
> +++ b/src/readelf.c
> @@ -3047,18 +3047,25 @@ handle_dynamic_symtab (Ebl *ebl)
>Elf_Data *verdef_data = NULL;
>Elf_Data *verneed_data = NULL;
>  
> -  symdata = elf_getdata_rawchunk (
> -  ebl->elf, offs[i_symtab],
> -  gelf_fsize (ebl->elf, ELF_T_SYM, syments, EV_CURRENT), ELF_T_SYM);
> -  symstrdata = elf_getdata_rawchunk (ebl->elf, offs[i_strtab], 
> addrs[i_strsz],
> - ELF_T_BYTE);
> -  versym_data = elf_getdata_rawchunk (
> -  ebl->elf, offs[i_versym], syments * sizeof (Elf64_Half), ELF_T_HALF);
> +  if (offs[i_symtab] != 0)
> +symdata = elf_getdata_rawchunk (
> + ebl->elf, offs[i_symtab],
> + gelf_fsize (ebl->elf, ELF_T_SYM, syments, EV_CURRENT), ELF_T_SYM);
> +
> +  if (offs[i_strtab] != 0)
> +symstrdata = elf_getdata_rawchunk (ebl->elf, offs[i_strtab], 
> addrs[i_strsz],
> +ELF_T_BYTE);
> +
> +  if (offs[i_versym] != 0)
> +versym_data = elf_getdata_rawchunk (
> + ebl->elf, offs[i_versym], syments * sizeof (Elf64_Half), ELF_T_HALF);
>  
>/* Get the verneed_data without vernaux.  */
> -  verneed_data = elf_getdata_rawchunk (
> -  ebl->elf, offs[i_verneed], addrs[i_verneednum] * sizeof 
> (Elf64_Verneed),
> -  ELF_T_VNEED);
> +  if (offs[i_verneed] != 0)
> +verneed_data = elf_getdata_rawchunk (
> + ebl->elf, offs[i_verneed], addrs[i_verneednum] * sizeof (Elf64_Verneed),
> + ELF_T_VNEED);
> +
>size_t vernauxnum = 0;
>size_t vn_next_offset = 0;
>  
> @@ -3071,14 +3078,18 @@ handle_dynamic_symtab (Ebl *ebl)
>  }
>  
>/* Update the verneed_data to include the vernaux.  */
> -  verneed_data = elf_getdata_rawchunk (
> -  ebl->elf, offs[i_verneed],
> -  (addrs[i_verneednum] + vernauxnum) * sizeof (GElf_Verneed), 
> ELF_T_VNEED);
> +  if (offs[i_verneed] != 0)
> +verneed_data = elf_getdata_rawchunk (
> + ebl->elf, offs[i_verneed],
> + (addrs[i_verneednum] + vernauxnum) * sizeof (GElf_Verneed),
> + ELF_T_VNEED);
>  
>/* Get the verdef_data without verdaux.  */
> -  verdef_data = elf_getdata_rawchunk (
> -  ebl->elf, offs[i_verdef], addrs[i_verdefnum] * sizeof (Elf64_Verdef),
> -  ELF_T_VDEF);
> +  if (offs[i_verdef] != 0)
> +verdef_data = elf_getdata_rawchunk (
> + ebl->elf, offs[i_verdef], addrs[i_verdefnum] * sizeof (Elf64_Verdef),
> + ELF_T_VDEF);
> +
>size_t verdauxnum = 0;
>size_t vd_next_offset = 0;
>  
> @@ -3091,9 +3102,10 @@ handle_dynamic_symtab (Ebl *ebl)
>  }
>  
>/* Update the verdef_data to include the verdaux.  */
> -  verdef_data = elf_getdata_rawchunk (
> -  ebl->elf, offs[i_verdef],
> -  (addrs[i_verdefnum] + verdauxnum) * sizeof (GElf_Verdef), ELF_T_VDEF);
> +  if (offs[i_verdef] != 0)
> +verdef_data = elf_getdata_rawchunk (
> + ebl->elf, offs[i_verdef],
> + (addrs[i_verdefnum] + verdauxnum) * sizeof (GElf_Verdef), ELF_T_VDEF);
>  
>unsigned int nsyms = (unsigned int)syments;
>process_symtab (ebl, nsyms, 0, 0, 0, symdata, versym_data, symstrdata,



Re: [PATCH v2 06/12] libdwfl [6/12]: Elf* caching via dwfl_process_tracker

2025-04-15 Thread Serhei Makarov



On Fri, Apr 4, 2025, at 5:04 PM, Serhei Makarov wrote:
> Changes for v2:
>
> - Add locking for elftab. This is needed in addition to the
>   intrinsic locking in dynamicsizehash_concurrent to avoid
>   having cache_elf expose an incomplete dwfltracker_elf_info*
>   entry to other threads while its data is being populated /
>   replaced.
>
> - Tidy dwfl_process_tracker_find_elf.c into the main find_elf
>   callback and two functions to consider (in future) making into
>   a public api for custom cached callbacks.
Note: going to apply one nontrivial change to this patch -- taking dev/ino into 
account in the caching key,
to allow caching modules in container filesystems (which Sysprof accesses via 
/proc/PID/root).
The "/proc/PID/root/MODULE" path is unique per-process, so we need to cache by 
"MODULE+dev+ino"
in order to catch repeated instantiations of a module from a container image.

The concept is fairly clear, but I'm muddled on how to expose it in the public 
API.

Option 1: expose two find_cached_elf functions.

// normal use case
module_name = "/usr/lib/whatever.so";
dwfl_process_tracker_find_cached_elf (tracker, module_name, &file_name, &elf, 
&fd);
/* This will access the file at module_name */

// use case for executable in container
module_name = "/usr/lib/whatever.so";
file_name = "/proc/PID/root/usr/lib/whatever.so";
struct stat sb; stat (file_name, &sb);
dwfl_process_tracker_check_cached_elf_path (tracker, module_name,
sb.st_dev, sb.st_ino, file_name, &elf, &fd);
/* This will access the file at file_name

Option 2: expose one function, treat file_name and fd as optional input 
parameters.

// normal use case
module_name = "/usr/lib/whatever.so"; /* name of module */
file_name = "/proc/PID/root/usr/lib/whatever.so"; /* actual location of module 
*/
dwfl_process_tracker_find_cached_elf (tracker, module_name, &file_name, &elf, 
&fd);
/* Since file_name is specified, the dwfl_process_tracker
   will access the module at file_name, and cache by
   module_name + dev + ino. */

Either way, Sysprof can include its own version of the 
dwfl_process_tracker_find_elf()
function (sysprof_live_process_find_elf) which calls these as appropriate.
I'm leaning towards Option 2 but wanted to present both for consideration.


Re: [PATCH v2 06/12] libdwfl [6/12]: Elf* caching via dwfl_process_tracker

2025-04-15 Thread Christian Hergert

On 4/15/25 09:12, Serhei Makarov wrote:

Option 2: expose one function, treat file_name and fd as optional input 
parameters.

// normal use case
module_name = "/usr/lib/whatever.so"; /* name of module */
file_name = "/proc/PID/root/usr/lib/whatever.so"; /* actual location of module 
*/
dwfl_process_tracker_find_cached_elf (tracker, module_name, &file_name, &elf, 
&fd);
/* Since file_name is specified, the dwfl_process_tracker
will access the module at file_name, and cache by
module_name + dev + ino. */


What allocator owns `file_name`? Can they be const in both directions? 
Is ownership of FD transferred to `dup()`d?


-- Christian



[PATCH] libdw: Fix eu_search_tree TOCTOU bugs

2025-04-15 Thread Aaron Merey
eu_tfind is used to facilitate lazy loading throughout libdw.
If a result is not found via eu_tfind, work is done to load
the result and cache it in an eu_search_tree.

Some calls to eu_tfind allow for TOCTOU bugs.  Multiple threads
might race to call eu_tfind on some result that hasn't yet been
cached.  Multiple threads may then attempt to load the result
which may cause an unnecessary amount of memory may be allocated.
Additionally this memory may not get released when the associated
libdw data structure is freed.

Fix this by adding additional locking to ensure that only one
thread performs lazy loading.

One approach used in this patch is to preserve calls to eu_tfind
without additional locking, but when the result isn't found then
a lock is then used to synchronize access to the lazy loading code.
An extra eu_tfind call has been added at the start of these critical
section to synchronize verification that lazy loading should proceed.

Another approach used is to simply synchronize entire calls to
functions where lazy loading via eu_tfind might occur (__libdw_find_fde
and __libdw_intern_expression).

libdw/
* cfi.h (struct Dwarf_CFI_s): Declare new mutex.
* dwarf_begin_elf.c (valid_p): Initialize all locks for fake CUs.
* dwarf_cfi_addrframe.c (dwarf_cfi_addrframe): Place lock around
__libdw_find_fde.
* dwarf_end.c (cu_free): Deallocate all locks unconditionally,
whether or not the CU is fake.
* dwarf_frame_cfa.c (dwarf_frame_cfa): Place lock around
__libdw_intern_expression.
* dwarf_frame_register.c (dwarf_frame_register): Ditto.
* dwarf_getcfi.c (dwarf_getcfi): Initialize cfi lock.
* dwarf_getlocation.c (is_constant_offset): Synchronize access
to lazy loading section.
(getlocation): Place lock around __libdw_intern_expression.
* dwarf_getmacros.c (cache_op_table): Synchronize access to lazy
loading section.
* frame-cache.c (__libdw_destroy_frame_cache): Free Dwarf_CFI
mutex.
* libdwP.h (struct Dwarf): Update macro_lock comment.
(struct Dwarf_CU): Declare new mutex.
libdw_findcu.c (__libdw_intern_next_unit): Initialize
intern_lock.
(__libdw_findcu): Adjust locking so that the first eu_tfind
can be done without extra lock overhead.

Signed-off-by: Aaron Merey 
---
 libdw/cfi.h  |  4 +++
 libdw/dwarf_begin_elf.c  | 15 +
 libdw/dwarf_cfi_addrframe.c  |  3 ++
 libdw/dwarf_end.c| 10 +++---
 libdw/dwarf_frame_cfa.c  |  2 ++
 libdw/dwarf_frame_register.c | 16 +
 libdw/dwarf_getcfi.c |  1 +
 libdw/dwarf_getlocation.c| 63 ++--
 libdw/dwarf_getmacros.c  | 16 -
 libdw/frame-cache.c  |  1 +
 libdw/libdwP.h   |  6 +++-
 libdw/libdw_findcu.c |  9 --
 12 files changed, 108 insertions(+), 38 deletions(-)

diff --git a/libdw/cfi.h b/libdw/cfi.h
index d0134243..48e42a41 100644
--- a/libdw/cfi.h
+++ b/libdw/cfi.h
@@ -98,6 +98,10 @@ struct Dwarf_CFI_s
   /* Search tree for parsed DWARF expressions, indexed by raw pointer.  */
   search_tree expr_tree;
 
+  /* Should be held when calling __libdw_find_fde and when 
__libdw_intern_expression
+ is called with Dwarf_CFI members.  */
+  mutex_define(, lock);
+
   /* Backend hook.  */
   struct ebl *ebl;
 
diff --git a/libdw/dwarf_begin_elf.c b/libdw/dwarf_begin_elf.c
index 58a309e9..63e2805d 100644
--- a/libdw/dwarf_begin_elf.c
+++ b/libdw/dwarf_begin_elf.c
@@ -359,6 +359,11 @@ valid_p (Dwarf *result)
  result->fake_loc_cu->version = 4;
  result->fake_loc_cu->split = NULL;
  eu_search_tree_init (&result->fake_loc_cu->locs_tree);
+ rwlock_init (result->fake_loc_cu->abbrev_lock);
+ rwlock_init (result->fake_loc_cu->split_lock);
+ mutex_init (result->fake_loc_cu->src_lock);
+ mutex_init (result->fake_loc_cu->str_off_base_lock);
+ mutex_init (result->fake_loc_cu->intern_lock);
}
 }
 
@@ -387,6 +392,11 @@ valid_p (Dwarf *result)
  result->fake_loclists_cu->version = 5;
  result->fake_loclists_cu->split = NULL;
  eu_search_tree_init (&result->fake_loclists_cu->locs_tree);
+ rwlock_init (result->fake_loclists_cu->abbrev_lock);
+ rwlock_init (result->fake_loclists_cu->split_lock);
+ mutex_init (result->fake_loclists_cu->src_lock);
+ mutex_init (result->fake_loclists_cu->str_off_base_lock);
+ mutex_init (result->fake_loclists_cu->intern_lock);
}
 }
 
@@ -420,6 +430,11 @@ valid_p (Dwarf *result)
  result->fake_addr_cu->version = 5;
  result->fake_addr_cu->split = NULL;
  eu_search_tree_init (&result->fake_addr_cu->locs_tree);
+ rwlock_init (result->fake_addr_cu->abbrev_lock);
+ rwlock_init (result->fake_addr_cu->split_lock);
+ mutex_init (result->fake_addr_cu->src_lock