On Dienstag, 10. Dezember 2024 22:42:25 Mitteleuropäische Normalzeit Serhei Makarov wrote: > This email sketches an 'unwinder cache' interface for libdwfl, derived from > recent eu-stacktrace code [1] and based on Christian Hergert's adaptation > of eu-stacktrace as sysprof-live-unwinder [2]. The intent is to remove the > need for a lot of boilerplate code that would be identical among profiling > tools that use libdwfl to unwind perf_events stack samples. But since this > becomes a library design, it's in need of feedback and bikeshedding.
Thank you all for this initiative, it's much appreciated. Once you have an API available, I would be willing to test it in the scope of perfparser/hotspot and report back how it performs compared to what we have build there over the years. Some more notes inline below. > [1]: > https://sourceware.org/cgit/elfutils/tree/README.eu-stacktrace?h=users/serh > ei/eu-stacktrace [2]: > https://gitlab.gnome.org/GNOME/sysprof/-/merge_requests/110 > > Here's my initial take on what to implement, keeping things as simple as > possible. I'll follow up with more emails if or when my understanding of > the best design evolves during further prototyping. > > (1) Suggested interface for a new struct Dwfl_Process_Table: > > Dwfl_Process_Table *dwfl_process_table_begin (const Dwfl_Callbacks > *callbacks); > Dwfl *dwfl_process_table_find_pid (Dwfl_Process_Table > *dwfltab, pid_t pid, bool create_if_missing); For symmetry reasons, we also need an `..._end` method. And what about explicitly removing processes once they have finished (PERF_RECORD_EXIT)? > Dwfl, although documented in the header as a "session with the library", > actually supports reporting/attaching one process at a time. Its real > meaning is therefore "elfutils data for working with a particular > Dwfl_Process". > > So, for global profiling, eu-stacktrace or the profiler end up creating a > table pid->Dwfl, with one Dwfl structure per process. I'd like elfutils to > provide this table, and in the implementation we could work on removing > redundant work between processes. Looking at Dwfl_Module, the relocations > are obviously specific to each process, but a lot of the Elf and Dwarf data > is potentially shareable between Dwfl instances. I agree, this is crucial to have. Esp. when you profile multiple short-lived applications, if you encounter compressed debug information the current API design would force you to uncompress the data once per process which is very bad. Great that you are taking this into account right from the start! > (2) Suggested interface for existing struct Dwfl: > > dwfl_perf_sample_getframes (dwfl, tid, > stack, stack_size, > regs, n_regs, abi, > perf_regs_mask, > int (*callback) > (Dwfl_Frame *state, void *arg), void *arg); > /* could also implement dwfl_thread_perf_sample_getframes(thread, ... args > as above ...) */ /* could package the args from stack .. perf_regs_mask > into a struct Dwfl_Stack_Sample? */ > > This implements the unwinding work that eu-stacktrace currently does on > behalf of the profiler. What is the scope of this API - "just" the normal stack unwinding? While useful, this will leave a lot of complicated problems to solve - most notably for symbolication and inline frame resolution. Should we add easier-to-use API for those too, while at it? > (3) Another thing the profiler needs to do is react to perf mmap events by > extending the Dwfl's representation of the process. Currently this is > implementable by calling dwfl_report_begin_add and imitating what > dwfl_linux_proc_maps_report does. The code to do this might be worth > rolling into libdwfl as a call along the lines of > > dwfl_report_proc_map (dwfl, filename, start, end, inode); > > though the exact details have to be worked out. I'm sure there's some > pitfalls here I haven't discovered yet. Correct handling of mmap events is very hairy in practice - at least in my experience. In perfparser we have quite some code (cf. PerfElfMap) that tries to translate the raw mmap events into a format that is consumable by elfutils. Sometimes the regions are seemingly overlapping e.g. I would also request that this API does not deal with raw `start`, `end` values - because that's not what you get. You get `addr`, `len`, `pgoff` and then you must figure out how to correctly assemble these three values. Cheers > === > > * Why not the existing eu-stacktrace fifo interface? > > In practice, the fifo solution proves to be unsatisfying in terms of > perceived performance -- extra hoops through which the data must jump > between processes -- and maintainability -- the Sysprof binary format needs > an extension for representing stack data, with nontrivial details to > bikeshed there. The bikeshedding effort may as well be spent on designing a > library interface to the eu-stacktrace functionality. > > * Why do we need a Dwfl_Process_Table struct independent of Dwfl? > > As mentioned above, putting it in the library gives us a place to implement > caching of data/work that's redundant between different Dwfls. There's a > different way of designing this that would involve having Dwfl be the > global session struct and replacing the single Dwfl::process field with a > table of processes, but that requires redoing the existing library > interface to be parametrized by Dwfl_Process instead of just Dwfl, grossly > disruptive relative to the level of functionality I'm actually adding. We > could also have Dwfls share data through a hidden global variable, but it > seems gross to enable such sneaky caching without the user of the library > explicitly requesting it via Dwfl_Process_Table. I agree with your proposal. But please take thread safety into account while creating your design. Right now, one could in theory at least process/unwind multiple processes in parallel. It would be great to keep this functionality even when dealing with multiple processes that share some common data behind the scenes! Thanks -- Milian Wolff m...@milianw.de http://milianw.de
signature.asc
Description: This is a digitally signed message part.