On Wed, Dec 11, 2024, at 6:24 AM, Milian Wolff wrote:
> On Tuesday, 10 December 2024, 22:42:25 Central European Standard Time Serhei
> Makarov wrote:
>> This email sketches an 'unwinder cache' interface for libdwfl, derived from
>> recent eu-stacktrace code [1] and based on Christian Hergert's adaptation
>> of eu-stacktrace as sysprof-live-unwinder [2]. The intent is to remove the
>> need for a lot of boilerplate code that would be identical among profiling
>> tools that use libdwfl to unwind perf_events stack samples. But since this
>> becomes a library design, it's in need of feedback and bikeshedding.
>
> Thank you all for this initiative, it's much appreciated. Once you have an
> API available, I would be willing to test it in the scope of perfparser/hotspot
> and report back how it performs compared to what we have built there over
> the years. Some more notes inline below.
Thank you for the feedback! I'm very relieved to hear there's interest in
having this API.
>> Dwfl_Process_Table *dwfl_process_table_begin (const Dwfl_Callbacks *callbacks);
>> Dwfl *dwfl_process_table_find_pid (Dwfl_Process_Table *dwfltab,
>>                                    pid_t pid, bool create_if_missing);
>
> For symmetry reasons, we also need an `..._end` method. And what about
> explicitly removing processes once they have finished (PERF_RECORD_EXIT)?
Agreed re: providing those methods. As discussed elsewhere in this thread,
there is some room for experimentation as to when exactly processes should be
removed, but that's a decision made above the level of this particular API.
> I agree, this is crucial to have. Esp. when you profile multiple short-lived
> applications, if you encounter compressed debug information the current API
> design would force you to uncompress the data once per process which is very
> bad. Great that you are taking this into account right from the start!
Yep.
> What is the scope of this API - "just" the normal stack unwinding? While
> useful, this will leave a lot of complicated problems to solve - most notably
> for symbolication and inline frame resolution. Should we add easier-to-use
> API for those too, while at it?
I think imitating the extent to which elfutils src/stack.c supports this is
warranted for the initial implementation. I'll have to see during my 2nd
iteration how much genuinely new API is needed for that, though. Beyond that
there are additional goals, e.g. to attain parity with Sysprof's symbolication
for containerized code.
>> (3) Another thing the profiler needs to do is react to perf mmap events by
>> extending the Dwfl's representation of the process. Currently this is
>> implementable by calling dwfl_report_begin_add and imitating what
>> dwfl_linux_proc_maps_report does. The code to do this might be worth
>> rolling into libdwfl as a call along the lines of
>>
>> dwfl_report_proc_map (dwfl, filename, start, end, inode);
>>
>> though the exact details have to be worked out. I'm sure there's some
>> pitfalls here I haven't discovered yet.
>
> Correct handling of mmap events is very hairy in practice - at least in my
> experience. In perfparser we have quite some code (cf. PerfElfMap) that
> tries to translate the raw mmap events into a format that is consumable by
> elfutils. Sometimes the regions are seemingly overlapping e.g.
>
> I would also request that this API does not deal with raw `start`, `end`
> values - because that's not what you get. You get `addr`, `len`, `pgoff` and
> then you must figure out how to correctly assemble these three values.
Ok, that seems feasible as an additional function wrapping the start...end one,
perhaps dwfl_report_perf_mmap (...).
In general I've found man perf_event_open(2) to be low on explanatory detail;
the actual documentation is the kernel source code :-)
I'll need to clearly distinguish the functions that accept data in the format
provided by perf -- both this part and the registers array are otherwise a
pitfall.
> I agree with your proposal. But please take thread safety into account while
> creating your design. Right now, one could in theory at least process/unwind
> multiple processes in parallel. It would be great to keep this functionality
> even when dealing with multiple processes that share some common data behind
> the scenes!
Ok. There has been some work on elfutils thread-safety already, so I will need
to study that and harmonize with it.