On Wed, Oct 30, 2019 at 01:47:52PM +0100, Mark Wielaard wrote: > Hi Omar, > > On Mon, 2019-10-07 at 02:05 -0700, Omar Sandoval wrote: > > libdwfl has implementations of attaching to/detaching from threads > > and > > unwinding stack traces. However, that functionality is only available > > through the dwfl_thread_getframes interface, which isn't very flexible. > > This adds two new functions, dwfl_attach_thread and dwfl_detach_thread, > > which separate the thread stopping functionality out of > > dwfl_thread_getframes. Additionally, it makes dwfl_thread_getframes > > cache the stack trace for threads stopped this way. This makes it > > possible to use the frames after dwfl_thread_getframes returns. > > The advantage of the dwfl_thread_getframes interface is that you cannot > forget to "detach", so no Dwfl_Frames get dangling and (if the process > is life) you don't disrupt it more than necessary. This new interface > seems very simple to get wrong causing leaks and locking up > processes/threads.
That is true, although I imagine that use cases which don't need the additional flexibility would continue to use the simpler API. > Also, if we would adopt dwfl_attach_thread then I think it should take > a Dwfl_Thread object not a pid/tid as argument. In that case, we'd probably want to expose the internal getthread function with something like: /* Find the thread with the given thread ID. Returns zero if the thread was processed, -1 on error (including when no thread with the given thread ID exists), or the return value of the callback when not DWARF_CB_OK. */ int dwfl_getthread (Dwfl *dwfl, pid_t tid, int (*callback) (Dwfl_Thread *thread, void *arg), void *arg) __nonnull_attribute__ (1, 3); Then dwfl_attach_thread could be used with either dwfl_getthread or dwfl_getthreads, which is nice. However, a crucial part of this feature is being able to access the Dwfl_Thread outside of the callback. Since the Dwfl_Thread is currently on the stack, I see a couple of options: 1. We keep Dwfl_Thread on the stack and make dwfl_attach_thread() return a copy (like it does in this patch). 2. We always allocate the Dwfl_Thread on the heap and free it before returning from dwfl_getthread(s) _unless_ dwfl_attach_thread() was called. If it was, it will be freed by dwfl_detach_thread() instead. Option 2 sounds better to me. What do you think? > Could you provide some examples where this new interface/style is > beneficial? dwfl_attach_thread could be used to implement something like `gdb -p $pid`: attach to a thread, stop it, and poke at it further. dwfl_detach_thread would be kind of like GDB's `continue` command. I applied these patches to the version of elfutils that I bundle with drgn; you can see how I use the interface here [1]. Basically, drgn has an API to get the stack trace for a thread [2] and get the value of registers (and in the future, local variables) at each stack frame [3]. When I get the stack trace, I use dwfl_attach_thread and dwfl_thread_getframes. The user can then access the frames to their heart's content. When they're done with it, I free everything with dwfl_detach_thread. (The attach/detach is currently tied to the lifetime of the stack trace because I don't yet support actually pausing threads, but I plan to support that, hopefully using the same libdwfl attach/detach API.) Thanks, and sorry for the wall of text :) 1: https://github.com/osandov/drgn/blob/master/libdrgn/stack_trace.c 2: https://drgn.readthedocs.io/en/latest/api_reference.html#drgn.Program.stack_trace 3: https://drgn.readthedocs.io/en/latest/api_reference.html#stack-traces