detaching from threads

Omar Sandoval Wed, 30 Oct 2019 16:50:32 -0700

On Wed, Oct 30, 2019 at 01:47:52PM +0100, Mark Wielaard wrote:
> Hi Omar,
> 
> On Mon, 2019-10-07 at 02:05 -0700, Omar Sandoval wrote:
> > libdwfl has implementations of attaching to/detaching from threads
> > and
> > unwinding stack traces. However, that functionality is only available
> > through the dwfl_thread_getframes interface, which isn't very flexible.
> > This adds two new functions, dwfl_attach_thread and dwfl_detach_thread,
> > which separate the thread stopping functionality out of
> > dwfl_thread_getframes. Additionally, it makes dwfl_thread_getframes
> > cache the stack trace for threads stopped this way. This makes it
> > possible to use the frames after dwfl_thread_getframes returns.
> 
> The advantage of the dwfl_thread_getframes interface is that you cannot
> forget to "detach", so no Dwfl_Frames get dangling and (if the process
> is life) you don't disrupt it more than necessary. This new interface
> seems very simple to get wrong causing leaks and locking up
> processes/threads.


That is true, although I imagine that use cases which don't need the
additional flexibility would continue to use the simpler API.

> Also, if we would adopt dwfl_attach_thread then I think it should take
> a Dwfl_Thread object not a pid/tid as argument.

In that case, we'd probably want to expose the internal getthread
function with something like:

/* Find the thread with the given thread ID.  Returns zero if the
   thread was processed, -1 on error (including when no thread with the
   given thread ID exists), or the return value of the callback when not
   DWARF_CB_OK.  */
int dwfl_getthread (Dwfl *dwfl, pid_t tid,
                    int (*callback) (Dwfl_Thread *thread, void *arg),
                    void *arg)
  __nonnull_attribute__ (1, 3);

Then dwfl_attach_thread could be used with either dwfl_getthread or
dwfl_getthreads, which is nice. However, a crucial part of this feature
is being able to access the Dwfl_Thread outside of the callback. Since
the Dwfl_Thread is currently on the stack, I see a couple of options:

1. We keep Dwfl_Thread on the stack and make dwfl_attach_thread() return
   a copy (like it does in this patch).
2. We always allocate the Dwfl_Thread on the heap and free it before
   returning from dwfl_getthread(s) _unless_ dwfl_attach_thread() was
   called. If it was, it will be freed by dwfl_detach_thread() instead.

Option 2 sounds better to me. What do you think?

> Could you provide some examples where this new interface/style is
> beneficial?

dwfl_attach_thread could be used to implement something like `gdb -p
$pid`: attach to a thread, stop it, and poke at it further.
dwfl_detach_thread would be kind of like GDB's `continue` command.

I applied these patches to the version of elfutils that I bundle with
drgn; you can see how I use the interface here [1]. Basically, drgn has
an API to get the stack trace for a thread [2] and get the value of
registers (and in the future, local variables) at each stack frame [3].
When I get the stack trace, I use dwfl_attach_thread and
dwfl_thread_getframes. The user can then access the frames to their
heart's content. When they're done with it, I free everything with
dwfl_detach_thread. (The attach/detach is currently tied to the lifetime
of the stack trace because I don't yet support actually pausing threads,
but I plan to support that, hopefully using the same libdwfl
attach/detach API.)

Thanks, and sorry for the wall of text :)

1: https://github.com/osandov/drgn/blob/master/libdrgn/stack_trace.c
2: 
https://drgn.readthedocs.io/en/latest/api_reference.html#drgn.Program.stack_trace
3: https://drgn.readthedocs.io/en/latest/api_reference.html#stack-traces

Re: [PATCH 3/5] libdwfl: add interface for attaching to/detaching from threads

Reply via email to