On Wed, Oct 30, 2024 at 09:44:14AM -0400, Mathieu Desnoyers wrote:
> What you want here is to move the point where you clear the task
> cookie to _after_ completion of stack unwind. There are a few ways
> this can be done:
> 
> A) Clear the task cookie in the task_work() right after the
>    unwind_user_deferred() is completed. Downside: if some long task work
>    happens to be done after the stack walk, a new unwind_user_deferred()
>    could be issued again and we may end up looping forever taking stack
>    unwind and never actually making forward progress.
> 
> B) Clear the task cookie after the exit_to_user_mode_loop is done,
>    before returning to user-space, while interrupts are disabled.

Problem is, if another tracer calls unwind_user_deferred() for the first
time, after the task work but before the task cookie gets cleared, it
will see the cookie is non-zero and will fail to schedule another task
work.  So its callback never gets called.
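
To make the window concrete, here is a toy user-space model of option B.
It is only a sketch: struct task_model and every helper name below are
made up for illustration, and it assumes a non-zero cookie is the only
thing a requester checks before queueing task work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TRACERS 2

/* Hypothetical stand-in for the per-task unwind state. */
struct task_model {
	uint64_t cookie;                 /* 0 means "no unwind requested yet" */
	bool work_queued;                /* models a pending task work item */
	bool requested[MAX_TRACERS];     /* tracers still waiting for a callback */
	int callbacks_run[MAX_TRACERS];  /* how often each callback was invoked */
};

/* Models a deferred unwind request from one tracer. */
static void request_unwind(struct task_model *t, int tracer)
{
	assert(tracer >= 0 && tracer < MAX_TRACERS);
	t->requested[tracer] = true;
	if (t->cookie)
		return;  /* assumes a non-zero cookie means work is already queued */
	t->cookie = 1;
	t->work_queued = true;
}

/* Models the task work: unwind once, then call every waiting tracer. */
static void run_task_work(struct task_model *t)
{
	if (!t->work_queued)
		return;
	t->work_queued = false;
	for (int i = 0; i < MAX_TRACERS; i++) {
		if (t->requested[i]) {
			t->callbacks_run[i]++;
			t->requested[i] = false;
		}
	}
}

/* Option B: clear the cookie late, just before returning to user space. */
static void clear_cookie(struct task_model *t)
{
	t->cookie = 0;
}
```

Calling request_unwind() for tracer 0, then run_task_work(), then
request_unwind() for tracer 1, then clear_cookie() leaves tracer 1
still marked as requested with zero callbacks run.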

> > If I change the entry code to increment a per-task counter instead of a
> > per-cpu counter then this problem goes away.  I was just concerned about
> > the performance impacts of doing that on every entry.
> 
> Moving from per-cpu to per-task makes this cookie task-specific and not
> global anymore, I don't think we want this for a stack walking
> infrastructure meant to be used by various tracers. Also a global cookie
> is more robust and does not depend on guaranteeing that all the
> trace data is present to guarantee current thread ID accuracy and
> thus that cookies match between deferred unwind request and their
> fulfillment.

I don't disagree.  What I meant was, on entry (or exit), increment the
task cookie *with* the CPU bits included.

Or, as you suggested previously, the "cookie" could just be a struct
with two fields: CPU # and per-task entry counter.
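
Roughly something like the below.  This is an untested user-space
sketch of the layout only: the 16/48 bit split, struct task_sketch and
all the helper names are assumptions for illustration, not code from
the patch set.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split: CPU number in the high 16 bits, entry counter in the low 48. */
#define COOKIE_CPU_SHIFT 48
#define COOKIE_CTR_MASK  ((UINT64_C(1) << COOKIE_CPU_SHIFT) - 1)

/* Hypothetical stand-in for the relevant per-task fields. */
struct task_sketch {
	uint64_t entry_ctr;  /* bumped on every entry from user space */
	uint64_t cookie;     /* 0 means "not generated yet for this entry" */
};

/* Entry path: one per-task increment, and the old cookie is invalidated. */
static void on_kernel_entry(struct task_sketch *t)
{
	t->entry_ctr++;
	t->cookie = 0;
}

/* First caller in this entry builds the cookie; later callers reuse it. */
static uint64_t get_cookie(struct task_sketch *t, unsigned int cpu)
{
	assert(cpu < (1u << 16));  /* must fit in the assumed 16 CPU bits */
	if (!t->cookie)
		t->cookie = ((uint64_t)cpu << COOKIE_CPU_SHIFT) |
			    (t->entry_ctr & COOKIE_CTR_MASK);
	return t->cookie;
}

/* Accessors for the two logical fields of the packed cookie. */
static unsigned int cookie_cpu(uint64_t cookie)
{
	return (unsigned int)(cookie >> COOKIE_CPU_SHIFT);
}

static uint64_t cookie_ctr(uint64_t cookie)
{
	return cookie & COOKIE_CTR_MASK;
}
```

In this sketch the cookie stays the same for the rest of the entry even
if the task migrates to another CPU after the first request, and it
changes on the next entry.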

-- 
Josh
