Re: Yet another unwinding approach

Neal Gompa Mon, 16 Jan 2023 12:35:19 -0800

On Mon, Jan 16, 2023 at 3:30 PM Daniel Colascione <dan...@dancol.org> wrote:
>
> Frame pointers also have the disadvantage of working only with AOT-compiled 
> languages for which a trace analysis tool can associate an instruction 
> pointer with a semantically-relevant bit of code. If you try to use frame 
> pointers to profile a Python program, all you're going to get is a profile of 
> the interpreter. It seems like the debate is between those who want 
> observability (via frame pointers) and those who want the performance 
> benefits of -fomit-frame-pointer.
>
> There's a third way.
>
> See, both pro-FP and anti-FP camps think that it's the kernel that has to do 
> the unwinding unless we copy whole stacks into traces. Why should that be? As 
> mentioned in [1], instead of finding a way to have the kernel unwind user 
> programs, we can create a protocol through which the kernel can ask usermode 
> to unwind itself. It could work like this:
>
> 1) backtrace requested in the kernel (e.g. in response to a perf counter overflow)
>
> 2) kernel unwinds itself to the userspace boundary the usual way
>
> 3) kernel forms a nonce (e.g. by incrementing a 64-bit counter)
>
> 4) kernel logs a stack trace the usual way (e.g. to the ftrace ring buffer), 
> but with the final frame referring to the nonce created in the previous step
>
> 5) kernel queues a signal (one userspace has explicitly opted into via a new 
> prctl()); the siginfo_t structure encodes (e.g. via si_status and si_value) 
> the nonce
>
> 6) kernel eventually returns to userspace; queued signal handler gains control
>
> 7) signal handler unwinds the calling thread however it wants (and can sleep 
> and take page faults if needed)
>
> 8) signal handler logs the result of its unwind, along with the nonce, to the 
> system log (e.g. via a new system call, a sysfs write, an io_uring 
> submission, etc.)
>
> Post-processing tools can associate kernel stacks with user stacks tagged 
> with the corresponding nonces and reconstitute the full stacks in effect at 
> the time of each logged event.
>
> We can avoid duplicating unwinds too: at step #3, if the kernel finds that 
> the current thread already has an unwind pending, it can use the 
> already-pending nonce instead of making a new one and queuing a signal: many 
> kernel stacks can end with the same user stack "tail".
>
> One nice property of this scheme is that the userspace unwinding isn't 
> limited to native code. Libc could arbitrate unwinding across an arbitrary 
> number of managed runtime environments in the context of a single process: 
> the system could be smart enough to know that instead of unwinding through, 
> e.g. Python interpreter frames, the unwinder (which is normal userspace code, 
> pluggable via DSO!) could traverse and log *Python* stack frames instead, 
> with meaningful function names. And if you happened to have, say, a 
> JavaScript runtime in the same process, both JavaScript and Python could 
> participate in the semantic unwinding process.
>
> A pluggable userspace unwind mechanism would have zero cost in the case that 
> we're not recording stack frames. On top of that, a pluggable userspace 
> unwinder *could* be written to traverse frame pointers just as the kernel 
> unwinder does today, if userspace thinks that's the best option. Without 
> breaking kernel ABI, that userspace unwinder could use DWARF, or ORC, or any 
> other userspace unwinding approach. It's future-proof.
>
> In other words, the choice between frame pointers and no frame pointers is a 
> false dichotomy. There's a better approach. The Linux ecosystem as a whole 
> would be better off building something like the pluggable userspace 
> asynchronous unwinding infrastructure described above.
>
> [1] 
> https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/646XXHGEGOKO465LQKWCPPPAZBSW5NWO/
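
For anyone trying to picture the nonce bookkeeping in the quoted proposal, here
is a rough userspace simulation of it in Python. It is only a sketch under my
own assumptions: nothing here is a real kernel or libc interface, the Thread
and Tracer classes and every frame name are made up for illustration, and the
"signal" is modeled as a plain function call. It shows the two ideas that
matter for post-processing: several kernel stacks sharing one pending nonce,
and kernel and user halves being joined on that nonce afterwards.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Thread:
    """Hypothetical per-thread state; not a real kernel structure."""
    tid: int
    pending_nonce: Optional[int] = None  # set while a user unwind is queued


class Tracer:
    """Toy model of the trace log plus the post-processing join."""

    def __init__(self) -> None:
        self._next_nonce = count(1)  # step 3: incrementing counter
        self.kernel_log: List[Tuple[int, List[str]]] = []
        self.user_log: Dict[int, List[str]] = {}
        self.signals_queued = 0

    def sample(self, thread: Thread, kernel_frames: List[str]) -> None:
        # Steps 3-5: reuse the pending nonce if there is one, otherwise
        # mint a new nonce and "queue a signal" for the thread.
        if thread.pending_nonce is None:
            thread.pending_nonce = next(self._next_nonce)
            self.signals_queued += 1
        # Step 4: the kernel half is logged with the nonce as its tail.
        self.kernel_log.append((thread.pending_nonce, list(kernel_frames)))

    def return_to_user(self, thread: Thread,
                       unwind: Callable[[], List[str]]) -> None:
        # Steps 6-8: the handler unwinds however it likes and logs the
        # result under the nonce; the pending state is then cleared.
        if thread.pending_nonce is None:
            return
        self.user_log[thread.pending_nonce] = unwind()
        thread.pending_nonce = None

    def stitch(self) -> List[List[str]]:
        # Post-processing: append the user stack tagged with the same
        # nonce to each kernel stack (innermost frame first).
        missing = ["<user stack missing>"]
        return [frames + self.user_log.get(nonce, missing)
                for nonce, frames in self.kernel_log]


if __name__ == "__main__":
    tracer, thread = Tracer(), Thread(tid=1)
    tracer.sample(thread, ["k:counter_overflow", "k:syscall_entry"])
    tracer.sample(thread, ["k:sched_switch", "k:syscall_entry"])
    tracer.return_to_user(thread, lambda: ["py:handle_request", "py:main"])
    for stack in tracer.stitch():
        print(" <- ".join(stack))
```

The unwind callback stands in for the pluggable userspace unwinder: it could
walk frame pointers, DWARF, or interpreter frames, and the join on the nonce
would not change.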


This sounds great, but how is it going to get made? And is the kernel
amenable to this in the first place?



-- 
真実はいつも一つ！/ Always, there's only one truth!
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue
