Frame pointers also have the disadvantage of working only with AOT-compiled 
languages for which a trace analysis tool can associate an instruction pointer 
with a semantically-relevant bit of code. If you try to use frame pointers to 
profile a Python program, all you're going to get is a profile of the 
interpreter. It seems like the debate is between those who want observability 
(via frame pointers) and those who want the performance benefits of 
-fomit-frame-pointer.

There's a third way.

See, both pro-FP and anti-FP camps think that it's the kernel that has to do 
the unwinding unless we copy whole stacks into traces. Why should that be? As 
mentioned in [1], instead of finding a way to have the kernel unwind user 
programs, we can create a protocol through which the kernel can ask usermode to 
unwind itself. It could work like this:

1) backtrace requested in the kernel (e.g. in response to a perf counter overflow)

2) kernel unwinds itself to the userspace boundary the usual way

3) kernel forms a nonce (e.g. by incrementing a 64-bit counter)

4) kernel logs a stack trace the usual way (e.g. to the ftrace ring buffer), 
but with the final frame referring to the nonce created in the previous step

5) kernel queues a signal (one userspace has explicitly opted into via a new 
prctl()); the siginfo_t structure encodes (e.g. via si_status and si_value) the 
nonce

6) kernel eventually returns to userspace; queued signal handler gains control

7) signal handler unwinds the calling thread however it wants (and can sleep 
and take page faults if needed)

8) signal handler logs the result of its unwind, along with the nonce, to the 
system log (e.g. via a new system call, a sysfs write, an io_uring submission, 
etc.) 

Post-processing tools can associate kernel stacks with user stacks tagged with 
the corresponding nonces and reconstitute the full stacks in effect at the time 
of each logged event. 
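
To make the post-processing step concrete, here is a minimal sketch in Python of how a tool might join the two logs on the nonce. The record layout, the field names, and the sample stacks are all made up for illustration; the scheme above doesn't define a log format.

```python
# Illustrative records only: a kernel record ends in a nonce rather than
# in user frames, and a user record carries the same nonce.
kernel_events = [
    {"event": "cycles", "kernel_stack": ["perf_event_overflow", "entry_SYSCALL_64"], "nonce": 1},
    {"event": "sched_switch", "kernel_stack": ["schedule", "entry_SYSCALL_64"], "nonce": 1},
    {"event": "cycles", "kernel_stack": ["perf_event_overflow"], "nonce": 2},
]
user_unwinds = [
    {"nonce": 1, "user_stack": ["read", "load_config", "main"]},
]


def reconstitute(kernel_events, user_unwinds):
    """Return (event, full_stack) pairs, innermost frame first."""
    by_nonce = {u["nonce"]: u["user_stack"] for u in user_unwinds}
    full = []
    for ev in kernel_events:
        user = by_nonce.get(ev["nonce"])
        if user is None:
            # The user half may never arrive (e.g. the thread exited first).
            user = ["<no user stack logged for nonce %d>" % ev["nonce"]]
        full.append((ev["event"], ev["kernel_stack"] + user))
    return full
```

Note that the first two kernel records share nonce 1, so a single user unwind completes both of them.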

We can avoid duplicating unwinds too: at step #3, if the kernel finds that the 
current thread already has an unwind pending, it can use the already-pending 
nonce instead of making a new one and queuing a signal: many kernel stacks can 
end with the same user stack "tail".
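
The nonce reuse could look roughly like the following, modelled in Python rather than kernel C. Everything here (ThreadUnwindState, request_user_unwind, user_unwind_done, the signals_queued list) is a hypothetical name for illustration, and I'm assuming the pending state is cleared once the signal handler has run.

```python
import itertools


class ThreadUnwindState:
    """Hypothetical per-thread bookkeeping for a pending user unwind."""

    def __init__(self):
        self.pending_nonce = None


_next_nonce = itertools.count(1)  # stands in for the 64-bit counter
signals_queued = []               # stands in for the queued signals


def request_user_unwind(thread):
    """Return the nonce to log as the final frame of a kernel stack."""
    if thread.pending_nonce is not None:
        # An unwind is already pending: reuse its nonce, queue nothing.
        return thread.pending_nonce
    thread.pending_nonce = next(_next_nonce)
    signals_queued.append(thread.pending_nonce)
    return thread.pending_nonce


def user_unwind_done(thread):
    """Assumed to run once the handler has logged the user stack."""
    thread.pending_nonce = None
```

Two samples taken before the thread gets back to userspace receive the same nonce and cost one signal; a sample taken after user_unwind_done() gets a fresh nonce.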

One nice property of this scheme is that the userspace unwinding isn't limited 
to native code. Libc could arbitrate unwinding across an arbitrary number of 
managed runtime environments in the context of a single process: the system 
could be smart enough to know that instead of unwinding through, e.g. Python 
interpreter frames, the unwinder (which is normal userspace code, pluggable via 
DSO!) could traverse and log *Python* stack frames instead, with meaningful 
function names. And if you happened to have, say, a JavaScript runtime in the 
same process, both JavaScript and Python could participate in the semantic 
unwinding process.
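
As a rough illustration of that arbitration, here is a toy dispatcher in Python. It assumes each runtime registers a plugin that recognizes its own interpreter frames and substitutes semantic frames for them. The plugin interface, the "py:"/"js:" prefixes, and the js_interpreter_loop symbol are invented for this sketch; _PyEval_EvalFrameDefault is CPython's real eval-loop function, but the one-to-one mapping to Python frames is a simplification.

```python
def make_plugin(prefix, interpreter_symbol, semantic_frames):
    """Build a plugin that swaps each interpreter frame for a semantic one."""
    remaining = list(semantic_frames)  # innermost first

    def plugin(native_symbol):
        if native_symbol == interpreter_symbol and remaining:
            return [prefix + ":" + remaining.pop(0)]
        return None  # not ours; let another plugin or the default handle it

    return plugin


def semantic_unwind(native_stack, plugins):
    """Walk native frames, letting the first plugin that claims one replace it."""
    out = []
    for symbol in native_stack:
        for plugin in plugins:
            frames = plugin(symbol)
            if frames is not None:
                out.extend(frames)
                break
        else:
            out.append(symbol)  # ordinary native frame, logged as-is
    return out


native = ["read", "_PyEval_EvalFrameDefault", "js_interpreter_loop",
          "_PyEval_EvalFrameDefault", "main"]
plugins = [
    make_plugin("py", "_PyEval_EvalFrameDefault", ["parse_row", "run"]),
    make_plugin("js", "js_interpreter_loop", ["onMessage"]),
]
mixed = semantic_unwind(native, plugins)
```

Here mixed interleaves native, Python, and JavaScript frames in call order, which is the kind of output a profile of a mixed-runtime process would want.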

A pluggable userspace unwind mechanism would have zero cost in the case that 
we're not recording stack frames. On top of that, a pluggable userspace 
unwinder *could* be written to traverse frame pointers just as the kernel 
unwinder does today, if userspace thinks that's the best option. Without 
breaking kernel ABI, that userspace unwinder could use DWARF, or ORC, or any 
other userspace unwinding approach. It's future-proof.

In other words, the choice between frame pointers and no frame pointers is a false 
dichotomy. There's a better approach. The Linux ecosystem as a whole would be 
better off building something like the pluggable userspace asynchronous 
unwinding infrastructure described above.

[1] 
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/646XXHGEGOKO465LQKWCPPPAZBSW5NWO/
 