On Mon, Jan 16, 2023 at 3:30 PM Daniel Colascione <dan...@dancol.org> wrote:
>
> Frame pointers also have the disadvantage of working only with AOT-compiled
> languages for which a trace analysis tool can associate an instruction
> pointer with a semantically-relevant bit of code. If you try to use frame
> pointers to profile a Python program, all you're going to get is a profile of
> the interpreter. It seems like the debate is between those who want
> observability (via frame pointers) and those who want the performance
> benefits of -fomit-frame-pointer.
>
> There's a third way.
>
> See, both pro-FP and anti-FP camps think that it's the kernel that has to do
> the unwinding unless we copy whole stacks into traces. Why should that be? As
> mentioned in [1], instead of finding a way to have the kernel unwind user
> programs, we can create a protocol through which the kernel can ask usermode
> to unwind itself. It could work like this:
>
> 1) backtrace requested in the kernel (e.g. in response to a perf counter
> overflow)
>
> 2) kernel unwinds itself to the userspace boundary the usual way
>
> 3) kernel forms a nonce (e.g. by incrementing a 64-bit counter)
>
> 4) kernel logs a stack trace the usual way (e.g. to the ftrace ring buffer),
> but with the final frame referring to the nonce created in the previous step
>
> 5) kernel queues a signal (one userspace has explicitly opted into via a new
> prctl()); the siginfo_t structure encodes (e.g. via si_status and si_value)
> the nonce
>
> 6) kernel eventually returns to userspace; queued signal handler gains control
>
> 7) signal handler unwinds the calling thread however it wants (and can sleep
> and take page faults if needed)
>
> 8) signal handler logs the result of its unwind, along with the nonce, to the
> system log (e.g. via a new system call, a sysfs write, an io_uring
> submission, etc.)
>
> Post-processing tools can associate kernel stacks with user stacks tagged
> with the corresponding nonces and reconstitute the full stacks in effect at
> the time of each logged event.
>
> We can avoid duplicating unwinds too: at step #3, if the kernel finds that
> the current thread already has an unwind pending, it can use the
> already-pending nonce instead of making a new one and queuing a signal: many
> kernel stacks can end with the same user stack "tail".
>
> One nice property of this scheme is that the userspace unwinding isn't
> limited to native code. Libc could arbitrate unwinding across an arbitrary
> number of managed runtime environments in the context of a single process:
> the system could be smart enough to know that instead of unwinding through,
> e.g., Python interpreter frames, the unwinder (which is normal userspace code,
> pluggable via DSO!) could traverse and log *Python* stack frames instead,
> with meaningful function names. And if you happened to have, say, a
> JavaScript runtime in the same process, both JavaScript and Python could
> participate in the semantic unwinding process.
>
> A pluggable userspace unwind mechanism would have zero cost in the case that
> we're not recording stack frames. On top of that, a pluggable userspace
> unwinder *could* be written to traverse frame pointers just as the kernel
> unwinder does today, if userspace thinks that's the best option. Without
> breaking kernel ABI, that userspace unwinder could use DWARF, or ORC, or any
> other userspace unwinding approach. It's future-proof.
>
> In other words, the choice between frame pointers and no frame pointers is a
> false dichotomy. There's a better approach. The Linux ecosystem as a whole
> would be better off building something like the pluggable userspace
> asynchronous unwinding infrastructure described above.
>
> [1]
> https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/646XXHGEGOKO465LQKWCPPPAZBSW5NWO/
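
To make the quoted nonce scheme concrete, here is a rough Python toy model of
it. It is only a sketch under assumptions, not code from the proposal and not
any existing kernel or libc interface: the Thread and Tracer classes, their
methods, and the frame names are all hypothetical. It models three things from
the description: a kernel-side sample that ends in a nonce, reuse of an
already-pending nonce for the same thread, and the post-processing join of
kernel stacks with the user stack logged under the same nonce.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class Thread:
    """Hypothetical stand-in for a thread being profiled."""
    tid: int
    user_stack: list                      # what a userspace unwinder would report
    pending_nonce: Optional[int] = None   # set while an unwind request is queued


@dataclass
class Tracer:
    """Toy trace log; none of these names exist in Linux."""
    kernel_log: list = field(default_factory=list)   # (kernel_frames, nonce)
    user_log: dict = field(default_factory=dict)     # nonce -> user_frames
    _nonces: count = field(default_factory=lambda: count(1))

    def sample(self, thread, kernel_frames):
        # Steps 1-5: log the kernel stack with the nonce as its final frame.
        # If the thread already has an unwind pending, reuse that nonce
        # instead of creating a new one and queuing another signal.
        if thread.pending_nonce is None:
            thread.pending_nonce = next(self._nonces)
        self.kernel_log.append((list(kernel_frames), thread.pending_nonce))

    def return_to_user(self, thread):
        # Steps 6-8: the "signal handler" unwinds the thread however it
        # likes and logs the result under the pending nonce.
        if thread.pending_nonce is not None:
            self.user_log[thread.pending_nonce] = list(thread.user_stack)
            thread.pending_nonce = None

    def full_stacks(self):
        # Post-processing: join each kernel stack with the user stack
        # that was logged under the same nonce.
        return [k + self.user_log.get(n, [f"<unresolved nonce {n}>"])
                for k, n in self.kernel_log]


# Made-up frame names, innermost frame first.
tracer = Tracer()
t = Thread(tid=1, user_stack=["py:handle_request", "py:main"])
tracer.sample(t, ["k:counter_overflow", "k:syscall_entry"])
tracer.sample(t, ["k:timer_tick", "k:syscall_entry"])   # reuses the pending nonce
tracer.return_to_user(t)
stacks = tracer.full_stacks()
```

Both kernel samples end up sharing one user stack "tail", and the user frames
here are Python-level names rather than interpreter frames, which is the
property the proposal is after. How the real nonce would be delivered and
logged is left open in the quoted message, so the model leaves it out as well.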

This sounds great, but how is it going to get made? And is the kernel
amenable to this in the first place?

--
真実はいつも一つ!/ Always, there's only one truth!