On Tue, 1 Dec 2020 at 11:31, Andres Freund <and...@anarazel.de> wrote:
>
> Hi,
>
> On 2020-11-30 13:35:46 +0800, Craig Ringer wrote:
> > I find that when I most often want a backtrace of a running, live
> > backend, it's because the backend is doing something that isn't
> > passing a CHECK_FOR_INTERRUPTS() so it's not responding to signals. So
> > it wouldn't help if a backend is waiting on an LWLock, busy in a
> > blocking call to some loaded library, a blocking syscall, etc. But
> > there are enough other times I want live backtraces, and I'm not the
> > only one whose needs matter.
>
> Random thought: Wonder if it could be worth adding a conditionally
> compiled mode where we track what the longest time between two
> CHECK_FOR_INTERRUPTS() calls is (with some extra logic for client
> IO).
>
> Obviously the regression tests don't tend to hit the worst cases of
> CFR() less code, but even if they did, we currently wouldn't know from
> running the regression tests.

We can probably determine that just as well with a perf or systemtap
run on an --enable-dtrace build. Just tag CHECK_FOR_INTERRUPTS() with
an SDT marker, then record the timings. It might be convenient to have
it built in, I guess, but if we tag the site and do the timing/tracing
externally, we don't have to bother with conditional compilation and
special builds.
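
To make the idea concrete, here is a rough, standalone sketch of what tagging the site could look like. It is not a patch against the tree: the probe name check__for__interrupts is made up for illustration, InterruptPending and ProcessInterrupts() are simplified stand-ins for the real backend symbols, and run_loop() is a hypothetical driver. It uses DTRACE_PROBE() from <sys/sdt.h> when that header is present and compiles the probe away otherwise.

```c
#include <signal.h>
#include <stdbool.h>

/* Use the real SDT header when available; otherwise compile the probe away. */
#if defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#    define HAVE_SDT_PROBES 1
#  endif
#endif
#ifndef HAVE_SDT_PROBES
#  define DTRACE_PROBE(provider, name) ((void) 0)
#endif

/* Simplified stand-ins for the backend's interrupt state (assumption). */
static volatile sig_atomic_t InterruptPending = false;
static int	interrupts_processed = 0;

static void
ProcessInterrupts(void)
{
	InterruptPending = false;
	interrupts_processed++;
}

/*
 * The probe fires on every call, whether or not an interrupt is pending,
 * so an external tracer can measure the gap between successive hits.
 * The probe name is hypothetical.
 */
#define CHECK_FOR_INTERRUPTS() \
	do { \
		DTRACE_PROBE(postgresql, check__for__interrupts); \
		if (InterruptPending) \
			ProcessInterrupts(); \
	} while (0)

/* Hypothetical driver: a loop that polls for interrupts each iteration. */
static int
run_loop(int iterations)
{
	int			calls = 0;

	for (int i = 0; i < iterations; i++)
	{
		CHECK_FOR_INTERRUPTS();
		calls++;
	}
	return calls;
}
```

With something like this in place, the tracing side only has to timestamp each probe hit per process and report the largest delta, so none of the timing logic needs to live in the backend.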