Hi, On 2025-03-07 09:03:18 -0800, Jacob Champion wrote: > On Fri, Mar 7, 2025 at 8:38 AM Andres Freund <and...@anarazel.de> wrote: > > FWIW, I continue to think that this is a misuse of wait events. We shouldn't > > use them as a poor man's general purpose tracing framework. > > Well, okay. That's frustrating.
I should have clarified - there are a few that I think are ok, basically the places where we wrap syscalls, e.g. around the sendto, select and recvfrom in PerformRadiusTransaction(). OTOH that code is effectively completely broken. Doing a blocking select() is just a no-go, the code isn't interruptible, breaking authentication timeout. And using select() means that we theoretically could crash due to an fd that's above FD_SETSIZE. Most of the other places I'm not on board with, that's wrapping large amounts of code in a wait event, which pretty much means we're not waiting. I think some of the wrapped calls into library code might actually call back into our code (to receive/send data), and our code then will use wait events around lower level operations done as part of that. Which pretty much explains my main issue with this - either the code can't wait in those function calls, in which case it's wrong to use wait events, or the code is flat out broken. It's also IMO quite wrong to do something that can throw an error inside a wait event, because that means that the wait event will still be reported during error recovery. Probably not the only place doing so, but it's still wrong. > If I return to the original design, but replace all of the high-level > wait events with calls to pgstat_report_activity(), does that work? It'd be less wrong. But I really doubt that it's a good idea to encode all kinds of function calls happening during authentication into something SQL visible. Why stop with these functions and not just do that for *all* functions in postgres? I mean it'd not work and slow everything down, but how do you define that line? Greetings, Andres Freund