Andres Freund <and...@anarazel.de> writes:
> On 2024-04-11 15:24:28 -0400, Robert Haas wrote:
>> Or, rip out the whole, whole mechanism and just don't PANIC.

> I continue believe that that'd be a quite bad idea.

I'm warming to it myself.

> My suspicion is that most of the false positives are caused by lots of signals
> interrupting the pg_usleep()s. Because we measure the number of delays, not
> the actual time since we've been waiting for the spinlock, signals
> interrupting pg_usleep() trigger can very significantly shorten the amount of
> time until we consider a spinlock stuck. We should fix that.

We wouldn't need to fix it, if we simply removed the NUM_DELAYS limit.
Whatever kicked us off the sleep doesn't matter, we might as well
go check the spinlock.

Also, you propose in your other message replacing spinlocks with lwlocks.
Whatever the other merits of that, I notice that we have no timeout or
"stuck lwlock" detection. So that would basically remove the stuck-spinlock
behavior in an indirect way, without adding any safety measures that would
justify thinking that it's less likely we needed stuck-lock detection than
before.

			regards, tom lane
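
For illustration only, the effect discussed above (counting delays rather than elapsed time) can be sketched in C. This is a simplified model, not PostgreSQL's actual s_lock.c logic: the constants MAX_DELAYS, NOMINAL_SLEEP_US and TIMEOUT_US and both function names are hypothetical, and every sleep is assumed to last the same amount of time, whereas a real wait loop would back off.

```c
#include <assert.h>

/* Hypothetical limits chosen for illustration; not PostgreSQL's values. */
#define MAX_DELAYS        1000
#define NOMINAL_SLEEP_US  1000L
#define TIMEOUT_US        (MAX_DELAYS * NOMINAL_SLEEP_US)

/*
 * Count-based detection: declare the lock stuck after MAX_DELAYS sleeps.
 * Returns the time actually waited when each sleep really lasted
 * actual_sleep_us (shorter than nominal if a signal cut it short).
 */
static long
waited_until_stuck_by_count(long actual_sleep_us)
{
    long waited = 0;

    assert(actual_sleep_us > 0);
    for (int delays = 0; delays < MAX_DELAYS; delays++)
        waited += actual_sleep_us;
    return waited;
}

/*
 * Time-based detection: declare the lock stuck only once the accumulated
 * wait reaches TIMEOUT_US, however many sleeps that takes.
 */
static long
waited_until_stuck_by_time(long actual_sleep_us)
{
    long waited = 0;

    assert(actual_sleep_us > 0);
    while (waited < TIMEOUT_US)
        waited += actual_sleep_us;
    return waited;
}
```

In this model, if signals cut every sleep from 1000 microseconds to 10, the count-based check gives up after about 10 ms of real waiting instead of about 1 s, while the time-based check still waits the full timeout. Dropping the delay limit altogether, as suggested above, would make the distinction moot.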