Hi,

On 2024-04-15 10:54:16 -0400, Robert Haas wrote:
> On Fri, Apr 12, 2024 at 3:33 PM Andres Freund <and...@anarazel.de> wrote:
> > Here's a patch implementing this approach. I confirmed that before we trigger
> > the stuck spinlock logic very quickly and after we don't. However, if most
> > sleeps are interrupted, it can delay the stuck spinlock detection a good
> > bit. But that seems much better than triggering it too quickly.
>
> +1 for doing something about this. I'm not sure if it goes far enough,
> but it definitely seems much better than doing nothing.

One thing I started to be worried about is whether a patch ought to prevent
the timeout used by perform_spin_delay() from increasing when interrupted.
Otherwise a few signals can trigger quite long waits.

But as I can't quite see a way to make this accurate in the backbranches, I
suspect something like what I posted is still a good first version.

> Given your findings, I'm honestly kind of surprised that I haven't seen
> problems of this type more frequently.

Same. I did a bunch of searches for the error, but found surprisingly
little. I think in practice most spinlocks just aren't contended enough to
reach perform_spin_delay(). And we have improved some on that over time.

Greetings,

Andres Freund
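
The concern above about the perform_spin_delay() timeout growing across interrupted sleeps can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not PostgreSQL's code and not the posted patch: every name in it (SketchDelayState, sketch_init, sketch_after_sleep, the SKETCH_* constants) is hypothetical, the limits are picked for illustration, and the delay is simply doubled so that the result is deterministic. It contrasts two policies: letting the requested delay grow after every sleep attempt, or only after sleeps that ran to completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative limits for this sketch only (assumed values). */
#define SKETCH_MIN_DELAY_USEC 1000L
#define SKETCH_MAX_DELAY_USEC 1000000L

/* Hypothetical per-wait state, loosely modelled on a backoff loop. */
typedef struct SketchDelayState
{
    long cur_delay_usec;   /* delay to request for the next sleep */
    int  completed_sleeps; /* sleeps that were not cut short by a signal */
} SketchDelayState;

static void
sketch_init(SketchDelayState *st)
{
    assert(st != NULL);
    st->cur_delay_usec = SKETCH_MIN_DELAY_USEC;
    st->completed_sleeps = 0;
}

/*
 * Account for one sleep attempt.
 *
 * 'interrupted' says whether a signal cut the sleep short. Only completed
 * sleeps are counted. If 'grow_when_interrupted' is false, an interrupted
 * sleep also leaves the requested delay unchanged.
 */
static void
sketch_after_sleep(SketchDelayState *st, bool interrupted,
                   bool grow_when_interrupted)
{
    assert(st != NULL);

    if (!interrupted)
        st->completed_sleeps++;

    if (interrupted && !grow_when_interrupted)
        return;

    /* Deterministic doubling; a simplification made for this sketch. */
    st->cur_delay_usec *= 2;

    /* Wrap back to the minimum once the maximum is exceeded. */
    if (st->cur_delay_usec > SKETCH_MAX_DELAY_USEC)
        st->cur_delay_usec = SKETCH_MIN_DELAY_USEC;
}
```

In this model, five interrupted sleeps in a row raise the next requested delay from 1000 to 32000 microseconds under the first policy, even though almost no time was actually spent sleeping, while the second policy leaves it at 1000. Deciding reliably whether a sleep was interrupted is the part this sketch takes as given.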