Hi,

On 2024-08-27 11:16:15 -0500, Nathan Bossart wrote:
> (creating new thread from [0])
>
> On Wed, Apr 10, 2024 at 09:52:59PM -0400, Tom Lane wrote:
> > On fourth thought ... the number of tries to acquire the lock, or
> > in this case number of tries to observe the lock free, is not
> > NUM_DELAYS but NUM_DELAYS * spins_per_delay.  Decreasing
> > spins_per_delay should therefore increase the risk of unexpected
> > "stuck spinlock" failures.  And finish_spin_delay will decrement
> > spins_per_delay in any cycle where we slept at least once.
> > It's plausible therefore that this coding with finish_spin_delay
> > inside the main wait loop puts more downward pressure on
> > spins_per_delay than the algorithm is intended to cause.
> >
> > I kind of wonder whether the premises finish_spin_delay is written
> > on even apply anymore, given that nobody except some buildfarm
> > dinosaurs runs Postgres on single-processor hardware anymore.
> > Maybe we should rip out the whole mechanism and hard-wire
> > spins_per_delay at 1000 or so.
>
> I've been looking at spinlock contention on newer hardware, and while I do
> not yet have any proposal to share for that, I saw this adaptive
> spins_per_delay code and wondered about this possibility of "downward
> pressure on spins_per_delay" for contended locks.  ISTM it could make
> matters worse in some cases.
>
> Anyway, I'm inclined to agree that the premise of the adaptive
> spins_per_delay code probably doesn't apply anymore, so here's a patch to
> remove it.
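
For anyone who doesn't have s_lock.c paged in: the adaptive logic under
discussion works roughly like the standalone sketch below. It's a
paraphrase, with approximate constants and structure rather than a verbatim
copy of the tree, but it shows where the downward pressure comes from once
finish_spin_delay() runs after a cycle that had to sleep.

/*
 * Simplified, standalone model of the adaptive spin/delay logic under
 * discussion.  Approximate paraphrase, not the actual s_lock.c code;
 * uses POSIX usleep() and C11 atomics.  Build with: cc -std=c11 model.c
 */
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MIN_SPINS_PER_DELAY 10
#define MAX_SPINS_PER_DELAY 1000
#define NUM_DELAYS          1000
#define MIN_DELAY_USEC      1000
#define MAX_DELAY_USEC      1000000

static int spins_per_delay = 100;	/* process-wide, adapted over time */

typedef struct
{
	int			spins;
	int			delays;
	int			cur_delay;		/* 0 means "never slept" */
} SpinDelayState;

static void
perform_spin_delay(SpinDelayState *s)
{
	/* (the real code issues a CPU-specific pause instruction here) */
	if (++s->spins < spins_per_delay)
		return;

	/* spun long enough, sleep instead; give up entirely after NUM_DELAYS */
	if (++s->delays > NUM_DELAYS)
	{
		fprintf(stderr, "stuck spinlock\n");
		abort();
	}
	if (s->cur_delay == 0)
		s->cur_delay = MIN_DELAY_USEC;
	usleep(s->cur_delay);

	/* back off: grow the sleep by a random fraction, wrap at the maximum */
	s->cur_delay += (int) (s->cur_delay * (rand() / (double) RAND_MAX));
	if (s->cur_delay > MAX_DELAY_USEC)
		s->cur_delay = MIN_DELAY_USEC;
	s->spins = 0;
}

static void
finish_spin_delay(SpinDelayState *s)
{
	/*
	 * The adaptation in question: acquiring without ever sleeping nudges
	 * spins_per_delay up, having slept at all nudges it down.  Run inside
	 * the wait loop of a contended lock, the "down" branch fires on every
	 * cycle that slept - that's the downward pressure.
	 */
	if (s->cur_delay == 0)
	{
		if (spins_per_delay < MAX_SPINS_PER_DELAY)
			spins_per_delay += 100;
	}
	else
	{
		if (spins_per_delay > MIN_SPINS_PER_DELAY)
			spins_per_delay -= 1;
	}
}

int
main(void)
{
	atomic_flag lock = ATOMIC_FLAG_INIT;
	SpinDelayState s = {0, 0, 0};

	/* spin until the (here: uncontended) lock is acquired, then adapt */
	while (atomic_flag_test_and_set(&lock))
		perform_spin_delay(&s);
	finish_spin_delay(&s);

	printf("spins_per_delay is now %d\n", spins_per_delay);
	return 0;
}
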
FWIW, I've seen cases on multi-socket machines where performance was vastly
worse under contention with some values of spins_per_delay, with good numbers
being quite different on smaller machines. Most new-ish server CPUs these days
basically behave like a multi-socket machine internally, due to being
partitioned into multiple chiplets. And it's pretty clear that that trend
isn't going to go away. So finding a good value probably isn't easy.

We don't have a whole lot of contended spin.h spinlocks left, except that we
have one very critical one, XLogCtl->Insert.insertpos_lck. And of course we
use the same spinning logic for buffer header locks - which can be heavily
contended.

I suspect that eventually we ought to replace all our userspace
spinlock-like-things with a framework for writing properly "waiting" locks
with some spinning. We can't just use lwlocks because support for
reader-writer locks makes them much more heavyweight (mainly because it
implies having to use an atomic operation for lock release, which shows up
substantially).

Greetings,

Andres Freund
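
PS: To make the "waiting lock with some spinning" idea a bit more concrete,
here is a rough standalone sketch of an exclusive-only variant, with
hypothetical names, C11 atomics, and a pthread condition variable standing in
for a futex-style wait. It's an illustration of the shape, not proposed code,
and it ignores fairness, wakeup batching and the like. Note that the
exclusive-only release stays a plain store rather than the read-modify-write
that reader/writer support would force.

/* Hypothetical sketch: bounded spinning, then a real wait (condvar standing
 * in for a futex).  Exclusive-only, no fairness.  cc -std=c11 -pthread */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define SPIN_TRIES 100			/* arbitrary; what matters is that it's bounded */

typedef struct
{
	atomic_int	locked;			/* 0 = free, 1 = held */
	atomic_int	nwaiters;		/* sleepers, to skip needless wakeups */
	pthread_mutex_t mutex;		/* protects only the sleep/wakeup handshake */
	pthread_cond_t cond;
} spin_wait_lock;

static bool
swl_try(spin_wait_lock *lk)
{
	int			expected = 0;

	/* seq_cst everywhere for simplicity; real code would tune the ordering */
	return atomic_compare_exchange_strong(&lk->locked, &expected, 1);
}

static void
swl_acquire(spin_wait_lock *lk)
{
	/* phase 1: spin a bounded number of times in userspace */
	for (int i = 0; i < SPIN_TRIES; i++)
		if (swl_try(lk))
			return;

	/* phase 2: sleep until a release wakes us, rather than spinning on */
	pthread_mutex_lock(&lk->mutex);
	atomic_fetch_add(&lk->nwaiters, 1);
	while (!swl_try(lk))
		pthread_cond_wait(&lk->cond, &lk->mutex);
	atomic_fetch_sub(&lk->nwaiters, 1);
	pthread_mutex_unlock(&lk->mutex);
}

static void
swl_release(spin_wait_lock *lk)
{
	/* exclusive-only release is a plain store, not a read-modify-write */
	atomic_store(&lk->locked, 0);

	if (atomic_load(&lk->nwaiters) > 0)
	{
		pthread_mutex_lock(&lk->mutex);
		pthread_cond_signal(&lk->cond);
		pthread_mutex_unlock(&lk->mutex);
	}
}

/* tiny demo driver: four threads bumping a counter under the lock */
static spin_wait_lock test_lock = {0, 0, PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER};
static long counter = 0;

static void *
worker(void *arg)
{
	for (int i = 0; i < 100000; i++)
	{
		swl_acquire(&test_lock);
		counter++;
		swl_release(&test_lock);
	}
	return NULL;
}

int
main(void)
{
	pthread_t	threads[4];

	for (int i = 0; i < 4; i++)
		pthread_create(&threads[i], NULL, worker, NULL);
	for (int i = 0; i < 4; i++)
		pthread_join(threads[i], NULL);
	printf("counter = %ld (expect 400000)\n", counter);
	return 0;
}

A real version would presumably wait on a futex or semaphore directly rather
than a mutex/condvar pair, but the two-phase shape is the same.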