On Wed, May 10, 2023 at 5:34 PM Michael Paquier <mich...@paquier.xyz> wrote:
>
> + * NB: LWLockConflictsWithVar (which is called from
> + * LWLockWaitForVar) relies on the spinlock used above in this
> + * function and doesn't use a memory barrier.
>
> This patch adds the following comment in WaitXLogInsertionsToFinish()
> because lwlock.c on HEAD mentions that:
> /*
>  * Test first to see if it the slot is free right now.
>  *
>  * XXX: the caller uses a spinlock before this, so we don't need a memory
>  * barrier here as far as the current usage is concerned. But that might
>  * not be safe in general.
>  */
>
> Should it be something where we'd better be noisy about at the top of
> LWLockWaitForVar()? We don't want to add a memory barrier at the
> beginning of LWLockConflictsWithVar(), still it strikes me that
> somebody that aims at using LWLockWaitForVar() may miss this point
> because LWLockWaitForVar() is the routine published in lwlock.h, not
> LWLockConflictsWithVar(). This does not need to be really
> complicated, say a note at the top of LWLockWaitForVar() among the
> lines of (?):
> "Be careful that LWLockConflictsWithVar() does not include a memory
> barrier, hence the caller of this function may want to rely on an
> explicit barrier or a spinlock to avoid memory ordering issues."
+1. We now have comments in three places warning that
LWLockConflictsWithVar does not use a memory barrier: one in
WaitXLogInsertionsToFinish, one in LWLockWaitForVar, and an existing one
in LWLockConflictsWithVar specifying where exactly a memory barrier is
needed if the caller doesn't use a spinlock. Looks fine to me.

> + * NB: pg_atomic_exchange_u64 is used here as opposed to just
> + * pg_atomic_write_u64 to update the variable. Since pg_atomic_exchange_u64
> + * is a full barrier, we're guaranteed that all loads and stores issued
> + * prior to setting the variable are completed before any loads or stores
> + * issued after setting the variable.
>
> This is the same explanation as LWLockUpdateVar(), except that we
> lose the details explaining why we are doing the update before
> releasing the lock.

What I had so far was rather verbose about what a barrier does and all
that. I don't think we need that level of detail, thanks to
README.barrier. I simplified the two comments to the following:

 * NB: pg_atomic_exchange_u64, having full barrier semantics, will ensure
 * the variable is updated before releasing the lock.

 * NB: pg_atomic_exchange_u64, having full barrier semantics, will ensure
 * the variable is updated before waking up waiters.

Please find the attached v7 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 8e4eeccadc106381bc2898c1887109f96c3db869 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 18 May 2023 05:17:05 +0000
Subject: [PATCH v7] Optimize WAL insertion lock acquisition and release

This commit optimizes WAL insertion lock acquisition and release in
the following way:

1. The WAL insertion lock's variable insertingAt is currently read and
written under the lwlock's wait list lock to avoid torn reads/writes.
This wait list lock can become a point of contention on highly
concurrent write workloads. Therefore, make insertingAt a 64-bit
atomic, which inherently provides tear-free reads/writes.

2. LWLockUpdateVar currently acquires the lwlock's wait list lock even
when there are no waiters at all. Add a fastpath exit to
LWLockUpdateVar when there are no waiters, to avoid unnecessary
locking.

Note that an atomic exchange operation (which is a full barrier) is
used where necessary, instead of an atomic write, to ensure the memory
ordering is preserved.

It also adds notes on why LWLockConflictsWithVar doesn't need a memory
barrier as far as its current usage is concerned.
Suggested-by: Andres Freund
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart
Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/20221124184619.xit4sfi52bcz2tva%40awork3.anarazel.de
---
 src/backend/access/transam/xlog.c |  8 +++-
 src/backend/storage/lmgr/lwlock.c | 64 +++++++++++++++++++------------
 src/include/storage/lwlock.h      |  6 +--
 3 files changed, 48 insertions(+), 30 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index bc5a8e0569..92b0b87d1e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -376,7 +376,7 @@ typedef struct XLogwrtResult
 typedef struct
 {
 	LWLock		lock;
-	XLogRecPtr	insertingAt;
+	pg_atomic_uint64 insertingAt;
 	XLogRecPtr	lastImportantAt;
 } WALInsertLock;
 
@@ -1495,6 +1495,10 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto)
 		 * calling LWLockUpdateVar. But if it has to sleep, it will
 		 * advertise the insertion point with LWLockUpdateVar before
 		 * sleeping.
+		 *
+		 * NB: LWLockConflictsWithVar (which is called from
+		 * LWLockWaitForVar) relies on the spinlock used above in this
+		 * function and doesn't use a memory barrier.
 		 */
 		if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
 							 &WALInsertLocks[i].l.insertingAt,
@@ -4611,7 +4615,7 @@ XLOGShmemInit(void)
 	for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
 	{
 		LWLockInitialize(&WALInsertLocks[i].l.lock, LWTRANCHE_WAL_INSERT);
-		WALInsertLocks[i].l.insertingAt = InvalidXLogRecPtr;
+		pg_atomic_init_u64(&WALInsertLocks[i].l.insertingAt, InvalidXLogRecPtr);
 		WALInsertLocks[i].l.lastImportantAt = InvalidXLogRecPtr;
 	}
 
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 59347ab951..65a7f64314 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -1547,9 +1547,8 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
  * *result is set to true if the lock was free, and false otherwise.
  */
 static bool
-LWLockConflictsWithVar(LWLock *lock,
-					   uint64 *valptr, uint64 oldval, uint64 *newval,
-					   bool *result)
+LWLockConflictsWithVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval,
+					   uint64 *newval, bool *result)
 {
 	bool		mustwait;
 	uint64		value;
@@ -1572,13 +1571,11 @@ LWLockConflictsWithVar(LWLock *lock,
 	*result = false;
 
 	/*
-	 * Read value using the lwlock's wait list lock, as we can't generally
-	 * rely on atomic 64 bit reads/stores. TODO: On platforms with a way to
-	 * do atomic 64 bit reads/writes the spinlock should be optimized away.
+	 * Reading the value atomically means we don't need any explicit locking
+	 * here. Note that the 64-bit atomic APIs in postgres themselves fall
+	 * back to explicit locking on platforms without atomics support.
 	 */
-	LWLockWaitListLock(lock);
-	value = *valptr;
-	LWLockWaitListUnlock(lock);
+	value = pg_atomic_read_u64(valptr);
 
 	if (value != oldval)
 	{
@@ -1605,9 +1602,14 @@
  *
  * Note: this function ignores shared lock holders; if the lock is held
  * in shared mode, returns 'true'.
+ *
+ * Be careful that LWLockConflictsWithVar() does not include a memory barrier;
+ * hence, the caller of this function may want to rely on an explicit barrier
+ * or a spinlock to avoid memory ordering issues.
  */
 bool
-LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
+LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval,
+				 uint64 *newval)
 {
 	PGPROC	   *proc = MyProc;
 	int			extraWaits = 0;
@@ -1735,29 +1737,43 @@ LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
  * LWLockUpdateVar - Update a variable and wake up waiters atomically
  *
  * Sets *valptr to 'val', and wakes up all processes waiting for us with
- * LWLockWaitForVar(). Setting the value and waking up the processes happen
- * atomically so that any process calling LWLockWaitForVar() on the same lock
- * is guaranteed to see the new value, and act accordingly.
+ * LWLockWaitForVar(). It first sets the value atomically and then wakes up
+ * the waiting processes, so that any process calling LWLockWaitForVar() on
+ * the same lock is guaranteed to see the new value, and act accordingly.
  *
  * The caller must be holding the lock in exclusive mode.
  */
 void
-LWLockUpdateVar(LWLock *lock, uint64 *valptr, uint64 val)
+LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
 {
 	proclist_head wakeup;
 	proclist_mutable_iter iter;
 
 	PRINT_LWDEBUG("LWLockUpdateVar", lock, LW_EXCLUSIVE);
 
+	/*
+	 * Update the variable atomically first, without acquiring the wait list
+	 * lock, so that anyone waiting on the lock has a chance to grab it a
+	 * bit more quickly.
+	 *
+	 * NB: pg_atomic_exchange_u64, having full barrier semantics, will ensure
+	 * the variable is updated before waking up waiters.
+	 */
+	pg_atomic_exchange_u64(valptr, val);
+
+	/*
+	 * Quick exit when there are no waiters. This avoids unnecessary
+	 * acquisition and release of the lwlock's wait list lock.
+	 */
+	if ((pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS) == 0)
+		return;
+
 	proclist_init(&wakeup);
 
 	LWLockWaitListLock(lock);
 
 	Assert(pg_atomic_read_u32(&lock->state) & LW_VAL_EXCLUSIVE);
 
-	/* Update the lock's value */
-	*valptr = val;
-
 	/*
 	 * See if there are any LW_WAIT_UNTIL_FREE waiters that need to be woken
 	 * up. They are always in the front of the queue.
@@ -1873,17 +1889,15 @@ LWLockRelease(LWLock *lock)
  * LWLockReleaseClearVar - release a previously acquired lock, reset variable
  */
 void
-LWLockReleaseClearVar(LWLock *lock, uint64 *valptr, uint64 val)
+LWLockReleaseClearVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
 {
-	LWLockWaitListLock(lock);
-
 	/*
-	 * Set the variable's value before releasing the lock, that prevents race
-	 * a race condition wherein a new locker acquires the lock, but hasn't yet
-	 * set the variables value.
+	 * Update the variable atomically first.
+	 *
+	 * NB: pg_atomic_exchange_u64, having full barrier semantics, will ensure
+	 * the variable is updated before releasing the lock.
 	 */
-	*valptr = val;
-	LWLockWaitListUnlock(lock);
+	pg_atomic_exchange_u64(valptr, val);
 
 	LWLockRelease(lock);
 }
 
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index d2c7afb8f4..f19bc49193 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -128,14 +128,14 @@ extern bool LWLockAcquire(LWLock *lock, LWLockMode mode);
 extern bool LWLockConditionalAcquire(LWLock *lock, LWLockMode mode);
 extern bool LWLockAcquireOrWait(LWLock *lock, LWLockMode mode);
 extern void LWLockRelease(LWLock *lock);
-extern void LWLockReleaseClearVar(LWLock *lock, uint64 *valptr, uint64 val);
+extern void LWLockReleaseClearVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
 extern void LWLockReleaseAll(void);
 extern bool LWLockHeldByMe(LWLock *lock);
 extern bool LWLockAnyHeldByMe(LWLock *lock, int nlocks, size_t stride);
 extern bool LWLockHeldByMeInMode(LWLock *lock, LWLockMode mode);
 
-extern bool LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval);
-extern void LWLockUpdateVar(LWLock *lock, uint64 *valptr, uint64 val);
+extern bool LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval, uint64 *newval);
+extern void LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
 
 extern Size LWLockShmemSize(void);
 extern void CreateLWLocks(void);
-- 
2.34.1