Hi, FWIW, I don't see an advantage in 0003. If it allows us to make something else simpler / faster, cool, but on its own it doesn't seem worthwhile.
On 2022-12-02 16:31:58 -0800, Nathan Bossart wrote: > On Fri, Dec 02, 2022 at 04:32:38PM +0530, Bharath Rupireddy wrote: > > On Fri, Dec 2, 2022 at 6:10 AM Andres Freund <and...@anarazel.de> wrote: > >> I'm not sure this is quite right - don't we need a memory barrier. But I > >> don't > >> see a reason to not just leave this code as-is. I think this should be > >> optimized entirely in lwlock.c > > > > Actually, we don't need that at all as LWLockWaitForVar() will return > > immediately if the lock is free. So, I removed it. > > I briefly looked at the latest patch set, and I'm curious how this change > avoids introducing memory ordering bugs. Perhaps I am missing something > obvious. I'm a bit confused too - the comment above talks about LWLockWaitForVar(), but the patches seem to optimize LWLockUpdateVar(). I think it'd be safe to optimize LWLockConflictsWithVar(), due to some pre-existing, quite crufty, code. LWLockConflictsWithVar() says: * Test first to see if it the slot is free right now. * * XXX: the caller uses a spinlock before this, so we don't need a memory * barrier here as far as the current usage is concerned. But that might * not be safe in general. which happens to be true in the single, indirect, caller: /* Read the current insert position */ SpinLockAcquire(&Insert->insertpos_lck); bytepos = Insert->CurrBytePos; SpinLockRelease(&Insert->insertpos_lck); reservedUpto = XLogBytePosToEndRecPtr(bytepos); I think at the very least we ought to have a comment in WaitXLogInsertionsToFinish() highlighting this. It's not at all clear to me that the proposed fast-path for LWLockUpdateVar() is safe. I think at the very least we could end up missing waiters that we should have woken up. I think it ought to be safe to do something like pg_atomic_exchange_u64().. if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS)) return; because the pg_atomic_exchange_u64() will provide the necessary memory barrier. Greetings, Andres Freund