Hello,

On 11/12/24 10:34, Andres Freund wrote:
I have working code - pretty ugly at this stage, but mostly needs a fair bit
of elbow grease, not divine inspiration...  It's not a trivial change, but
entirely doable.

The short summary of how it works is that it uses a single 64bit atomic that
is internally subdivided into a ringbuffer position in N high bits and an
offset from a base LSN in the remaining bits.  The insertion sequence is

...

This leaves you with a single xadd to a contended cacheline as the contention
point (scales far better than cmpxchg and far far better than
cmpxchg16b). There's a bit of contention on ringbuffer[].oldpos being set
and read, but that only involves two backends, not all of them.

That sounds rather promising.
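Just to make sure I understand the packing trick: the following is only my own
minimal sketch of the idea, not your patch. It assumes a 16/48 bit split of the
control word, and the names (wal_insert_ctl, ReserveWalInsert, etc.) are made
up; the real code obviously also has to deal with wraparound, advancing the
base LSN, and completion tracking.

/*
 * Illustrative sketch only.  A single 64-bit control word is laid out as
 *   [ ring position : 16 bits ][ offset from base LSN : 48 bits ]
 * so that one atomic fetch-add both claims a ring slot and advances the
 * insert offset.  All names here are hypothetical.
 */
#include <stdatomic.h>
#include <stdint.h>

#define OFFSET_BITS   48
#define OFFSET_MASK   ((UINT64_C(1) << OFFSET_BITS) - 1)

static _Atomic uint64_t wal_insert_ctl;   /* the single contended cacheline */
static uint64_t         base_lsn;         /* low bits are relative to this */

typedef struct
{
	uint32_t	ringpos;	/* slot in the insertion ringbuffer */
	uint64_t	start_lsn;	/* where this record's data begins */
} WalReservation;

static WalReservation
ReserveWalInsert(uint32_t size)
{
	/*
	 * One xadd advances the ring position (high bits) by one and the LSN
	 * offset (low bits) by the record size.  The returned old value tells
	 * us which slot and which start LSN we were handed.
	 */
	uint64_t	delta = (UINT64_C(1) << OFFSET_BITS) | size;
	uint64_t	old = atomic_fetch_add(&wal_insert_ctl, delta);
	WalReservation r;

	r.ringpos = (uint32_t) (old >> OFFSET_BITS);	/* caller wraps modulo ring size */
	r.start_lsn = base_lsn + (old & OFFSET_MASK);
	return r;
}

If that is roughly the shape of it, the single fetch-add as the only
cross-backend synchronization point explains the scalability numbers.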

Would it be reasonable to have both implementations available, at least at compile time if not at runtime? Is it possible that we need to do that anyway for some time, or are those atomic operations available on all supported CPU architectures?



The nice part is this scheme leaves you with a ringbuffer that's ordered by
the insertion LSN. That makes it possible to make WaitXLogInsertionsToFinish()
far more efficient and to get rid of NUM_XLOGINSERT_LOCKS (by removing WAL
insertion locks). Right now NUM_XLOGINSERT_LOCKS is a major scalability limit -
but at the same time, increasing it makes the contention on the spinlock *much*
worse, leading to slowdowns in other workloads.

Yeah, that is a complex wart. I believe it was the answer to the NUMA overload that Kevin Grittner and I discovered many years ago, where on a 4-socket machine the cacheline stealing would get so bad that whoever was holding the lock could not release it.
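And if I follow why the LSN ordering helps: WaitXLogInsertionsToFinish() could
then walk forward from the oldest outstanding slot and stop at the first
insertion that is still in progress, instead of checking every insertion lock.
A rough sketch of that idea below - the slot layout and the 'finished' flag are
my own assumptions for illustration, not what your patch actually does.

/*
 * Hypothetical sketch: because ring[] entries are in insertion-LSN order,
 * everything before the first unfinished slot is known complete.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 128

typedef struct
{
	_Atomic uint64_t start_lsn;	/* start LSN of this insertion */
	_Atomic bool	 finished;	/* set by the inserting backend */
} InsertSlot;

static InsertSlot ring[RING_SIZE];

/*
 * Return the LSN up to which all insertions are known to have finished,
 * scanning from 'oldest' (oldest slot not yet reclaimed) to 'newest'.
 */
static uint64_t
KnownCompleteUpto(uint32_t oldest, uint32_t newest, uint64_t reserved_upto)
{
	for (uint32_t pos = oldest; pos != newest; pos++)
	{
		InsertSlot *slot = &ring[pos % RING_SIZE];

		if (!atomic_load(&slot->finished))
			return atomic_load(&slot->start_lsn);	/* first gap: stop here */
	}
	return reserved_upto;		/* no gaps: everything reserved is complete */
}

That would indeed be a lot friendlier than the current dance over
NUM_XLOGINSERT_LOCKS insertion locks.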

In any case, thanks for the input. Looks like in the long run we need to come up with a different way to solve the inversion problem.


Best Regards, Jan


