Hi,

On 2024-11-12 11:40:39 -0500, Jan Wieck wrote:
> On 11/12/24 10:34, Andres Freund wrote:
> > I have working code - pretty ugly at this stage, but it mostly needs a fair
> > bit of elbow grease, not divine inspiration... It's not a trivial change,
> > but entirely doable.
> >
> > The short summary of how it works is that it uses a single 64bit atomic that
> > is internally subdivided into a ringbuffer position in N high bits and an
> > offset from a base LSN in the remaining bits. The insertion sequence is
> >
> > ...
> >
> > This leaves you with a single xadd to a contended cacheline as the
> > contention point (scales far better than cmpxchg and far far better than
> > cmpxchg16b). There's a bit of contention for the ringbuffer[].oldpos being
> > set and read, but it's only by two backends, not all of them.
>
> That sounds rather promising.
>
> Would it be reasonable to have both implementations available at least at
> compile time, if not at runtime?
No, not reasonably.

> Is it possible that we need to do that anyway for some time or are those
> atomic operations available on all supported CPU architectures?

We have a fallback atomics implementation for the uncommon architectures
without 64bit atomics.

> In any case, thanks for the input. Looks like in the long run we need to
> come up with a different way to solve the inversion problem.

IMO there's absolutely no way the changes proposed in this thread so far
should get merged.

Greetings,

Andres Freund
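
As a rough illustration of the packed-counter scheme quoted above, a minimal C
sketch follows. It is not the code Andres describes (the actual insertion
sequence is elided in the mail); every name, bit width, and all of the
bookkeeping it omits (offset overflow into the ring bits, re-basing base_lsn,
waiting for slot reuse) are assumptions made purely for the example.

    /*
     * Illustrative sketch only.  A single 64-bit atomic encodes both a
     * ring-buffer position (high bits) and a byte offset from a base LSN
     * (low bits), so reserving insert space takes one atomic fetch-add
     * (a single xadd on x86).  One fetch-add advances the ring position by
     * one and the offset by the record size at the same time; real code
     * would have to keep the offset bits from carrying into the ring bits
     * between re-basings.
     */
    #include <stdatomic.h>
    #include <stdint.h>

    #define RING_BITS    12                          /* high bits: ring position */
    #define OFFSET_BITS  (64 - RING_BITS)            /* low bits: offset from base LSN */
    #define OFFSET_MASK  ((UINT64_C(1) << OFFSET_BITS) - 1)
    #define RING_SLOTS   (UINT64_C(1) << RING_BITS)

    typedef struct RingSlot
    {
        /*
         * Start position of this slot's insertion; written by the reserving
         * backend and read by at most one other backend, which is why the
         * contention here stays limited to two backends.
         */
        _Atomic uint64_t oldpos;
    } RingSlot;

    static _Atomic uint64_t insert_state;   /* packed {ring position, offset} */
    static uint64_t base_lsn;               /* offsets are relative to this LSN */
    static RingSlot ring[RING_SLOTS];

    /*
     * Reserve 'size' bytes of insert space.  Returns the start LSN of the
     * reservation and sets *slot to the ring slot assigned to it.
     */
    static uint64_t
    reserve_insert(uint32_t size, uint64_t *slot)
    {
        /* advance ring position by 1 and offset by 'size' in one xadd */
        uint64_t delta = (UINT64_C(1) << OFFSET_BITS) + size;
        uint64_t old = atomic_fetch_add(&insert_state, delta);
        uint64_t startlsn = base_lsn + (old & OFFSET_MASK);

        *slot = (old >> OFFSET_BITS) & (RING_SLOTS - 1);

        /* publish where this record starts, for the one backend that needs it */
        atomic_store(&ring[*slot].oldpos, startlsn);

        return startlsn;
    }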