[Added Andrey again in CC, because as I understand they are using this code or something like it in production. Please don't randomly remove people from CC lists.]
I've been looking at this some more, and I'm not confident that the group clog update stuff is correct. I think the injection points test case was good enough to discover a problem, but it's hard to get peace of mind that there aren't other, more subtle problems.

The problem I see is that the group update mechanism is designed around contention of the global xact-SLRU control lock; it uses atomics to coordinate a single queue when the lock is contended. So if we split up the global SLRU control lock using banks, then multiple processes using different bank locks might not contend. OK, this is fine, but what happens if two separate groups of processes encounter contention on two different bank locks? I think they will both try to update the same queue, and coordinate access to that *using different bank locks*. I don't see how this can work correctly.

I suspect the first part of that algorithm, where atomics are used to create the list without a lock, might work fine. But will each "leader" process, each of which is potentially using a different bank lock, coordinate correctly? Maybe this works correctly because only one process will find the queue head not empty? If this is what happens, then there need to be comments about it. Without any explanation, this seems broken and potentially dangerous, as some transaction commit bits might become lost given high enough concurrency and bad luck.

Maybe this can be made to work by having one more lwlock that we use solely to coordinate this task. Though we would have to demonstrate that coordinating this task with a different lock works correctly in conjunction with the per-bank lwlock usage in the regular slru.c paths.

Andrey, do you have any stress tests or anything else that you used to gain confidence in this code?

-- 
Álvaro Herrera PostgreSQL Developer
https://www.EnterpriseDB.com/
"El sabio habla porque tiene algo que decir; el tonto, porque tiene que decir algo" (Platon).
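
The scenario described above can be modelled with a minimal, single-threaded sketch. It is only an illustration under stated assumptions and does not reproduce the actual PostgreSQL code, which may include checks that this model omits. All names here (`Member`, `group_head`, `bank_of`, `join_group`, `leader_drain`, `demo_cross_bank`) are hypothetical, as is the page-to-bank mapping. The sketch shows one shared queue built with a compare-and-swap loop, a single leader emerging, and that leader walking entries whose pages belong to a bank other than the one whose lock it would hold.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NBANKS 2            /* hypothetical number of SLRU banks */
#define INVALID_PROC (-1)   /* marks an empty list / end of list */

/* Hypothetical stand-in for a backend waiting to set its commit status. */
typedef struct Member
{
    int page;   /* clog page this member wants to update */
    int next;   /* index of the next member in the group list */
} Member;

static Member members[8];
static atomic_int group_head = INVALID_PROC;    /* the single shared queue */

/* Assumed mapping from page number to bank (and hence to bank lock). */
static int
bank_of(int page)
{
    return page % NBANKS;
}

/*
 * Push member idx onto the shared list with a CAS loop, without any lock.
 * Returns true if the list was empty, i.e. the caller becomes the leader.
 */
static bool
join_group(int idx)
{
    int head = atomic_load(&group_head);

    do
    {
        members[idx].next = head;
    } while (!atomic_compare_exchange_weak(&group_head, &head, idx));

    return head == INVALID_PROC;
}

/*
 * The leader detaches the whole list and walks it, modelled here as if it
 * held only the lock for its own page's bank.  Returns how many members
 * have pages in a different bank, i.e. updates that lock would not cover.
 */
static int
leader_drain(int leader_idx)
{
    int held_bank = bank_of(members[leader_idx].page);
    int uncovered = 0;
    int idx = atomic_exchange(&group_head, INVALID_PROC);

    while (idx != INVALID_PROC)
    {
        if (bank_of(members[idx].page) != held_bank)
            uncovered++;
        idx = members[idx].next;
    }
    return uncovered;
}

/* Two members whose pages fall in different banks join the same queue. */
static int
demo_cross_bank(void)
{
    bool first_leads;
    bool second_leads;

    members[0].page = 10;   /* bank 0 under the assumed mapping */
    members[1].page = 11;   /* bank 1 under the assumed mapping */

    first_leads = join_group(0);
    second_leads = join_group(1);
    assert(first_leads && !second_leads);   /* only one leader emerges */

    return leader_drain(0);
}
```

In this model `demo_cross_bank()` reports one uncovered member: the list itself is built safely and exactly one process becomes leader, but the leader ends up responsible for a page outside the bank it locked. Whether the real code avoids this, and how, is the part that would need to be documented in comments.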