Hi,

On 2019-12-09 23:10:36 +0100, Jens-Wolfhard Schicke-Uffmann wrote:
> today I observed (on a r5.24xlarge AWS RDS instance, i.e. 96 logical
> cores) lock contention on a buffer content lock due to taking of a
> SHARED lock (I think):
> Three tables were involved, simplified case:
>
> CREATE TABLE global_config (id BIGINT PRIMARY KEY);
>
> CREATE TABLE b (
>   id BIGINT PRIMARY KEY,
>   config_id BIGINT REFERENCES global_config (id)
> );
>
> CREATE TABLE c (
>   id BIGINT PRIMARY KEY,
>   config_id BIGINT REFERENCES global_config (id)
> );
>
> (I suppose having both b + c doesn't make a difference, but
> maybe it was relevant, so I'm including it.)
>
> Heavy INSERT + UPDATE traffic on b + c (all trivial via id),
> SELECTs on global_config (again by id).
> As the name suggests, there were only very few rows in
> global_config; specifically, only one was referenced by all
> INSERTs + UPDATEs on b + c.
>
> On lighter load, all three types of queries were taking <1ms (as
> expected); as load grew, all three went to ~50ms avg. execution time
> together. The AWS RDS console showed wait on LWLock:buffer_content as
> the main contribution to that time.
>
> Checking the code, I concluded that I observed lock contention
> on the lock taken at the beginning of heap_lock_tuple, where
> an exclusive buffer content lock is held while recording the
> SHARE lock in the tuple, the WAL, and the multiXact. I don't know
> the actual number, but potentially up to 7000 active
> transactions were holding a SHARE lock on that row, which could have
> performance implications while scanning for multiXact memberships.

When you say "7000 active transactions" - do you mean that you have set
max_connections to something higher than that, and that you actually
have that many concurrent transactions?

> Semantically, all that lock traffic was superfluous, as the
> global_config row's key was in no danger of being changed.

Well, postgres can't know that.

> As this situation (some global, essentially static, entity is
> referenced by a much-written table) seems not uncommon, I wonder:
>
> 1. Does the above analysis sound about right?

Hard to know without additional data.

> 2. If so, would it be worthwhile to develop a solution?

Possible, but I'm not sure it's worth the complexity. I'd definitely
like to see a proper reproducer and profile for this before
investigating further.

> I was thinking along the lines of introducing a multiXact
> representation of "everyone": Instead of meticulously recording every
> locking + completing transaction in a multiXact, after a certain
> number of transactions has accumulated in a single multiXact, it is
> approximated as "everyone". If later a transaction finds that a SHARE
> lock is held by "everyone", the tuple would need no further
> modification

I think the big problem with a strategy like this is that it's prone to
generate deadlocks that aren't present in the "original" scheduling.

> (not sure if this could even be checked without taking an exclusive
> buffer lock).

It should only require a share lock.

> The hard part would probably be to ensure that an
> attempt to obtain an EXCLUSIVE lock would finally succeed against a
> SHARE lock held by "everyone".

Note that this is a seriously complicated area of the code. It's very
easy to create new bugs that aren't easily testable. I think we'd need
a very convincing use case, and a relatively simple solution, for
improvements around the problem you outline, to counter the stability
concerns.

Greetings,

Andres Freund
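[For reference, a minimal sketch of the kind of reproducer asked for
above, based on the simplified schema from the report. It is an
assumption-laden illustration, not the original workload: only one
referencing table is kept, b.id is made BIGSERIAL so inserts need no
explicit ids, and the pgbench script name and client counts are
arbitrary. Each INSERT's foreign-key check locks the single referenced
global_config row (SELECT ... FOR KEY SHARE), so all clients funnel
through heap_lock_tuple on the same buffer.]

    -- one-time setup
    CREATE TABLE global_config (id BIGINT PRIMARY KEY);
    CREATE TABLE b (
        id        BIGSERIAL PRIMARY KEY,
        config_id BIGINT REFERENCES global_config (id)
    );
    INSERT INTO global_config (id) VALUES (1);

    -- contention.sql: every insert triggers the FK check against the
    -- single global_config row, taking a row-level share lock on it
    INSERT INTO b (config_id) VALUES (1);

    -- drive with many concurrent clients, e.g.:
    --   pgbench -n -c 96 -j 96 -T 60 -f contention.sql

With enough clients, waits on LWLock:buffer_content for the
global_config page should become visible in pg_stat_activity, similar
to what the RDS console reported.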