Contention on LWLock buffer_content, due to SHARED lock(?)
Hi, today I observed (on a r5.24xlarge AWS RDS instance, i.e. 96 logical cores) lock contention on a buffer content lock due to taking of a SHARED lock (I think): Three tables were involved, simplified case: CREATE TABLE global_config (id BIGINT PRIMARY KEY); CREATE TABLE b ( id BIGINT PRIMARY KEY, config_id BIGINT REFERENCES global_config (id) ); CREATE TABLE c ( id BIGINT PRIMARY KEY, config_id BIGINT REFERENCES global_config (id) ); (I suppose having both b + c doesn't make a difference, but maybe it was relevant, so I'm including it.) Heavy INSERT + UPDATE traffic on b + c (all trivial via id), SELECTs on global_config (again by id). As the name suggests, there were only very few rows in global_config, specifically only one was referenced by all INSERT + UPDATEs on b + c. On lighter load, all three types of queries were taking <1ms (as expected), as load grew, all three went to ~50ms avg. execution time together. AWS RDS console showed wait on LWLock:buffer_content as the main contribution to that time. Checking the code, I concluded that I observed lock contention on the lock taken at the beginning of heap_lock_tuple, where an exclusive buffer content lock is held while recording the SHARE lock into the tuple and the WAL and the multiXact. I don't know the actual number, but potentially up to 7000 active transactions were holding a SHARE lock on that row, which could have performance implications while scanning for multiXact memberships. Semantically, all that lock traffic was superfluous, as the global_config row's key was in no danger of being changed. As this situation (some global, essentially static, entity is referenced by a much written table) seems not uncommon, I wonder: 1. Does the above analysis sound about right? 2. If so, would it be worthwhile to develop a solution? I was thinking along the lines of introducing an multiXact representation of "everyone": Instead of meticulously recording every locking + completing transaciton in a multiXact, after a certain number of transactions has accumulated in a single multiXact, it is approximated as "everyone". If later a transaction finds that a SHARE lock is held by "everyone", the tuple would need no further modification (not sure if this could even be checked without taking an exclusive buffer lock). The hard part would probably be to ensure that an attempt to obtain an EXCLUSIVE lock would finally succeed against a SHARE lock held by "everyone". Regards, Drahflow signature.asc Description: PGP signature
Re: Contention on LWLock buffer_content, due to SHARED lock(?)
Hi, On Tue, Dec 10, 2019 at 08:44:17AM -0800, Andres Freund wrote: > > today I observed (on a r5.24xlarge AWS RDS instance, i.e. 96 logical > > cores) lock contention on a buffer content lock due to taking of a > > SHARED lock (I think): > When you say "7000 active transactions" - do you mean to say that you > have set max_connections to something higher than that, and you actually > have that many concurrent transactions? Yes, max connections was 2, active connections around 7000 at that time. Unfortunately, I don't have actual numbers of connections in transactions for that point in time. (We were trying to establish maximum performance of a larger system.) > > Semantically, all that lock traffic was superfluous, as the > > global_config row's key was in no danger of being changed. > Well, postgres can't know that. I am aware; it's just an argument for why it might be possible to shove some optimization there. > > 1. Does the above analysis sound about right? > Hard to know without additional data. What data would be worth recording next time? (Except number of active transactions, obviously.) > > 2. If so, would it be worthwhile to develop a solution? > Possible, but I'm not sure it's worth the complexity. > > I'd definitely like to see a proper reproducer and profile for this, > before investigating further. I'll see if and when I can include this into my client's project schedule. Might be a while, but I'll get back to you when I have a reproducer + profile data (of an up-to-date vanilla Postgres, not 10.7+AWS aurora patches). > I think we'd need a very convincing use-case for improvements around the > problem > you outline. Understood. I'll try to get an iron-clad profile of the problematic case first. Regards, Drahflow signature.asc Description: PGP signature
Re: Contention on LWLock buffer_content, due to SHARED lock(?)
Hi, On Tue, Dec 10, 2019 at 03:07:05PM -0300, Alvaro Herrera wrote: > I'd rather have the ability to mark a table READ ONLY (or similar). > Then any FK references can skip the row locks altogether. For the rare > cases where you need to modify the referenced table, have it marked READ > WRITE, and any row locks are registered normally from that point on, > until you set it back to READ ONLY again. However, that would require changes to applications writing to the table and a good understanding of performance characteristics by everyone trying to get to that scale. (OTOH, there is certainly an argument to be made that whoever hits this kind of problem better also has an idea of postgres performance tuning anyway.) More troubling (to me) is that I already know of another table in the system which should be next-in-line for the same problem, but only on some rows: It represents accounting entities, of which a very (nearly static) few are payment processors and all others are customers. From the application's perspective there's not too much difference between those, but any customer row will typically only be share locked once, whereas share locks on payment processor rows will be held by most of the transactions currently active. That use-case is not very uncommon I think, so it migth be worthwhile to implement a solution which does not require all rows of a table to share similar lock contention characteristics, or writability. Regards, Drahflow signature.asc Description: PGP signature
Parameter binding for COPY TO queries
Hi, I'd like some input on the idea of adding parameter binding support to queries executed as part of a COPY TO command. Is there a technical or philosophical reason why these queries should not contain bindable parameters? As far as I could ascertain, the feature has been desired for a long time (but only occasionally), see e.g.: * https://www.postgresql.org/message-id/flat/12415.1550157763%40sss.pgh.pa.us#682d53c41bda2d21b7cd4fba5000793c * https://postgrespro.com/list/thread-id/1893680 * https://stackoverflow.com/questions/69233792/how-to-pass-a-param-for-a-binding-in-postgresql-copy-to-stdout-format * https://stackoverflow.com/questions/44190514/pdo-postgresql-copy/44190617#44190617 * https://github.com/npgsql/npgsql/issues/1677 * https://github.com/jackc/pgx/issues/2247#issuecomment-2645836118 Currently, the protocol forces users of COPY TO to use SQL quoting, resulting in multiple client libraries not supporting parameters in COPY TO, resulting in various users doing unsafe interpolations (I come here from one such occasion). I might have some time to attempt an implementation, but would first like to know if there are reservations against it. Thanks, Drahflow signature.asc Description: PGP signature