Greetings,

* Konstantin Knizhnik (k.knizh...@postgrespro.ru) wrote:
> It makes me think about two possible optimizations:
>
> 1. Provide separate invalidation messages for relation metadata and its
> statistics, so that an update of the statistics does not invalidate the
> relation cache.
> The main problem with this proposal is that pg_class contains the
> relpages and reltuples columns, which are conceptually part of the
> relation's statistics but are stored in the relation cache. If the
> relation's statistics are updated, then most likely these fields have
> changed as well, so we have to drop the relation from the relation cache
> in any case.
I realize this is likely to go over like a lead balloon, but the churn in
pg_class from updating reltuples/relpages has never seemed all that great
to me when just about everything else is so rarely changed, and only
through some user DDL action, and I agree that those particular columns
seem like more 'statistics'-type info and less info about the definition
of the relation. Other columns that do get changed regularly are
relfrozenxid and relminmxid. I wonder if it's possible to move all of
those elsewhere: perhaps some to the statistics tables, as you seem to be
alluding to, and the others to $somewhereelse that is dedicated to
tracking the information which VACUUM is primarily concerned with.

> 2. Remember in the relation info the XID of the oldest active
> transaction at the moment of the last autovacuum.
> At the next autovacuum iteration we first of all compare this stored XID
> with the XID of the current oldest active transaction and bypass
> vacuuming this relation if that XID has not changed.
>
> Thoughts?

That sounds like an interesting optimization and I agree it'd be nice to
avoid the re-run of autovacuum when we can tell that there's not going to
be anything more we can do. As noted above, for my part, I think it'd be
nice to move that kind of ongoing maintenance/update activity out of
pg_class, but in general I agree with the idea of storing that info
somewhere and waiting until there's actually been progress in the global
xmin before re-running a vacuum on a table. If we can do that somewhere
outside of pg_class, I think that'd be better, but if no one is up for
that kind of a shake-up, then maybe we just put it in pg_class and deal
with the churn there.
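Just to make sure we're all picturing the same thing, here's a rough,
standalone sketch of the check being proposed. To be clear, the names here
(RelVacuumInfo, last_autovacuum_oldest_xid, get_oldest_active_xid, and so
on) are made up purely for illustration and aren't from the tree; it only
shows the "compare the remembered oldest XID and skip if it hasn't moved"
idea:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

/*
 * Hypothetical per-relation bookkeeping: the oldest active XID observed
 * when this relation was last autovacuumed.  Where exactly this would
 * live (pg_class or some dedicated place) is the open question above.
 */
typedef struct RelVacuumInfo
{
    TransactionId last_autovacuum_oldest_xid;
} RelVacuumInfo;

/* Stand-in for however we'd look up the current oldest active XID. */
static TransactionId
get_oldest_active_xid(void)
{
    return 1000;                /* placeholder value for the sketch */
}

/*
 * Skip re-vacuuming the relation if the oldest active XID hasn't advanced
 * since the previous autovacuum: no additional tuples can have become
 * removable in the meantime, so the work would be wasted.
 */
static bool
autovacuum_would_be_useful(RelVacuumInfo *rel)
{
    TransactionId oldest = get_oldest_active_xid();

    if (oldest == rel->last_autovacuum_oldest_xid)
        return false;           /* nothing new can be cleaned up yet */

    rel->last_autovacuum_oldest_xid = oldest;
    return true;
}

int
main(void)
{
    RelVacuumInfo rel = {0};

    printf("first pass:  %s\n",
           autovacuum_would_be_useful(&rel) ? "vacuum" : "skip");
    printf("second pass: %s\n",
           autovacuum_would_be_useful(&rel) ? "vacuum" : "skip");
    return 0;
}

Obviously the real thing would still need to let anti-wraparound vacuums
through regardless, and, as above, the interesting part is where that
remembered XID would actually be stored.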
> >>So, that's really the core of your problem. We don't promise that
> >>you can run several thousand backends at once. Usually it's recommended
> >>that you stick a connection pooler in front of a server with (at most)
> >>a few hundred backends.
> >Sure, but that doesn't mean things should completely fall over when we
> >do get up to larger numbers of backends, which is definitely pretty
> >common in larger systems. I'm pretty sure we all agree that using a
> >connection pooler is recommended, but if there's things we can do to
> >make the system work at least a bit better when folks do use lots of
> >connections, provided we don't materially damage other cases, that's
> >probably worthwhile.
>
> I also think that Postgres performance should degrade gracefully with an
> increasing number of active backends. Actually, further investigation of
> this particular case shows that such a large number of database
> connections was caused by ... Postgres slowdown.
> During the normal workflow the number of active backends is a few
> hundred.
> But the "invalidation storm" caused queries to hang, so the user
> application had to initiate more and more new connections to perform the
> required actions.
> Yes, this may not be the best behavior of the application in this case.
> At least it should first terminate the current connection using
> pg_terminate_backend. I just want to note that the large number of
> backends was not the core of the problem.

Yeah, this is all getting back to the fact that we don't have any
acceptance criteria or anything like that, where we'd actually hold off on
new connections/queries being allowed in while other things are happening.
Of course, a connection pooler would address this (and you could use one
and have it still look exactly like PG, if you use, say, pgbouncer in
session-pooling mode, but then you need to have the application
drop/reconnect and not do its own connection pooling..), but it'd be nice
to have something in core for this.

> >Making them GUCs does seem like it's a few steps too far... but it'd be
> >nice if we could arrange to have values that don't result in the system
> >falling over with large numbers of backends and large numbers of tables.
> >To get a lot of backends, you'd have to set max_connections up pretty
> >high to begin with; perhaps we should contemplate allowing these values
> >to vary based on what max_connections is set to?
>
> I think that the optimal number of lock partitions should depend not on
> the number of connections but on the number of available CPU cores, and
> so on the expected level of concurrency.
> It is hard to propose a portable way to obtain this number.
> This is why I think that a GUC is the better solution.

A GUC for 'number of CPUs' doesn't seem like a bad option to have. How to
make that work well may be challenging though.

> Certainly I realize that it is a very dangerous parameter which should
> be changed with special care. Not only because of MAX_SIMUL_LWLOCKS.

Sure.

> There are a few places in Postgres where it tries to lock all partitions
> (the deadlock detector, logical replication, ...).
> If there were thousands of partitions, then such locking would be too
> expensive, and we would get yet another popular Postgres problem: a
> "deadlock detection storm", where due to high contention between
> backends a lock cannot be obtained within deadlock_timeout and so
> deadlock detection is initiated. Simultaneous deadlock detection
> performed by all backends (which each try to take ALL partition locks)
> paralyzes the system (TPS falls to 0).
> The proposed patch for this problem was also rejected (once again, the
> problem can be reproduced only on a powerful server with a large number
> of cores).

That does sound like something that would be good to improve on, though I
haven't looked at the proposed patch or read the associated thread, so I'm
not sure I can really comment on its rejection specifically.
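For anyone following along who hasn't poked at that area: the pattern being
described is that a backend whose deadlock_timeout expires has to take
every lock-manager partition lock exclusively before it can examine the
wait-for graph. Here's a simplified, standalone sketch of that pattern
(pthread mutexes standing in for LWLocks, deadlock_check() as a stand-in
for the real check, and the partition count is a made-up "what if this
were tunable and set very high" number), just to illustrate why the cost
of each check scales with the partition count and why many backends doing
it at once would pile up on the same locks:

#include <pthread.h>
#include <stdio.h>

#define NUM_LOCK_PARTITIONS 4096    /* imagine this were tunable and set high */

static pthread_mutex_t partition_lock[NUM_LOCK_PARTITIONS];

/*
 * Stand-in for a deadlock check: before the wait-for graph can be
 * examined, every partition of the (simulated) lock table has to be
 * locked exclusively, so each check costs O(partition count), and every
 * backend doing this at once serializes on the same set of locks.
 */
static void
deadlock_check(void)
{
    for (int i = 0; i < NUM_LOCK_PARTITIONS; i++)
        pthread_mutex_lock(&partition_lock[i]);

    /* ... walk the wait-for graph here ... */

    for (int i = NUM_LOCK_PARTITIONS - 1; i >= 0; i--)
        pthread_mutex_unlock(&partition_lock[i]);
}

int
main(void)
{
    for (int i = 0; i < NUM_LOCK_PARTITIONS; i++)
        pthread_mutex_init(&partition_lock[i], NULL);

    /* Pretend a few "backends" all hit deadlock_timeout at the same time. */
    for (int backend = 0; backend < 8; backend++)
        deadlock_check();

    printf("each check acquired %d partition locks\n", NUM_LOCK_PARTITIONS);
    return 0;
}

With the small, fixed partition count we have today that all-partitions
pass is cheap; the concern is that once the count becomes a tunable and
gets set into the thousands, every deadlock_timeout expiry turns into an
expensive and heavily contended operation.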
Thanks,

Stephen