On Tue, Jan 11, 2022 at 4:59 PM John Naylor <john.nay...@enterprisedb.com> wrote: > I've attached a PoC *untested* patch to show what it would look like > as a top-level statement. If the "shape" is uncontroversial, I'll put > work into testing it and fleshing it out.
Great! > For the PoC I wanted to try re-using existing keywords. I went with > "VACUUM LIMIT" since LIMIT is already a keyword that cannot be used as > a table name. It also brings "wraparound limit" to mind. We could add > a single-use unreserved keyword (such as VACUUM_MINIMAL or > VACUUM_FAST), but that doesn't seem great. This seems reasonable, but you could add a new option instead, without much downside. While INDEX_CLEANUP kind of looks like a keyword, it isn't really a keyword. (Perhaps you knew this already.) Making this a new option is a little awkward, admittedly. It's not clear what it means to "VACUUM (LIMIT) my_table" -- do you just throw an error for stuff like that? So perhaps your approach of adding VacuumMinimalStmt (a minimal variant of the VACUUM command) is better. > I'm not sure what the right trade-off is, but as written I used 95% of > max age. It might be undesirable to end up so close to kicking off > uninterruptible vacuums, but the point is to get out of single-user > mode and back to streaming WAL as quickly as possible. It might also > be worth overriding the min ages as well, but haven't done so here. I wonder if we should keep autovacuum_freeze_max_age out of it -- its default is too conservative in general. I'm concerned that applying this autovacuum_freeze_max_age test during VACUUM LIMIT doesn't go far enough -- it may require VACUUM LIMIT to do significantly more work than is needed to get the system back online (while leaving a sensible amount of headroom). Also seems like it might be a good idea to avoid relying on the user configuration, given that VACUUM LIMIT is only run when everything is already in disarray. (Besides, it's not clear that it's okay to use the autovacuum_freeze_max_age GUC without also using the reloption of the same name.) What do you think of applying a similar test using a generic 1 billion XID (and 1 billion MXID) age cutoff? When VACUUM LIMIT is run, we've already learned that the *entire* XID space wasn't sufficient for the user workload, so we're not really in a position to promise much. Often the real problem will be something like a leaked replication slot, or application code that's seriously misbehaving. It's really the DBA's job to *keep* the system up. VACUUM LIMIT is just a tool that allows the DBA to do this without excessive downtime. The GetNewTransactionId() WARNINGs ought to be changed to reference VACUUM LIMIT. (You probably just didn't get around to that in this POC, but couldn't hurt to remind you.) -- Peter Geoghegan