Hi,

On 2021-12-09 15:28:18 -0400, John Naylor wrote:
> When a user must shut down and restart in single-user mode to run
> vacuum on an entire database, that does a lot of work that's
> unnecessary for getting the system online again, even without
> index_cleanup. We had a recent case where a single-user vacuum took
> around 3 days to complete.
>
> Now that we have a concept of a fail-safe vacuum, maybe it would be
> beneficial to skip a vacuum in single-user mode if the fail-safe
> criteria were not met at the beginning of vacuuming a relation. This
> is not without risk, of course, but it should be much faster than
> today and once up and running the admin would have a chance to get a
> handle on things. Thoughts?

What if the user is trying to reclaim space by vacuuming (via truncation)? Or
is working around some corruption or such? I think this is too much magic.

That said, having a VACUUM "selector" that selects the oldest tables could be
quite useful, and would address this use case for both single-user and normal
operation.

Another thing that might be worth doing is to update relfrozenxid earlier. We
definitely should update it before doing truncation (which can be quite
expensive). But we probably should do it even before the final
lazy_cleanup_all_indexes() pass - often that'll be the only pass, and there's
really no reason to delay relfrozenxid advancement till after that.

Greetings,

Andres Freund