On Wed, Feb 16, 2022 at 8:48 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> FYI, I've tested the situation that I assumed autovacuum can not
> correct the problem; when the system had already crossed xidStopLimit,
> it keeps failing to vacuum on tables that appear in the front of the
> list and have sufficient garbage to trigger the truncation but are not
> older than the failsafe limit. But contrary to my assumption, it did
> correct the problem since autovacuum continues to the next table in
> the list even after an error. This probably means that autovacuum
> eventually succeeds to process all tables that trigger the failsafe
> mode, ensuring advancing datfrozenxid, which is great.

Right; it seems as if the situation is much improved, even when the failsafe didn't prevent the system from going over xidStopLimit. If autovacuum alone can bring the system back to a normal state as soon as possible, without a human needing to do anything special, then clearly the general risk is much smaller. Even this worst case scenario where "the failsafe has failed" is not so bad anymore, in practice. I don't think that it really matters if some concurrent non-emergency VACUUMs fail when attempting to truncate the table (it's no worse than ANALYZE failing, for example).

Good news!

-- 
Peter Geoghegan
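
For anyone who wants to see the behavior described above in miniature, here is a toy model in Python. It is only a sketch and not PostgreSQL code: the `Table` class, `vacuum()`, `autovacuum_pass()` and `database_age()` are hypothetical names invented for illustration, the failure rule (a non-failsafe VACUUM that wants to truncate fails once the system is past xidStopLimit) is a simplification of the test result reported in the quoted text, and resetting a table's age to zero after a successful VACUUM is an assumption made to keep the model small. The only real value is 1.6 billion, the default of vacuum_failsafe_age.

```python
from dataclasses import dataclass

# Default value of vacuum_failsafe_age; everything else here is a toy model.
FAILSAFE_AGE = 1_600_000_000


@dataclass
class Table:
    name: str
    age: int                # stand-in for age(relfrozenxid), in XIDs
    wants_truncation: bool  # enough trailing garbage to attempt truncation


class VacuumError(Exception):
    """Raised when a modeled VACUUM cannot complete."""


def vacuum(table: Table, past_stop_limit: bool) -> None:
    # Assumption: a table at or past the failsafe age skips truncation.
    failsafe = table.age >= FAILSAFE_AGE
    if past_stop_limit and table.wants_truncation and not failsafe:
        raise VacuumError(f"truncation failed on {table.name}")
    table.age = 0  # simplification: a successful VACUUM freezes everything


def autovacuum_pass(tables: list[Table], past_stop_limit: bool = True) -> list[str]:
    """Process every table in order; an error does not abort the pass."""
    failed = []
    for table in tables:
        try:
            vacuum(table, past_stop_limit)
        except VacuumError:
            failed.append(table.name)  # note the failure, move to the next table
    return failed


def database_age(tables: list[Table]) -> int:
    # datfrozenxid is held back by the oldest table, so take the maximum age.
    return max(t.age for t in tables)


tables = [
    Table("small_bloated", 1_000_000_000, True),   # front of list, not failsafe
    Table("old_bloated", 2_100_000_000, True),     # failsafe triggers
    Table("old_plain", 2_100_000_000, False),      # failsafe triggers
]
failed = autovacuum_pass(tables)
print(failed)                # ['small_bloated']
print(database_age(tables))  # 1000000000
```

In this model the table at the front of the list keeps failing, but the pass still reaches the two tables that are old enough for the failsafe, so the modeled database age drops below the failsafe age even though one non-emergency VACUUM never succeeds.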