(moving discussion to -hackers, for patch-review) On Mon, Dec 16, 2024 at 11:30 PM Laurenz Albe <laurenz.a...@cybertec.at> wrote: > > On Mon, 2024-12-16 at 16:25 -0800, Will Storey wrote: > > I would like to disable vacuum's truncate behaviour for autovacuum. > > Previously I had an outage due to its access exclusive lock when it was > > replicated to a hot standby.
I can attest to one production outage caused by this behaviour of autovacuum. The truncate operation performed by autovacuum, when being replayed on the replica, was blocked by a query. Any new queries on that relation were blocked by replication. > > When that outage happened it was from a VACUUM call in a cronjob rather > > than autovacuum. I now run such VACUUMs with TRUNCATE false which avoids > > the issue for these. However I've realized that autovacuum could cause this > > as well. > > > > I believe the only way to disable this for autovacuum is by changing the > > vacuum_truncate storage parameters on tables. (Ignoring the now removed > > old_snapshot_threshold option). > > Yes, you can only do that table by table. That is unfortunate. Although doing so provides a granular control over which relations one would like to exclude from truncation, it may not always be desirable; a DBA/sysadmin may want to prevent this problem system-wide. Also, this not really scalable since it requires that a maintenance operation regularly connect to every database and apply this setting to all the relations, for the fear that there may be new objects somewhere in the cluster since last maintenance, which may cause this problem. It would be error prone, too, considering that the list of databases in a cluster may change over time. And then there's the added burden of monitoring the status of this maintenance operation to ensure it's running successfully every time. Turning on a system-wide setting that disables autovacuum truncation may look like a heavy hammer, but in certain situations this may be preferable to the risk of causing outage in production systems. It may be preferable to let the system consume disk space by not truncating the tables, as opposed to running the risk of blocked queries. Disk is cheap, and is possibly already being monitored in a production system. I understand Jeremy's contention upthread against adding such a feature at global level, but I'm in favor of adding this feature since it prevents a sudden and unpredictable impact on production systems, and instead leads to a gradual escalation of the problem that can be monitored and addressed by a sysadmin/DBA at a time that's convenient for them. > > I am also wondering if having an autovacuum setting to control it would be > > a good idea for a feature. > > I'm all for that. Please see attached an initial patch to disable truncation behaviour in autovacuum. This patch retains the default behavior of autovacuum truncating relations. The user is allowed to change the behaviour and disable relation truncations system-wide by setting autovacuum_disable_vacuum_truncate = true. Better parameter names welcome :-) One additional improvement I can think of is to emit a WARNING or NOTICE message that truncate operation is being skipped, perhaps only if the truncation would've freed up space over a certain threshold. Perhaps there's value in letting this parameter be specified at database level, but I'm not able to think of a reason why someone would want to disable this behaviour on just one database. So leaving the parameter context to be the same as most other autovacuum parameters: SIGHUP. Best regards, Gurjeet http://Gurje.et
commit 84e8eebb87bc2c58feae847efd995bc055701688 Author: Gurjeet Singh <gurjeet@singh.im> Date: Thu Jan 23 19:37:33 2025 -0800 Version 1 diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 0ab921a169..17e22c08e5 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -115,6 +115,7 @@ * GUC parameters */ bool autovacuum_start_daemon = false; +bool autovacuum_disable_vacuum_truncate = false; int autovacuum_worker_slots; int autovacuum_max_workers; int autovacuum_work_mem = -1; @@ -2811,12 +2812,16 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map, (!wraparound ? VACOPT_SKIP_LOCKED : 0); /* - * index_cleanup and truncate are unspecified at first in autovacuum. - * They will be filled in with usable values using their reloptions - * (or reloption defaults) later. + * index_cleanup is unspecified at first in autovacuum. truncate is + * unspecified, unless it is disabled via the GUC parameter. + * + * The unspecified options will be filled in with usable values using + * their reloptions (or reloption defaults) later. */ tab->at_params.index_cleanup = VACOPTVALUE_UNSPECIFIED; - tab->at_params.truncate = VACOPTVALUE_UNSPECIFIED; + tab->at_params.truncate = autovacuum_disable_vacuum_truncate + ? VACOPTVALUE_DISABLED + : VACOPTVALUE_UNSPECIFIED; /* As of now, we don't support parallel vacuum for autovacuum */ tab->at_params.nworkers = -1; tab->at_params.freeze_min_age = freeze_min_age; diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index 38cb9e970d..40368e339c 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -1509,6 +1509,16 @@ struct config_bool ConfigureNamesBool[] = NULL, NULL, NULL }, + { + {"autovacuum_disable_vacuum_truncate", PGC_SIGHUP, VACUUM_AUTOVACUUM, + gettext_noop("Disables autovacuum behavior of truncatiing relations."), + NULL + }, + &autovacuum_disable_vacuum_truncate, + false, + NULL, NULL, NULL + }, + { {"trace_notify", PGC_USERSET, DEVELOPER_OPTIONS, gettext_noop("Generates debugging output for LISTEN and NOTIFY."), diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h index 54e01c81d6..c7597407d2 100644 --- a/src/include/postmaster/autovacuum.h +++ b/src/include/postmaster/autovacuum.h @@ -28,6 +28,7 @@ typedef enum /* GUC variables */ extern PGDLLIMPORT bool autovacuum_start_daemon; +extern PGDLLIMPORT bool autovacuum_disable_vacuum_truncate; extern PGDLLIMPORT int autovacuum_worker_slots; extern PGDLLIMPORT int autovacuum_max_workers; extern PGDLLIMPORT int autovacuum_work_mem;