On Fri, May 03, 2024 at 12:57:18PM +0000, Imseih (AWS), Sami wrote: >> That's true, but using a hard-coded limit means we no longer need to add a >> new GUC. Always allocating, say, 256 slots might require a few additional >> kilobytes of shared memory, most of which will go unused, but that seems >> unlikely to be a problem for the systems that will run Postgres v18. > > I agree with this.
Here's what this might look like. I chose an upper limit of 1024, which seems like it "ought to be enough for anybody," at least for now. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
>From 72e0496294ef0390c77cef8031ae51c1a44ebde8 Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Tue, 7 May 2024 10:59:24 -0500 Subject: [PATCH v3 1/1] allow changing autovacuum_max_workers without restarting --- doc/src/sgml/config.sgml | 3 +- doc/src/sgml/runtime.sgml | 15 ++++--- src/backend/access/transam/xlog.c | 2 +- src/backend/postmaster/autovacuum.c | 44 ++++++++++++------- src/backend/postmaster/postmaster.c | 2 +- src/backend/storage/lmgr/proc.c | 9 ++-- src/backend/utils/init/postinit.c | 20 ++------- src/backend/utils/misc/guc_tables.c | 7 ++- src/backend/utils/misc/postgresql.conf.sample | 1 - src/include/postmaster/autovacuum.h | 8 ++++ src/include/utils/guc_hooks.h | 2 - 11 files changed, 58 insertions(+), 55 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index e93208b2e6..8e2a1d6902 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -8528,7 +8528,8 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; <para> Specifies the maximum number of autovacuum processes (other than the autovacuum launcher) that may be running at any one time. The default - is three. This parameter can only be set at server start. + is three. This parameter can only be set in the + <filename>postgresql.conf</filename> file or on the server command line. </para> </listitem> </varlistentry> diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml index 6047b8171d..8a672a8383 100644 --- a/doc/src/sgml/runtime.sgml +++ b/doc/src/sgml/runtime.sgml @@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such <row> <entry><varname>SEMMNI</varname></entry> <entry>Maximum number of semaphore identifiers (i.e., sets)</entry> - <entry>at least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal> plus room for other applications</entry> + <entry>at least <literal>ceil((max_connections + max_wal_senders + max_worker_processes + 1029) / 16)</literal> plus room for other applications</entry> </row> <row> <entry><varname>SEMMNS</varname></entry> <entry>Maximum number of semaphores system-wide</entry> - <entry><literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17</literal> plus room for other applications</entry> + <entry><literal>ceil((max_connections + max_wal_senders + max_worker_processes + 1029) / 16) * 17</literal> plus room for other applications</entry> </row> <row> @@ -838,7 +838,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such When using System V semaphores, <productname>PostgreSQL</productname> uses one semaphore per allowed connection (<xref linkend="guc-max-connections"/>), allowed autovacuum worker process - (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background + (1024) and allowed background process (<xref linkend="guc-max-worker-processes"/>), in sets of 16. Each such set will also contain a 17th semaphore which contains a <quote>magic @@ -846,13 +846,14 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such other applications. The maximum number of semaphores in the system is set by <varname>SEMMNS</varname>, which consequently must be at least as high as <varname>max_connections</varname> plus - <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>, - plus <varname>max_worker_processes</varname>, plus one extra for each 16 + <varname>max_wal_senders</varname>, + plus <varname>max_worker_processes</varname>, plus 1024 for autovacuum + worker processes, plus one extra for each 16 allowed connections plus workers (see the formula in <xref linkend="sysvipc-parameters"/>). The parameter <varname>SEMMNI</varname> determines the limit on the number of semaphore sets that can exist on the system at one time. Hence this parameter must be at - least <literal>ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</literal>. + least <literal>ceil((max_connections + max_wal_senders + max_worker_processes + 1029) / 16)</literal>. Lowering the number of allowed connections is a temporary workaround for failures, which are usually confusingly worded <quote>No space @@ -883,7 +884,7 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such When using POSIX semaphores, the number of semaphores needed is the same as for System V, that is one semaphore per allowed connection (<xref linkend="guc-max-connections"/>), allowed autovacuum worker process - (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background + (1024) and allowed background process (<xref linkend="guc-max-worker-processes"/>). On the platforms where this option is preferred, there is no specific kernel limit on the number of POSIX semaphores. diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index c3fd9c1eae..5b0312b2a6 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -5362,7 +5362,7 @@ CheckRequiredParameterValues(void) */ if (ArchiveRecoveryRequested && EnableHotStandby) { - /* We ignore autovacuum_max_workers when we make this test. */ + /* We ignore autovacuum workers when we make this test. */ RecoveryRequiresIntParameter("max_connections", MaxConnections, ControlFile->MaxConnections); diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 9a925a10cd..1ab97b9903 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -208,8 +208,7 @@ typedef struct autovac_table /*------------- * This struct holds information about a single worker's whereabouts. We keep - * an array of these in shared memory, sized according to - * autovacuum_max_workers. + * an array of these in shared memory. * * wi_links entry into free list or running list * wi_dboid OID of the database this worker is supposed to work on @@ -289,7 +288,7 @@ typedef struct { sig_atomic_t av_signal[AutoVacNumSignals]; pid_t av_launcherpid; - dlist_head av_freeWorkers; + dclist_head av_freeWorkers; dlist_head av_runningWorkers; WorkerInfo av_startingWorker; AutoVacuumWorkItem av_workItems[NUM_WORKITEMS]; @@ -347,6 +346,7 @@ static void autovac_report_activity(autovac_table *tab); static void autovac_report_workitem(AutoVacuumWorkItem *workitem, const char *nspname, const char *relname); static void avl_sigusr2_handler(SIGNAL_ARGS); +static bool av_worker_available(void); @@ -575,8 +575,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len) * wakening conditions. */ - launcher_determine_sleep(!dlist_is_empty(&AutoVacuumShmem->av_freeWorkers), - false, &nap); + launcher_determine_sleep(av_worker_available(), false, &nap); /* * Wait until naptime expires or we get some type of signal (all the @@ -636,7 +635,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len) current_time = GetCurrentTimestamp(); LWLockAcquire(AutovacuumLock, LW_SHARED); - can_launch = !dlist_is_empty(&AutoVacuumShmem->av_freeWorkers); + can_launch = av_worker_available(); if (AutoVacuumShmem->av_startingWorker != NULL) { @@ -679,8 +678,8 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len) worker->wi_sharedrel = false; worker->wi_proc = NULL; worker->wi_launchtime = 0; - dlist_push_head(&AutoVacuumShmem->av_freeWorkers, - &worker->wi_links); + dclist_push_head(&AutoVacuumShmem->av_freeWorkers, + &worker->wi_links); AutoVacuumShmem->av_startingWorker = NULL; ereport(WARNING, errmsg("autovacuum worker took too long to start; canceled")); @@ -1087,7 +1086,7 @@ do_start_worker(void) /* return quickly when there are no free workers */ LWLockAcquire(AutovacuumLock, LW_SHARED); - if (dlist_is_empty(&AutoVacuumShmem->av_freeWorkers)) + if (!av_worker_available()) { LWLockRelease(AutovacuumLock); return InvalidOid; @@ -1240,7 +1239,7 @@ do_start_worker(void) * Get a worker entry from the freelist. We checked above, so there * really should be a free slot. */ - wptr = dlist_pop_head_node(&AutoVacuumShmem->av_freeWorkers); + wptr = dclist_pop_head_node(&AutoVacuumShmem->av_freeWorkers); worker = dlist_container(WorkerInfoData, wi_links, wptr); worker->wi_dboid = avdb->adw_datid; @@ -1609,8 +1608,8 @@ FreeWorkerInfo(int code, Datum arg) MyWorkerInfo->wi_proc = NULL; MyWorkerInfo->wi_launchtime = 0; pg_atomic_clear_flag(&MyWorkerInfo->wi_dobalance); - dlist_push_head(&AutoVacuumShmem->av_freeWorkers, - &MyWorkerInfo->wi_links); + dclist_push_head(&AutoVacuumShmem->av_freeWorkers, + &MyWorkerInfo->wi_links); /* not mine anymore */ MyWorkerInfo = NULL; @@ -3265,7 +3264,7 @@ AutoVacuumShmemSize(void) */ size = sizeof(AutoVacuumShmemStruct); size = MAXALIGN(size); - size = add_size(size, mul_size(autovacuum_max_workers, + size = add_size(size, mul_size(AUTOVAC_MAX_WORKER_SLOTS, sizeof(WorkerInfoData))); return size; } @@ -3292,7 +3291,7 @@ AutoVacuumShmemInit(void) Assert(!found); AutoVacuumShmem->av_launcherpid = 0; - dlist_init(&AutoVacuumShmem->av_freeWorkers); + dclist_init(&AutoVacuumShmem->av_freeWorkers); dlist_init(&AutoVacuumShmem->av_runningWorkers); AutoVacuumShmem->av_startingWorker = NULL; memset(AutoVacuumShmem->av_workItems, 0, @@ -3302,10 +3301,10 @@ AutoVacuumShmemInit(void) MAXALIGN(sizeof(AutoVacuumShmemStruct))); /* initialize the WorkerInfo free list */ - for (i = 0; i < autovacuum_max_workers; i++) + for (i = 0; i < AUTOVAC_MAX_WORKER_SLOTS; i++) { - dlist_push_head(&AutoVacuumShmem->av_freeWorkers, - &worker[i].wi_links); + dclist_push_head(&AutoVacuumShmem->av_freeWorkers, + &worker[i].wi_links); pg_atomic_init_flag(&worker[i].wi_dobalance); } @@ -3341,3 +3340,14 @@ check_autovacuum_work_mem(int *newval, void **extra, GucSource source) return true; } + +/* + * Returns whether there is a free autovacuum worker slot available. + */ +static bool +av_worker_available(void) +{ + int reserved_slots = AUTOVAC_MAX_WORKER_SLOTS - autovacuum_max_workers; + + return dclist_count(&AutoVacuumShmem->av_freeWorkers) > reserved_slots; +} diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 7f3170a8f0..cab8a8a5f4 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -4144,7 +4144,7 @@ CreateOptsFile(int argc, char *argv[], char *fullprogname) int MaxLivePostmasterChildren(void) { - return 2 * (MaxConnections + autovacuum_max_workers + 1 + + return 2 * (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 + max_wal_senders + max_worker_processes); } diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c index e4f256c63c..fdf7373232 100644 --- a/src/backend/storage/lmgr/proc.c +++ b/src/backend/storage/lmgr/proc.c @@ -141,9 +141,8 @@ ProcGlobalSemas(void) * running out when trying to start another backend is a common failure. * So, now we grab enough semaphores to support the desired max number * of backends immediately at initialization --- if the sysadmin has set - * MaxConnections, max_worker_processes, max_wal_senders, or - * autovacuum_max_workers higher than his kernel will support, he'll - * find out sooner rather than later. + * MaxConnections, max_worker_processes, or max_wal_senders higher than + * his kernel will support, he'll find out sooner rather than later. * * Another reason for creating semaphores here is that the semaphore * implementation typically requires us to create semaphores in the @@ -242,13 +241,13 @@ InitProcGlobal(void) dlist_push_tail(&ProcGlobal->freeProcs, &proc->links); proc->procgloballist = &ProcGlobal->freeProcs; } - else if (i < MaxConnections + autovacuum_max_workers + 1) + else if (i < MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1) { /* PGPROC for AV launcher/worker, add to autovacFreeProcs list */ dlist_push_tail(&ProcGlobal->autovacFreeProcs, &proc->links); proc->procgloballist = &ProcGlobal->autovacFreeProcs; } - else if (i < MaxConnections + autovacuum_max_workers + 1 + max_worker_processes) + else if (i < MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 + max_worker_processes) { /* PGPROC for bgworker, add to bgworkerFreeProcs list */ dlist_push_tail(&ProcGlobal->bgworkerFreeProcs, &proc->links); diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c index 0805398e24..684b22e3a5 100644 --- a/src/backend/utils/init/postinit.c +++ b/src/backend/utils/init/postinit.c @@ -577,7 +577,7 @@ InitializeMaxBackends(void) Assert(MaxBackends == 0); /* the extra unit accounts for the autovacuum launcher */ - MaxBackends = MaxConnections + autovacuum_max_workers + 1 + + MaxBackends = MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 + max_worker_processes + max_wal_senders; /* internal error because the values were all checked previously */ @@ -591,19 +591,7 @@ InitializeMaxBackends(void) bool check_max_connections(int *newval, void **extra, GucSource source) { - if (*newval + autovacuum_max_workers + 1 + - max_worker_processes + max_wal_senders > MAX_BACKENDS) - return false; - return true; -} - -/* - * GUC check_hook for autovacuum_max_workers - */ -bool -check_autovacuum_max_workers(int *newval, void **extra, GucSource source) -{ - if (MaxConnections + *newval + 1 + + if (*newval + AUTOVAC_MAX_WORKER_SLOTS + 1 + max_worker_processes + max_wal_senders > MAX_BACKENDS) return false; return true; @@ -615,7 +603,7 @@ check_autovacuum_max_workers(int *newval, void **extra, GucSource source) bool check_max_worker_processes(int *newval, void **extra, GucSource source) { - if (MaxConnections + autovacuum_max_workers + 1 + + if (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 + *newval + max_wal_senders > MAX_BACKENDS) return false; return true; @@ -627,7 +615,7 @@ check_max_worker_processes(int *newval, void **extra, GucSource source) bool check_max_wal_senders(int *newval, void **extra, GucSource source) { - if (MaxConnections + autovacuum_max_workers + 1 + + if (MaxConnections + AUTOVAC_MAX_WORKER_SLOTS + 1 + max_worker_processes + *newval > MAX_BACKENDS) return false; return true; diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index ea2b0577bc..1e255fe263 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -3380,14 +3380,13 @@ struct config_int ConfigureNamesInt[] = NULL, NULL, NULL }, { - /* see max_connections */ - {"autovacuum_max_workers", PGC_POSTMASTER, AUTOVACUUM, + {"autovacuum_max_workers", PGC_SIGHUP, AUTOVACUUM, gettext_noop("Sets the maximum number of simultaneously running autovacuum worker processes."), NULL }, &autovacuum_max_workers, - 3, 1, MAX_BACKENDS, - check_autovacuum_max_workers, NULL, NULL + 3, 1, AUTOVAC_MAX_WORKER_SLOTS, + NULL, NULL, NULL }, { diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 83d5df8e46..439859e3ff 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -659,7 +659,6 @@ #autovacuum = on # Enable autovacuum subprocess? 'on' # requires track_counts to also be on. #autovacuum_max_workers = 3 # max number of autovacuum subprocesses - # (change requires restart) #autovacuum_naptime = 1min # time between autovacuum runs #autovacuum_vacuum_threshold = 50 # min number of row updates before # vacuum diff --git a/src/include/postmaster/autovacuum.h b/src/include/postmaster/autovacuum.h index cae1e8b329..23a7a729aa 100644 --- a/src/include/postmaster/autovacuum.h +++ b/src/include/postmaster/autovacuum.h @@ -66,4 +66,12 @@ extern bool AutoVacuumRequestWork(AutoVacuumWorkItemType type, extern Size AutoVacuumShmemSize(void); extern void AutoVacuumShmemInit(void); +/* + * Number of autovacuum worker slots to allocate. This is the upper limit of + * autovacuum_max_workers. + * + * NB: This must be less than MAX_BACKENDS. + */ +#define AUTOVAC_MAX_WORKER_SLOTS (1024) + #endif /* AUTOVACUUM_H */ diff --git a/src/include/utils/guc_hooks.h b/src/include/utils/guc_hooks.h index d64dc5fcdb..76346eb05e 100644 --- a/src/include/utils/guc_hooks.h +++ b/src/include/utils/guc_hooks.h @@ -29,8 +29,6 @@ extern bool check_application_name(char **newval, void **extra, GucSource source); extern void assign_application_name(const char *newval, void *extra); extern const char *show_archive_command(void); -extern bool check_autovacuum_max_workers(int *newval, void **extra, - GucSource source); extern bool check_autovacuum_work_mem(int *newval, void **extra, GucSource source); extern bool check_vacuum_buffer_usage_limit(int *newval, void **extra, -- 2.25.1