On Mon, Apr 3, 2023 at 1:09 AM David Rowley <dgrowle...@gmail.com> wrote: > > On Sat, 1 Apr 2023 at 13:24, Melanie Plageman <melanieplage...@gmail.com> > wrote: > > Your diff LGTM. > > > > Earlier upthread in [1], Bharath had mentioned in a review comment about > > removing the global variables that he would have expected the analogous > > global in analyze.c to also be removed (vac_strategy [and analyze.c also > > has anl_context]). > > > > I looked into doing this, and this is what I found out (see full > > rationale in [2]): > > > > > it is a bit harder to remove it from analyze because acquire_func > > > doesn't take the buffer access strategy as a parameter and > > > acquire_sample_rows uses the vac_context global variable to pass to > > > table_scan_analyze_next_block(). > > > > I don't know if this is worth mentioning in the commit removing the > > other globals? Maybe it will just make it more confusing... > > I did look at that, but it seems a little tricky to make work unless > the AcquireSampleRowsFunc signature was changed. To me, it just does > not seem worth doing that to get rid of the two globals in analyze.c.
Yes, I came to basically the same conclusion. On Mon, Apr 3, 2023 at 7:57 AM David Rowley <dgrowle...@gmail.com> wrote: > > I've now pushed up v8-0004. Can rebase the remaining 2 patches on top > of master again and resend? v9 attached. > On Mon, 3 Apr 2023 at 08:11, Melanie Plageman <melanieplage...@gmail.com> > wrote: > > I still have a few open questions: > > - what the initial value of ring_size for autovacuum should be (see the > > one remaining TODO in the code) > > I assume you're talking about the 256KB BAS_VACUUM one set in > GetAccessStrategy()? I don't think this patch should be doing anything > to change those defaults. Anything that does that should likely have > a new thread and come with analysis or reasoning about why the newly > proposed defaults are better than the old ones. I actually was talking about something much more trivial but a little more confusing. In table_recheck_autovac(), I initialize the autovac_table->at_params.ring_size to the value of the vacuum_buffer_usage_limit guc. However, autovacuum makes its own BufferAccessStrategy object (instead of relying on vacuum() to do it) and passes that in to vacuum(). So, if we wanted autovacuum to disable use of a strategy (and use as many shared buffers as it likes), it would pass in NULL to vacuum(). If vauum_buffer_usage_limit is not 0, then we would end up making and using a BufferAccessStrategy in vacuum(). If we instead initialized autovac_table->at_params.ring_size to 0, even if the passed in BufferAccessStrategy is NULL, we wouldn't make a ring for autovacuum. Right now, we don't disable the strategy for autovacuum except in failsafe mode. And it is unclear when or why we would want to. I also thought it might be weird to have the value of the ring_size be initialized to something other than the value of vacuum_buffer_usage_limit for autovacuum, since it is supposed to use that guc value. In fact, right now, we don't use the autovac_table->at_params.ring_size set in table_recheck_autovac() when making the ring in do_autovacuum() but instead use the guc directly. I actually don't really like how vacuum() relies on the BufferAccessStrategy parameter being NULL for autovacuum and feel like there is a more intuitive way to handle all this. But, I didn't want to make major changes at this point. Anyway, the above is quite a bit more analysis than the issue is really worth. We should pick something and then document it in a comment. > > - should ANALYZE allow specifying BUFFER_USAGE_LIMIT since it uses the guc > > value when that is set? > > That's a good question... I kinda think we should just skip it. It adds to the surface area of the feature. > > - should INDEX_CLEANUP off cause VACUUM to use shared buffers and > > disable use of a strategy (like failsafe vacuum) > > I don't see why it should. It seems strange to have one option > magically make changes to some other option. Sure, sounds good. > > - should we add anything to VACUUM VERBOSE output about the number of > > reuses of strategy buffers? > > Sounds like this would require an extra array of counter variables in > BufferAccessStrategyData? I think it might be a bit late to start > experimenting with this. Makes sense. I hadn't thought through the implementation. We count reuses in pg_stat_io data structures but that is global and not per BufferAccessStrategyData instance, so I agree to scrapping this idea. > > - Should we make BufferAccessStrategyData non-opaque so that we don't > > have to add a getter for nbuffers. I could have implemented this in > > another way, but I don't really see why BufferAccessStrategyData > > should be opaque > > If nothing outside of the .c file requires access then there's little > need to make the members known outside of the file. Same as you'd want > to make classes private rather than public when possible in OOP. > > If you do come up with a reason to be able to determine the size of > the BufferAccessStrategy from outside freelist.c, I'd say an accessor > method is the best way. In the main patch, I wanted access to the number of buffers so that parallel vacuum workers could make their own rings the same size. I added an accessor, but it looked a bit silly so I thought I would ask if we needed to keep the data structure opaque. It isn't called frequently enough to worry about the function call overhead. Though the accessor could use a better name than the one I chose. - Melanie
From 2339872e09b60ec599f171065b4e4597fe8f0f61 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Sun, 19 Mar 2023 18:00:08 -0400 Subject: [PATCH v9 1/2] Add VACUUM BUFFER_USAGE_LIMIT option and GUC Add GUC, vacuum_buffer_usage_limit, and VACUUM option, BUFFER_USAGE_LIMIT, through which the user can specify the maximum size to use for buffers for VACUUM, ANALYZE, and autovacuum. The size is converted into a number of shared buffers which are tracked in a BufferAccessStrategyData object. The explicit VACUUM option, when specified, overrides the GUC value, unless it is specified as -1. Discussion: https://www.postgresql.org/message-id/flat/20230111182720.ejifsclfwymw2reb%40awork3.anarazel.de --- doc/src/sgml/config.sgml | 30 +++++++ doc/src/sgml/ref/analyze.sgml | 10 ++- doc/src/sgml/ref/vacuum.sgml | 26 ++++++ src/backend/commands/vacuum.c | 62 ++++++++++++- src/backend/commands/vacuumparallel.c | 13 ++- src/backend/postmaster/autovacuum.c | 19 +++- src/backend/storage/buffer/README | 21 +++-- src/backend/storage/buffer/freelist.c | 90 ++++++++++++++++++- src/backend/utils/init/globals.c | 2 + src/backend/utils/misc/guc_tables.c | 11 +++ src/backend/utils/misc/postgresql.conf.sample | 4 + src/bin/psql/tab-complete.c | 2 +- src/include/commands/vacuum.h | 1 + src/include/miscadmin.h | 1 + src/include/storage/bufmgr.h | 8 ++ src/test/regress/expected/vacuum.out | 17 ++++ src/test/regress/sql/vacuum.sql | 13 +++ 17 files changed, 314 insertions(+), 16 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 5bad13d24a..bae2486bcc 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2001,6 +2001,36 @@ include_dir 'conf.d' </listitem> </varlistentry> + <varlistentry id="guc-vacuum-buffer-usage-limit" xreflabel="vacuum_buffer_usage_limit"> + <term> + <varname>vacuum_buffer_usage_limit</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_buffer_usage_limit</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the size of <varname>shared_buffers</varname> to be reused + for each backend participating in a given invocation of + <command>VACUUM</command> or <command>ANALYZE</command> or in + autovacuum. This size is converted to the number of shared buffers + which will be reused as part of a <glossterm + linkend="glossary-buffer-access-strategy">Buffer Access + Strategy</glossterm>. <literal>0</literal> will disable use of a + <literal>Buffer Access Strategy</literal>. <literal>-1</literal> will + set the size to a default of <literal>256 kB</literal>. The maximum + ring buffer size is <literal>16 GB</literal>. Though you may set + <varname>vacuum_buffer_usage_limit</varname> below <literal>128 + kB</literal>, it will be clamped to <literal>128 kB</literal> at + runtime. The default value is <literal>-1</literal>. If this value is + specified without units, it is taken as kilobytes. This parameter can + be set at any time. It can be overridden for <link + linkend="sql-vacuum"><command>VACUUM</command></link> when passing the + <command>BUFFER_USAGE_LIMIT</command> parameter. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-logical-decoding-work-mem" xreflabel="logical_decoding_work_mem"> <term><varname>logical_decoding_work_mem</varname> (<type>integer</type>) <indexterm> diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml index 2f94e89cb0..019d34423a 100644 --- a/doc/src/sgml/ref/analyze.sgml +++ b/doc/src/sgml/ref/analyze.sgml @@ -51,9 +51,13 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea Without a <replaceable class="parameter">table_and_columns</replaceable> list, <command>ANALYZE</command> processes every table and materialized view in the current database that the current user has permission to analyze. - With a list, <command>ANALYZE</command> processes only those table(s). - It is further possible to give a list of column names for a table, - in which case only the statistics for those columns are collected. + With a list, <command>ANALYZE</command> processes only those table(s). It is + further possible to give a list of column names for a table, in which case + only the statistics for those columns are collected. + <command>ANALYZE</command> uses a <glossterm + linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm> + when reading in the sample data. The number of buffers consumed for this can + be controlled by <xref linkend="guc-vacuum-buffer-usage-limit"/>. </para> <para> diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml index b6d30b5764..df21a35ef5 100644 --- a/doc/src/sgml/ref/vacuum.sgml +++ b/doc/src/sgml/ref/vacuum.sgml @@ -39,6 +39,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet PARALLEL <replaceable class="parameter">integer</replaceable> SKIP_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ] ONLY_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ] + BUFFER_USAGE_LIMIT [ <replaceable class="parameter">string</replaceable> ] <phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase> @@ -345,6 +346,31 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet </listitem> </varlistentry> + <varlistentry> + <term><literal>BUFFER_USAGE_LIMIT</literal></term> + <listitem> + <para> + Specifies the ring buffer size for <command>VACUUM</command>. This size + is used to calculate the number of shared buffers which will be reused as + part of a <glossterm linkend="glossary-buffer-access-strategy">Buffer + Access Strategy</glossterm>. <literal>0</literal> disables use of a + <literal>Buffer Access Strategy</literal>. <literal>-1</literal> + indicates that <command>VACUUM</command> should fall back to the value + specified by <xref linkend="guc-vacuum-buffer-usage-limit"/>. The maximum + value is <literal>16 GB</literal>. Though you may specify a size smaller + than <literal>128</literal>, the value will be clamped to <literal>128 + kB</literal> at runtime. If this value is specified without units, it is + taken as kilobytes. This size applies to a backend participating in a + single invocation of <command>VACUUM</command>. This option can't be used + with the <literal>FULL</literal> option or + <literal>ONLY_DATABASE_STATS</literal> option. If + <literal>ANALYZE</literal> is also specified, the + <literal>BUFFER_USAGE_LIMIT</literal> value is used for both the vacuum + and analyze stages. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><replaceable class="parameter">boolean</replaceable></term> <listitem> diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index da85330ef4..d060c6e288 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -124,6 +124,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) /* By default parallel vacuum is enabled */ params.nworkers = 0; + /* by default use buffer access strategy with default size */ + params.ring_size = -1; + /* Parse options list */ foreach(lc, vacstmt->options) { @@ -207,6 +210,42 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) skip_database_stats = defGetBoolean(opt); else if (strcmp(opt->defname, "only_database_stats") == 0) only_database_stats = defGetBoolean(opt); + else if (strcmp(opt->defname, "buffer_usage_limit") == 0) + { + char *vac_buffer_size; + int result; + const char *hintmsg; + + if (opt->arg == NULL) + { + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("buffer_usage_limit option requires a valid value"), + parser_errposition(pstate, opt->location))); + } + + vac_buffer_size = defGetString(opt); + + if (!parse_int(vac_buffer_size, &result, GUC_UNIT_KB, &hintmsg)) + { + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("value: \"%s\": is invalid for buffer_usage_limit", + vac_buffer_size), + hintmsg ? errhint("%s", _(hintmsg)) : 0)); + } + + /* check for out-of-bounds */ + if (result < -1 || result > MAX_BAS_RING_SIZE_KB) + { + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("buffer_usage_limit for a vacuum must be between -1 and %d", + MAX_BAS_RING_SIZE_KB))); + } + + params.ring_size = result; + } else ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), @@ -238,6 +277,16 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("VACUUM FULL cannot be performed in parallel"))); + if ((params.options & VACOPT_FULL) && params.ring_size > -1) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM FULL"))); + + if ((params.options & VACOPT_ONLY_DATABASE_STATS) && params.ring_size > -1) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM ONLY_DATABASE_STATS"))); + /* * Make sure VACOPT_ANALYZE is specified if any column lists are present. */ @@ -401,7 +450,18 @@ vacuum(List *relations, VacuumParams *params, { MemoryContext old_context = MemoryContextSwitchTo(vac_context); - bstrategy = GetAccessStrategy(BAS_VACUUM); + Assert(params->ring_size >= -1); + + if (params->ring_size == -1) + { + if (vacuum_buffer_usage_limit == -1) + bstrategy = GetAccessStrategy(BAS_VACUUM); + else + bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, vacuum_buffer_usage_limit); + } + else + bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size); + MemoryContextSwitchTo(old_context); } diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c index 2cdbd182b6..10f54377d4 100644 --- a/src/backend/commands/vacuumparallel.c +++ b/src/backend/commands/vacuumparallel.c @@ -87,6 +87,12 @@ typedef struct PVShared */ int maintenance_work_mem_worker; + /* + * The number of buffers each worker's Buffer Access Strategy ring should + * contain. + */ + int ring_nbuffers; + /* * Shared vacuum cost balance. During parallel vacuum, * VacuumSharedCostBalance points to this value and it accumulates the @@ -365,6 +371,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes, maintenance_work_mem / Min(parallel_workers, nindexes_mwm) : maintenance_work_mem; + /* Use the same buffer size for all workers */ + shared->ring_nbuffers = bas_nbuffers(bstrategy); + pg_atomic_init_u32(&(shared->cost_balance), 0); pg_atomic_init_u32(&(shared->active_nworkers), 0); pg_atomic_init_u32(&(shared->idx), 0); @@ -1018,8 +1027,8 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc) pvs.indname = NULL; pvs.status = PARALLEL_INDVAC_STATUS_INITIAL; - /* Each parallel VACUUM worker gets its own access strategy */ - pvs.bstrategy = GetAccessStrategy(BAS_VACUUM); + /* Each parallel VACUUM worker gets its own access strategy. */ + pvs.bstrategy = GetAccessStrategyWithNBuffers(BAS_VACUUM, shared->ring_nbuffers); /* Setup error traceback support for ereport() */ errcallback.callback = parallel_vacuum_error_callback; diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 585d28148c..ce54535692 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2290,9 +2290,15 @@ do_autovacuum(void) /* * Create a buffer access strategy object for VACUUM to use. We want to * use the same one across all the vacuum operations we perform, since the - * point is for VACUUM not to blow out the shared cache. + * point is for VACUUM not to blow out the shared cache. If we later enter + * failsafe mode, we will cease use of the BufferAccessStrategy. Either + * way, we clean up the BufferAccessStrategy object at the end of this + * function. */ - bstrategy = GetAccessStrategy(BAS_VACUUM); + if (vacuum_buffer_usage_limit == -1) + bstrategy = GetAccessStrategy(BAS_VACUUM); + else + bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, vacuum_buffer_usage_limit); /* * create a memory context to act as fake PortalContext, so that the @@ -2884,6 +2890,15 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map, tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age; tab->at_params.is_wraparound = wraparound; tab->at_params.log_min_duration = log_min_duration; + + /* + * TODO: should this be 0 so that we are sure that vacuum() never + * allocates a new bstrategy for us, even if we pass in NULL for that + * parameter? maybe could change how failsafe NULLs out bstrategy if + * so? + */ + tab->at_params.ring_size = vacuum_buffer_usage_limit; + tab->at_vacuum_cost_limit = vac_cost_limit; tab->at_vacuum_cost_delay = vac_cost_delay; tab->at_relname = NULL; diff --git a/src/backend/storage/buffer/README b/src/backend/storage/buffer/README index a775276ff2..d1be1ca5b7 100644 --- a/src/backend/storage/buffer/README +++ b/src/backend/storage/buffer/README @@ -229,12 +229,21 @@ update hint bits). In a scan that modifies every page in the scan, like a bulk UPDATE or DELETE, the buffers in the ring will always be dirtied and the ring strategy effectively degrades to the normal strategy. -VACUUM uses a 256KB ring like sequential scans, but dirty pages are not -removed from the ring. Instead, WAL is flushed if needed to allow reuse of -the buffers. Before introducing the buffer ring strategy in 8.3, VACUUM's -buffers were sent to the freelist, which was effectively a buffer ring of 1 -buffer, resulting in excessive WAL flushing. Allowing VACUUM to update -256KB between WAL flushes should be more efficient. +VACUUM's default Buffer Access Strategy uses a 256KB ring like sequential +scans, but dirty pages are not removed from the ring. Instead, WAL is flushed +if needed to allow reuse of the buffers. Before introducing the buffer ring +strategy in 8.3, VACUUM's buffers were sent to the freelist, which was +effectively a buffer ring of 1 buffer, resulting in excessive WAL flushing. +Allowing VACUUM to update 256KB between WAL flushes should be more efficient. + +As an alternative, VACUUM can use a user-specified ring size. The VACUUM +parameter "BUFFER_USAGE_LIMIT" and GUC vacuum_buffer_usage_limit can be used to +specify the amount of shared memory to be used during vacuuming. This size is +used to calculate the number of buffers in the ring when it is created. A value +of 0 for vacuum_buffer_usage_limit will disable use of the Buffer Access +Strategy and allow vacuuming to use shared buffers as normal. +In failsafe mode, autovacuum will always abandon use of a Buffer Access +Strategy. Bulk writes work similarly to VACUUM. Currently this applies only to COPY IN and CREATE TABLE AS SELECT. (Might it be interesting to make diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c index f122709fbe..7a0637c4e2 100644 --- a/src/backend/storage/buffer/freelist.c +++ b/src/backend/storage/buffer/freelist.c @@ -163,6 +163,15 @@ ClockSweepTick(void) return victim; } +/* + * bas_nbuffers -- an accessory for the number of buffers in the ring + */ +int +bas_nbuffers(BufferAccessStrategy strategy) +{ + return strategy->nbuffers; +} + /* * have_free_buffer -- a lockless check to see if there is a free buffer in * buffer pool. @@ -531,6 +540,19 @@ StrategyInitialize(bool init) * ---------------------------------------------------------------- */ +/* + * Helper to convert a size to a number of buffers. + */ +static inline int +bufsize_limit_to_nbuffers(int bufsize_limit_kb) +{ + int blcksz_kb = BLCKSZ / 1024; + + Assert(blcksz_kb > 0); + + return bufsize_limit_kb / blcksz_kb; +} + /* * GetAccessStrategy -- create a BufferAccessStrategy object @@ -540,7 +562,6 @@ StrategyInitialize(bool init) BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype) { - BufferAccessStrategy strategy; int nbuffers; /* @@ -574,6 +595,23 @@ GetAccessStrategy(BufferAccessStrategyType btype) /* Make sure ring isn't an undue fraction of shared buffers */ nbuffers = Min(NBuffers / 8, nbuffers); + return GetAccessStrategyWithNBuffers(btype, nbuffers); +} + + +/* + * GetAccessStrategyWithNBuffers -- create a BufferAccessStrategy object with + * space for the passed-in number of buffers + * + * Also used as a helper for the other GetAccessStrategy* methods. + */ +BufferAccessStrategy +GetAccessStrategyWithNBuffers(BufferAccessStrategyType btype, int nbuffers) +{ + BufferAccessStrategy strategy; + + Assert(nbuffers > 0); + /* Allocate the object and initialize all elements to zeroes */ strategy = (BufferAccessStrategy) palloc0(offsetof(BufferAccessStrategyData, buffers) + @@ -586,6 +624,56 @@ GetAccessStrategy(BufferAccessStrategyType btype) return strategy; } + +/* + * GetAccessStrategyWithSize -- create a BufferAccessStrategy object with a + * number of buffers equivalent to the passed in size + */ +BufferAccessStrategy +GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size) +{ + int nbuffers; + int clamped_nbuffers; + + /* Default nbuffers should have resulted in calling GetAccessStrategy() */ + Assert(ring_size >= 0); + + if (ring_size == 0) + return NULL; + + Assert(ring_size <= MAX_BAS_RING_SIZE_KB); + + if (ring_size < MIN_BAS_RING_SIZE_KB) + { + ereport(DEBUG1, + (errmsg_internal("Buffer Access Strategy ring_size %d kB has been clamped to minimum %d kB", + ring_size, + MIN_BAS_RING_SIZE_KB))); + + nbuffers = bufsize_limit_to_nbuffers(MIN_BAS_RING_SIZE_KB); + } + else + nbuffers = bufsize_limit_to_nbuffers(ring_size); + + clamped_nbuffers = Min(NBuffers / 8, nbuffers); + + /* + * Though default GetAccessStrategy() may also clamp the number of + * buffers, only bother warning the user when the input size was + * user-specified. + */ + if (clamped_nbuffers < nbuffers) + ereport(DEBUG1, + (errmsg_internal("active Buffer Access Strategy may use a maximum of %d buffers. %d has been clamped", + NBuffers / 8, + nbuffers))); + + nbuffers = clamped_nbuffers; + + return GetAccessStrategyWithNBuffers(btype, nbuffers); +} + + /* * FreeAccessStrategy -- release a BufferAccessStrategy object * diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c index 1b1d814254..6eca3371bd 100644 --- a/src/backend/utils/init/globals.c +++ b/src/backend/utils/init/globals.c @@ -139,6 +139,8 @@ int max_worker_processes = 8; int max_parallel_workers = 8; int MaxBackends = 0; +int vacuum_buffer_usage_limit = -1; + int VacuumCostPageHit = 1; /* GUC parameters for vacuum */ int VacuumCostPageMiss = 2; int VacuumCostPageDirty = 20; diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index 8062589efd..5476c0f4aa 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -2224,6 +2224,17 @@ struct config_int ConfigureNamesInt[] = NULL, NULL, NULL }, + { + {"vacuum_buffer_usage_limit", PGC_USERSET, RESOURCES_MEM, + gettext_noop("Sets the buffer pool size for VACUUM and autovacuum."), + NULL, + GUC_UNIT_KB + }, + &vacuum_buffer_usage_limit, + -1, -1, MAX_BAS_RING_SIZE_KB, + NULL, NULL, NULL + }, + { {"shared_memory_size", PGC_INTERNAL, PRESET_OPTIONS, gettext_noop("Shows the size of the server's main shared memory area (rounded up to the nearest MB)."), diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index ee49ca3937..91599a4975 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -157,6 +157,10 @@ # mmap # (change requires restart) #min_dynamic_shared_memory = 0MB # (change requires restart) +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring. + # -1 to use default, + # 0 to disable vacuum buffer access strategy + # > 0 to specify size # - Disk - diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index e38a49e8bd..26947f7928 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -4620,7 +4620,7 @@ psql_completion(const char *text, int start, int end) "DISABLE_PAGE_SKIPPING", "SKIP_LOCKED", "INDEX_CLEANUP", "PROCESS_MAIN", "PROCESS_TOAST", "TRUNCATE", "PARALLEL", "SKIP_DATABASE_STATS", - "ONLY_DATABASE_STATS"); + "ONLY_DATABASE_STATS", "BUFFER_USAGE_LIMIT"); else if (TailMatches("FULL|FREEZE|ANALYZE|VERBOSE|DISABLE_PAGE_SKIPPING|SKIP_LOCKED|PROCESS_MAIN|PROCESS_TOAST|TRUNCATE|SKIP_DATABASE_STATS|ONLY_DATABASE_STATS")) COMPLETE_WITH("ON", "OFF"); else if (TailMatches("INDEX_CLEANUP")) diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index bdfd96cfec..5f2a58b2c3 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -236,6 +236,7 @@ typedef struct VacuumParams * disabled. */ int nworkers; + int ring_size; } VacuumParams; /* diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h index 06a86f9ac1..b572dfcc6c 100644 --- a/src/include/miscadmin.h +++ b/src/include/miscadmin.h @@ -262,6 +262,7 @@ extern PGDLLIMPORT int work_mem; extern PGDLLIMPORT double hash_mem_multiplier; extern PGDLLIMPORT int maintenance_work_mem; extern PGDLLIMPORT int max_parallel_maintenance_workers; +extern PGDLLIMPORT int vacuum_buffer_usage_limit; extern PGDLLIMPORT int VacuumCostPageHit; extern PGDLLIMPORT int VacuumCostPageMiss; diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h index 73762cb1ec..41329bf416 100644 --- a/src/include/storage/bufmgr.h +++ b/src/include/storage/bufmgr.h @@ -101,6 +101,9 @@ extern PGDLLIMPORT int32 *LocalRefCount; /* upper limit for effective_io_concurrency */ #define MAX_IO_CONCURRENCY 1000 +#define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024) +#define MIN_BAS_RING_SIZE_KB 128 + /* special block number for ReadBuffer() */ #define P_NEW InvalidBlockNumber /* grow the file to get a new page */ @@ -194,7 +197,12 @@ extern Size BufferShmemSize(void); extern void AtProcExit_LocalBuffers(void); /* in freelist.c */ +extern int bas_nbuffers(BufferAccessStrategy strategy); + extern BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype); +extern BufferAccessStrategy GetAccessStrategyWithNBuffers(BufferAccessStrategyType btype, int nbuffers); +extern BufferAccessStrategy GetAccessStrategyWithSize(BufferAccessStrategyType btype, int nbuffers); + extern void FreeAccessStrategy(BufferAccessStrategy strategy); diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out index e5a312182e..7542129f52 100644 --- a/src/test/regress/expected/vacuum.out +++ b/src/test/regress/expected/vacuum.out @@ -350,6 +350,23 @@ SELECT t.relfilenode = :toast_filenode AS is_same_toast_filenode f (1 row) +-- BUFFER_USAGE_LIMIT option +VACUUM (BUFFER_USAGE_LIMIT '512 kB') vac_option_tab; +-- works with PARALLEL option +VACUUM (BUFFER_USAGE_LIMIT '512 kB', PARALLEL 2) vac_option_tab; +-- integer overflow error +VACUUM (BUFFER_USAGE_LIMIT 10000000000) vac_option_tab; +ERROR: value: "10000000000": is invalid for buffer_usage_limit +HINT: Value exceeds integer range. +-- value exceeds ring size max error +VACUUM (BUFFER_USAGE_LIMIT '17 GB') vac_option_tab; +ERROR: buffer_usage_limit for a vacuum must be between -1 and 16777216 +-- incompatible with VACUUM FULL error +VACUUM (BUFFER_USAGE_LIMIT '512 kB', FULL) vac_option_tab; +ERROR: BUFFER_USAGE_LIMIT cannot be specified for VACUUM FULL +-- incompatible with VACUUM ONLY_DATABASE_STATS error +VACUUM (BUFFER_USAGE_LIMIT '512 kB', ONLY_DATABASE_STATS) vac_option_tab; +ERROR: BUFFER_USAGE_LIMIT cannot be specified for VACUUM ONLY_DATABASE_STATS -- SKIP_DATABASE_STATS option VACUUM (SKIP_DATABASE_STATS) vactst; -- ONLY_DATABASE_STATS option diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql index a1fad43657..a3a99ff56c 100644 --- a/src/test/regress/sql/vacuum.sql +++ b/src/test/regress/sql/vacuum.sql @@ -272,6 +272,19 @@ SELECT t.relfilenode = :toast_filenode AS is_same_toast_filenode FROM pg_class c, pg_class t WHERE c.reltoastrelid = t.oid AND c.relname = 'vac_option_tab'; +-- BUFFER_USAGE_LIMIT option +VACUUM (BUFFER_USAGE_LIMIT '512 kB') vac_option_tab; +-- works with PARALLEL option +VACUUM (BUFFER_USAGE_LIMIT '512 kB', PARALLEL 2) vac_option_tab; +-- integer overflow error +VACUUM (BUFFER_USAGE_LIMIT 10000000000) vac_option_tab; +-- value exceeds ring size max error +VACUUM (BUFFER_USAGE_LIMIT '17 GB') vac_option_tab; +-- incompatible with VACUUM FULL error +VACUUM (BUFFER_USAGE_LIMIT '512 kB', FULL) vac_option_tab; +-- incompatible with VACUUM ONLY_DATABASE_STATS error +VACUUM (BUFFER_USAGE_LIMIT '512 kB', ONLY_DATABASE_STATS) vac_option_tab; + -- SKIP_DATABASE_STATS option VACUUM (SKIP_DATABASE_STATS) vactst; -- 2.37.2
From 46df3fa693ac9aabb8f3d4999aeeb08170dd049e Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Sun, 19 Mar 2023 18:00:28 -0400 Subject: [PATCH v9 2/2] Add buffer-usage-limit option to vacuumdb --- doc/src/sgml/ref/vacuumdb.sgml | 12 ++++++++++++ src/bin/scripts/vacuumdb.c | 23 +++++++++++++++++++++++ src/include/commands/vacuum.h | 3 +++ 3 files changed, 38 insertions(+) diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml index 74bac2d4ba..8cedef8d79 100644 --- a/doc/src/sgml/ref/vacuumdb.sgml +++ b/doc/src/sgml/ref/vacuumdb.sgml @@ -278,6 +278,18 @@ PostgreSQL documentation </listitem> </varlistentry> + <varlistentry> + <term><option>--buffer-usage-limit <replaceable class="parameter">buffer_usage_limit</replaceable></option></term> + <listitem> + <para> + The size of the ring buffer used for vacuuming. This size is used to + calculate the number of shared buffers which will be reused as part of + a <glossterm linkend="glossary-buffer-access-strategy">Buffer Access + Strategy</glossterm>. See <xref linkend="sql-vacuum"/>. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>-n <replaceable class="parameter">schema</replaceable></option></term> <term><option>--schema=<replaceable class="parameter">schema</replaceable></option></term> diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c index 39be265b5b..941eac7727 100644 --- a/src/bin/scripts/vacuumdb.c +++ b/src/bin/scripts/vacuumdb.c @@ -46,6 +46,7 @@ typedef struct vacuumingOptions bool process_main; bool process_toast; bool skip_database_stats; + char *buffer_usage_limit; } vacuumingOptions; /* object filter options */ @@ -123,6 +124,7 @@ main(int argc, char *argv[]) {"no-truncate", no_argument, NULL, 10}, {"no-process-toast", no_argument, NULL, 11}, {"no-process-main", no_argument, NULL, 12}, + {"buffer-usage-limit", required_argument, NULL, 13}, {NULL, 0, NULL, 0} }; @@ -147,6 +149,7 @@ main(int argc, char *argv[]) /* initialize options */ memset(&vacopts, 0, sizeof(vacopts)); vacopts.parallel_workers = -1; + vacopts.buffer_usage_limit = NULL; vacopts.no_index_cleanup = false; vacopts.force_index_cleanup = false; vacopts.do_truncate = true; @@ -266,6 +269,9 @@ main(int argc, char *argv[]) case 12: vacopts.process_main = false; break; + case 13: + vacopts.buffer_usage_limit = pg_strdup(optarg); + break; default: /* getopt_long already emitted a complaint */ pg_log_error_hint("Try \"%s --help\" for more information.", progname); @@ -343,6 +349,11 @@ main(int argc, char *argv[]) pg_fatal("cannot use the \"%s\" option with the \"%s\" option", "no-index-cleanup", "force-index-cleanup"); + /* buffer-usage-limit doesn't make sense with VACUUM FULL */ + if (vacopts.buffer_usage_limit && vacopts.full) + pg_fatal("cannot use the \"%s\" option with the \"%s\" option", + "buffer-usage-limit", "full"); + /* fill cparams except for dbname, which is set below */ cparams.pghost = host; cparams.pgport = port; @@ -550,6 +561,10 @@ vacuum_one_database(ConnParams *cparams, pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s", "--parallel", "13"); + if (vacopts->buffer_usage_limit && PQserverVersion(conn) < 160000) + pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s", + "--buffer-usage-limit", "16"); + /* skip_database_stats is used automatically if server supports it */ vacopts->skip_database_stats = (PQserverVersion(conn) >= 160000); @@ -1048,6 +1063,13 @@ prepare_vacuum_command(PQExpBuffer sql, int serverVersion, vacopts->parallel_workers); sep = comma; } + if (vacopts->buffer_usage_limit) + { + Assert(serverVersion >= 160000); + appendPQExpBuffer(sql, "%sBUFFER_USAGE_LIMIT '%s'", sep, + vacopts->buffer_usage_limit); + sep = comma; + } if (sep != paren) appendPQExpBufferChar(sql, ')'); } @@ -1111,6 +1133,7 @@ help(const char *progname) printf(_(" --force-index-cleanup always remove index entries that point to dead tuples\n")); printf(_(" -j, --jobs=NUM use this many concurrent connections to vacuum\n")); printf(_(" --min-mxid-age=MXID_AGE minimum multixact ID age of tables to vacuum\n")); + printf(_(" --buffer-usage-limit=BUFSIZE size of ring buffer used for vacuum\n")); printf(_(" --min-xid-age=XID_AGE minimum transaction ID age of tables to vacuum\n")); printf(_(" --no-index-cleanup don't remove index entries that point to dead tuples\n")); printf(_(" --no-process-main skip the main relation\n")); diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 5f2a58b2c3..d8fba51d82 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -213,6 +213,9 @@ typedef enum VacOptValue * * Note that at least one of VACOPT_VACUUM and VACOPT_ANALYZE must be set * in options. + * + * When adding a new VacuumParam member, consider adding it to vacuumdb as + * well. */ typedef struct VacuumParams { -- 2.37.2