On Thu, Apr 4, 2024 at 11:13 AM Andres Freund <and...@anarazel.de> wrote: > The / 2 is to avoid causing unnecessarily frequent WAL flushes, right? If so, > should we apply that only if the ring the strategy doesn't use the > StrategyRejectBuffer() logic?
Hmm, I don't really know, but that sounds plausible. What do you think about the attached? > I think for VACUUM we should probably go a bit further. There's no comparable > L1/L2 issue, because the per-buffer processing + WAL insertion is a lot more > expensive, compared to a seqscan. I'd go or at lest 4x-8x. Alright what about this?
From 6dea2983abf8d608c34e02351d70694de99f25f2 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Thu, 4 Apr 2024 20:31:26 +1300 Subject: [PATCH 1/2] Allow BufferAccessStrategy to limit pin count. When pinning extra buffers to look ahead, users of a strategy are in danger of pinning a lot of the buffers in the ring, or even more than the ring size. For some strategies, that means "escaping" from the ring, and in others it means forcing dirty data to disk very frequently with associated WAL flushing. Since external code has no insight into any of that, allow individual strategy types to expose a clamp that should be applied when deciding how many buffers to pin at once. Discussion: https://postgr.es/m/CAAKRu_aJXnqsyZt6HwFLnxYEBgE17oypkxbKbT1t1geE_wvH2Q%40mail.gmail.com --- src/backend/storage/aio/read_stream.c | 5 ++++ src/backend/storage/buffer/freelist.c | 35 +++++++++++++++++++++++++++ src/include/storage/bufmgr.h | 1 + 3 files changed, 41 insertions(+) diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c index 4f21262ff5..eab87f6f73 100644 --- a/src/backend/storage/aio/read_stream.c +++ b/src/backend/storage/aio/read_stream.c @@ -419,6 +419,7 @@ read_stream_begin_relation(int flags, size_t size; int16 queue_size; int16 max_ios; + int strategy_pin_limit; uint32 max_pinned_buffers; Oid tablespace_id; SMgrRelation smgr; @@ -460,6 +461,10 @@ read_stream_begin_relation(int flags, max_pinned_buffers = Min(max_pinned_buffers, PG_INT16_MAX - io_combine_limit - 1); + /* Give the strategy a chance to limit the number of buffers we pin. */ + strategy_pin_limit = GetAccessStrategyPinLimit(strategy); + max_pinned_buffers = Min(strategy_pin_limit, max_pinned_buffers); + /* Don't allow this backend to pin more than its share of buffers. */ if (SmgrIsTemp(smgr)) LimitAdditionalLocalPins(&max_pinned_buffers); diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c index 3611357fa3..c69590d6d8 100644 --- a/src/backend/storage/buffer/freelist.c +++ b/src/backend/storage/buffer/freelist.c @@ -629,6 +629,41 @@ GetAccessStrategyBufferCount(BufferAccessStrategy strategy) return strategy->nbuffers; } +/* + * GetAccessStrategyPinLimit -- get cap of number of buffers that can be pinned + * + * Strategies can specify the maximum number of buffers that a user should pin + * at once when performing look-ahead. Callers should combine this number with + * other relevant limits and take the minimum. + */ +int +GetAccessStrategyPinLimit(BufferAccessStrategy strategy) +{ + if (strategy == NULL) + return NBuffers; + + switch (strategy->btype) + { + case BAS_BULKREAD: + + /* + * Since BAS_BULKREAD uses StrategyRejectBuffer(), dirty buffers + * shouldn't be a problem and the caller is free to pin up to the + * entire ring at once. + */ + return strategy->nbuffers; + + default: + + /* + * Tell call not to pin more than half the buffers in the ring. + * This is a trade-off between look ahead distance and deferring + * writeback and associated WAL traffic. + */ + return strategy->nbuffers / 2; + } +} + /* * FreeAccessStrategy -- release a BufferAccessStrategy object * diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h index f380f9d9a6..07ba1a6050 100644 --- a/src/include/storage/bufmgr.h +++ b/src/include/storage/bufmgr.h @@ -318,6 +318,7 @@ extern BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype); extern BufferAccessStrategy GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size_kb); extern int GetAccessStrategyBufferCount(BufferAccessStrategy strategy); +extern int GetAccessStrategyPinLimit(BufferAccessStrategy strategy); extern void FreeAccessStrategy(BufferAccessStrategy strategy); -- 2.39.3 (Apple Git-146)
From 1488b8f56b9aa342ab3164135afffe9805e562c6 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Thu, 4 Apr 2024 21:11:06 +1300 Subject: [PATCH 2/2] Increase default vacuum_buffer_usage_limit to 2MB. The BAS_VACUUM ring size has been 256kB since commit d526575f. Commit 1cbbee03 made it configurable but retained the traditional default. The correct default size has been debated for years, but 256kB is certainly very small. VACUUM soon needs to write back data it dirtied only 32 blocks ago, which usually requires flushing the WAL. New experiments in prefetching pages for VACUUM exacerbated the problem by crashing into dirty data even sooner. Let's make the default 2MB. That's 1.5% of the default toy buffer pool size, and 0.2% of 1GB, which would be a considered a small shared_buffers setting for a real system these days. Users are still free to set the GUC to a different value. Discussion: https://postgr.es/m/20240403221257.md4gfki3z75cdyf6%40awork3.anarazel.de Discussion: https://postgre.es/m/CA%2BhUKGLY4Q4ZY4f1rvnFtv6%2BPkjNf8MejdPkcju3Qii9DYqqcQ%40mail.gmail.com --- doc/src/sgml/config.sgml | 2 +- src/backend/storage/buffer/freelist.c | 2 +- src/backend/utils/misc/guc_tables.c | 2 +- src/backend/utils/misc/postgresql.conf.sample | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 624518e0b0..d8e1282e12 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1961,7 +1961,7 @@ include_dir 'conf.d' valid sizes range from <literal>128 kB</literal> to <literal>16 GB</literal>. If the specified size would exceed 1/8 the size of <varname>shared_buffers</varname>, the size is silently capped - to that value. The default value is <literal>256 kB</literal>. If + to that value. The default value is <literal>2MB</literal>. If this value is specified without units, it is taken as kilobytes. This parameter can be set at any time. It can be overridden for <xref linkend="sql-vacuum"/> and <xref linkend="sql-analyze"/> diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c index c69590d6d8..65a6b3b357 100644 --- a/src/backend/storage/buffer/freelist.c +++ b/src/backend/storage/buffer/freelist.c @@ -561,7 +561,7 @@ GetAccessStrategy(BufferAccessStrategyType btype) ring_size_kb = 16 * 1024; break; case BAS_VACUUM: - ring_size_kb = 256; + ring_size_kb = 2048; break; default: diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index c12784cbec..7d4e4387cf 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -2275,7 +2275,7 @@ struct config_int ConfigureNamesInt[] = GUC_UNIT_KB }, &VacuumBufferUsageLimit, - 256, 0, MAX_BAS_VAC_RING_SIZE_KB, + 2048, 0, MAX_BAS_VAC_RING_SIZE_KB, check_vacuum_buffer_usage_limit, NULL, NULL }, diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index baecde2841..2166ea4a87 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -166,7 +166,7 @@ # mmap # (change requires restart) #min_dynamic_shared_memory = 0MB # (change requires restart) -#vacuum_buffer_usage_limit = 256kB # size of vacuum and analyze buffer access strategy ring; +#vacuum_buffer_usage_limit = 2MB # size of vacuum and analyze buffer access strategy ring; # 0 to disable vacuum buffer access strategy; # range 128kB to 16GB -- 2.39.3 (Apple Git-146)