Hi,

The use of the ringbuffer in VACUUM often causes very substantial slowdowns.
The primary reason is that most/all of the buffers in the ringbuffer have typically been dirtied, with an associated WAL record, by the time VACUUM processed them. When we then reuse a buffer via the (quite small) ringbuffer, we need to flush the WAL up to that buffer's LSN before we can write it out and reclaim it. A synchronous WAL flush every few buffers ends up as a very significant bottleneck, unless you have local low-latency durable storage (i.e. local storage with some sort of power protection).

A secondary issue is that, when we end up doing multiple passes, we have to re-read data into shared_buffers that we just wrote out and evicted.

An example, on a local SSD with decently fast fdatasync ([1]): The table size is 3322MB, with ~10% updated tuples, ~30% deleted tuples, and a single index. m_w_m (maintenance_work_mem) is large enough to do this in one pass. I used pg_prewarm() on another relation to ensure the vacuumed table isn't in s_b (otherwise the ringbuffer isn't doing anything); a rough sketch of the measurement procedure is included below.

s_b     ringbuffer enabled   time      wal_syncs   wal_sync_time
128MB   1                    77797ms
128MB   0                    13676ms   241         2538ms
8GB     1                    72834ms   23976       51989ms
8GB     0                    9544ms    150         1634ms

See [2] for logs / stats of these runs. All the data here is in the OS page cache, so we don't even pay the real price for reading the data multiple times. On cloud hardware with higher fsync latency I've seen > 15x time differences between using the ringbuffers and avoiding them by using pg_prewarm.

Of course there's a good reason we have the ringbuffer - we don't want maintenance operations to completely disturb the buffer pool and harm the production workload. But if, e.g., the database isn't available due to anti-wraparound measures, there is no other workload to protect, and the ringbuffer just substantially delays the return to availability. Initial data loads could similarly benefit.

Therefore I'd like to add an option to the VACUUM command to disable the use of the ringbuffer. I'm not sure about the name yet.

I think we should auto-enable that mode once we've entered failsafe mode, similar to [auto]vacuum cost delays getting disabled (cf. lazy_check_wraparound_failsafe()). If things are bad enough that we're soon going to shut down, we want to be aggressive.
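For reference, here is a minimal sketch of roughly how each run above was measured. The relation names ("copytest_0" for the vacuumed table, "filler" for the relation used to evict it) and the FREEZE option are assumptions based on the log output in [2]; the pg_stat_wal reset/query is just one way to get at the wal_syncs / wal_sync_time numbers, and the timing columns need track_wal_io_timing = on.

  -- pg_prewarm is used to push the vacuumed table out of shared_buffers by
  -- reading an unrelated relation ("filler" is a placeholder name) that is
  -- at least as large as shared_buffers.
  CREATE EXTENSION IF NOT EXISTS pg_prewarm;
  SELECT pg_prewarm('filler');

  -- Reset WAL statistics, so pg_stat_wal afterwards reflects only the vacuum.
  SELECT pg_stat_reset_shared('wal');

  -- The run being measured; \timing in psql provides the elapsed time column.
  VACUUM (FREEZE, VERBOSE) copytest_0;

  -- wal_sync / wal_sync_time correspond to the last two columns of the
  -- summary table above.
  SELECT wal_records, wal_fpi, wal_bytes, wal_buffers_full,
         wal_write, wal_sync, wal_write_time, wal_sync_time, stats_reset
  FROM pg_stat_wal;

The "ringbuffer disabled" runs use the same procedure, just with VACUUM's use of the ringbuffer bypassed, i.e. what the proposed option would expose.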
Greetings,

Andres Freund

[1] according to pg_test_fsync:
    fdatasync                         769.189 ops/sec    1300 usecs/op

[2]

For the s_b 128MB case:

ringbuffers enabled:

2023-01-11 10:24:58.726 PST [355353][client backend][2/19:0][psql] INFO: aggressively vacuuming "postgres.public.copytest_0"
2023-01-11 10:26:19.488 PST [355353][client backend][2/19:0][psql] INFO: finished vacuuming "postgres.public.copytest_0": index scans: 1
    pages: 0 removed, 424975 remain, 424975 scanned (100.00% of total)
    tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
    removable cutoff: 2981, which was 0 XIDs old when operation ended
    new relfrozenxid: 2981, which is 102 XIDs ahead of previous value
    frozen: 424975 pages from table (100.00% of total) had 6666700 tuples frozen
    index scan needed: 424975 pages from table (100.00% of total) had 4325101 dead item identifiers removed
    index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
    I/O timings: read: 2284.292 ms, write: 4325.009 ms
    avg read rate: 83.203 MB/s, avg write rate: 83.199 MB/s
    buffer usage: 425044 hits, 860113 misses, 860074 dirtied
    WAL usage: 1709902 records, 434990 full page images, 2273501683 bytes
    system usage: CPU: user: 11.62 s, system: 11.86 s, elapsed: 80.76 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes  │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │          stats_reset          │
├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│     8092795 │  443356 │ 2999358740 │             1569 │     28651 │    27081 │       1874.391 │     59895.674 │ 2023-01-11 10:24:58.664859-08 │
└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘

ringbuffers disabled:

2023-01-11 10:23:05.081 PST [355054][client backend][2/19:0][psql] INFO: aggressively vacuuming "postgres.public.copytest_0"
2023-01-11 10:23:18.755 PST [355054][client backend][2/19:0][psql] INFO: finished vacuuming "postgres.public.copytest_0": index scans: 1
    pages: 0 removed, 424979 remain, 424979 scanned (100.00% of total)
    tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
    removable cutoff: 2879, which was 0 XIDs old when operation ended
    new relfrozenxid: 2879, which is 102 XIDs ahead of previous value
    frozen: 424979 pages from table (100.00% of total) had 6666700 tuples frozen
    index scan needed: 424979 pages from table (100.00% of total) had 4325176 dead item identifiers removed
    index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
    I/O timings: read: 1247.366 ms, write: 2888.756 ms
    avg read rate: 491.485 MB/s, avg write rate: 491.395 MB/s
    buffer usage: 424927 hits, 860242 misses, 860083 dirtied
    WAL usage: 1709918 records, 434994 full page images, 2273503049 bytes
    system usage: CPU: user: 5.42 s, system: 6.26 s, elapsed: 13.67 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes  │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │          stats_reset          │
├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│     8092963 │  443362 │ 2999373996 │           212190 │    212333 │      241 │       1209.516 │      2538.706 │ 2023-01-11 10:23:05.004783-08 │
└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘

For the s_b 8GB case:

ringbuffers enabled:

2023-01-11 10:04:12.479 PST [352665][client backend][2/19:0][psql] INFO: aggressively vacuuming "postgres.public.copytest_0"
2023-01-11 10:05:25.312 PST [352665][client backend][2/19:0][psql] INFO: finished vacuuming "postgres.public.copytest_0": index scans: 1
    pages: 0 removed, 424977 remain, 424977 scanned (100.00% of total)
    tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
    removable cutoff: 2675, which was 0 XIDs old when operation ended
    new relfrozenxid: 2675, which is 102 XIDs ahead of previous value
    frozen: 424977 pages from table (100.00% of total) had 6666700 tuples frozen
    index scan needed: 424977 pages from table (100.00% of total) had 4325066 dead item identifiers removed
    index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
    I/O timings: read: 2610.875 ms, write: 4177.842 ms
    avg read rate: 81.688 MB/s, avg write rate: 87.515 MB/s
    buffer usage: 523611 hits, 761552 misses, 815868 dirtied
    WAL usage: 1709910 records, 434992 full page images, 2273502729 bytes
    system usage: CPU: user: 11.00 s, system: 11.86 s, elapsed: 72.83 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes  │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │          stats_reset          │
├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│     8092707 │  443358 │ 2999354090 │            42259 │     66227 │    23976 │       2050.963 │     51989.099 │ 2023-01-11 10:04:12.404054-08 │
└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘

ringbuffers disabled:

2023-01-11 10:08:48.414 PST [353287][client backend][3/19:0][psql] INFO: aggressively vacuuming "postgres.public.copytest_0"
2023-01-11 10:08:57.956 PST [353287][client backend][3/19:0][psql] INFO: finished vacuuming "postgres.public.copytest_0": index scans: 1
    pages: 0 removed, 424977 remain, 424977 scanned (100.00% of total)
    tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
    removable cutoff: 2777, which was 0 XIDs old when operation ended
    new relfrozenxid: 2777, which is 102 XIDs ahead of previous value
    frozen: 424977 pages from table (100.00% of total) had 6666700 tuples frozen
    index scan needed: 424976 pages from table (100.00% of total) had 4325153 dead item identifiers removed
    index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
    I/O timings: read: 1040.230 ms, write: 0.000 ms
    avg read rate: 312.016 MB/s, avg write rate: 356.242 MB/s
    buffer usage: 904078 hits, 381084 misses, 435101 dirtied
    WAL usage: 1709908 records, 434992 full page images, 2273499663 bytes
    system usage: CPU: user: 5.57 s, system: 2.26 s, elapsed: 9.54 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes  │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │          stats_reset          │
├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│     8092933 │  443358 │ 2999364596 │           236354 │    236398 │      150 │       1166.314 │      1634.408 │ 2023-01-11 10:08:48.350328-08 │
└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘