On Fri, Apr 16, 2021 at 1:16 PM Peter Geoghegan <p...@bowt.ie> wrote:
> I'm not sure what to do, though. Both the INDEX_CLEANUP = off case and
> the failsafe case are only intended for emergencies. And it's hard to
> know what to do in a code path that is designed to rarely or never be
> used.
How about just documenting it in comments, as in the attached patch? I
tried to address all of the issues with LP_DEAD accounting together:
both the issue raised by Masahiko, and one or two others that were also
discussed recently on other threads. They all seem related to me.

I didn't address the INDEX_CLEANUP = off case in the comments directly
(I only addressed the failsafe case). There is no good reason to think
that the situation will resolve itself with INDEX_CLEANUP = off, so it
didn't seem wise to mention that case as well. But that should still be
okay -- INDEX_CLEANUP = off has practically been superseded by the
failsafe, which is much more flexible. And anybody who uses
INDEX_CLEANUP = off cannot expect to skip index cleanup indefinitely
without seriously bad consequences all over the place.

--
Peter Geoghegan
From 5d24338e90f0f3eec7134c328d40270f2ddddc50 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Fri, 9 Apr 2021 18:50:25 -0700
Subject: [PATCH] VACUUM accounting: Explain LP_DEAD accounting.

---
 src/backend/access/heap/vacuumlazy.c | 70 +++++++++++++++++++---------
 1 file changed, 47 insertions(+), 23 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 9f9ba5d308..ad37f25e2a 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -686,7 +686,16 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 						 new_min_multi,
 						 false);
 
-	/* report results to the stats collector, too */
+	/*
+	 * Report results to the stats collector, too.
+	 *
+	 * We deliberately avoid telling the stats collector about LP_DEAD items
+	 * that remain in the table when index/heap vacuuming has been bypassed by
+	 * the failsafe mechanism. Avoid behaving too aggressively once the
+	 * danger of wraparound failure subsides. Autoanalyze should notice any
+	 * remaining LP_DEAD items and schedule an autovacuum if nothing else
+	 * does.
+	 */
 	pgstat_report_vacuum(RelationGetRelid(rel),
 						 rel->rd_rel->relisshared,
 						 Max(new_live_tuples, 0),
@@ -1334,6 +1343,9 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
 		 */
 		lazy_scan_prune(vacrel, buf, blkno, page, vistest, &prunestate);
 
+		Assert(!prunestate.all_visible || !prunestate.has_lpdead_items);
+		Assert(!all_visible_according_to_vm || prunestate.all_visible);
+
 		/* Remember the location of the last page with nonremovable tuples */
 		if (prunestate.hastup)
 			vacrel->nonempty_pages = blkno + 1;
@@ -1404,7 +1416,6 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
 		 * Handle setting visibility map bit based on what the VM said about
 		 * the page before pruning started, and using prunestate
 		 */
-		Assert(!prunestate.all_visible || !prunestate.has_lpdead_items);
 		if (!all_visible_according_to_vm && prunestate.all_visible)
 		{
 			uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
@@ -1737,19 +1748,22 @@ retry:
 		/*
 		 * LP_DEAD items are processed outside of the loop.
 		 *
-		 * Note that we deliberately don't set hastup=true in the case of an
-		 * LP_DEAD item here, which is not how lazy_check_needs_freeze() or
+		 * Our working assumption is that any LP_DEAD items we encounter here
+		 * will become LP_UNUSED inside lazy_vacuum_heap_page() before VACUUM
+		 * finishes.
+		 *
+		 * This is why the number of LP_DEAD items won't be reported to the
+		 * stats collector -- we simply won't leave any behind. (Actually,
+		 * cases where we bypass index vacuuming will in fact leave behind
+		 * LP_DEAD items, but that only happens in special circumstances. We
+		 * avoid trying to compensate for that in our later report to the
+		 * stats collector.)
+		 *
+		 * This is also why we deliberately don't set hastup=true in the case
+		 * of an LP_DEAD item. This is not how lazy_check_needs_freeze() or
 		 * count_nondeletable_pages() do it -- they only consider pages empty
 		 * when they only have LP_UNUSED items, which is important for
 		 * correctness.
-		 *
-		 * Our assumption is that any LP_DEAD items we encounter here will
-		 * become LP_UNUSED inside lazy_vacuum_heap_page() before we actually
-		 * call count_nondeletable_pages(). In any case our opinion of
-		 * whether or not a page 'hastup' (which is how our caller sets its
-		 * vacrel->nonempty_pages value) is inherently race-prone. It must be
-		 * treated as advisory/unreliable, so we might as well be slightly
-		 * optimistic.
 		 */
 		if (ItemIdIsDead(itemid))
 		{
@@ -1901,9 +1915,6 @@ retry:
 	 * that will need to be vacuumed in indexes later, or a LP_NORMAL tuple
 	 * that remains and needs to be considered for freezing now (LP_UNUSED and
 	 * LP_REDIRECT items also remain, but are of no further interest to us).
-	 *
-	 * Add page level counters to caller's counts, and then actually process
-	 * LP_DEAD and LP_NORMAL items.
 	 */
 
 	vacrel->offnum = InvalidOffsetNumber;
@@ -1988,13 +1999,6 @@ retry:
 	}
 #endif
 
-	/* Add page-local counts to whole-VACUUM counts */
-	vacrel->tuples_deleted += tuples_deleted;
-	vacrel->lpdead_items += lpdead_items;
-	vacrel->new_dead_tuples += new_dead_tuples;
-	vacrel->num_tuples += num_tuples;
-	vacrel->live_tuples += live_tuples;
-
 	/*
 	 * Now save details of the LP_DEAD items from the page in the dead_tuples
 	 * array. Also record that page has dead items in per-page prunestate.
@@ -2021,6 +2025,13 @@ retry:
 		pgstat_progress_update_param(PROGRESS_VACUUM_NUM_DEAD_TUPLES,
 									 dead_tuples->num_tuples);
 	}
+
+	/* Finally, add page-local counts to whole-VACUUM counts */
+	vacrel->tuples_deleted += tuples_deleted;
+	vacrel->lpdead_items += lpdead_items;
+	vacrel->new_dead_tuples += new_dead_tuples;
+	vacrel->num_tuples += num_tuples;
+	vacrel->live_tuples += live_tuples;
 }
 
 /*
@@ -2095,6 +2106,12 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
 	 * not exceed 32MB. This limits the risk that we will bypass index
 	 * vacuuming again and again until eventually there is a VACUUM whose
 	 * dead_tuples space is not CPU cache resident.
+	 *
+	 * We don't take any special steps to remember the LP_DEAD items (such
+	 * as counting them in new_dead_tuples report to the stats collector)
+	 * when the optimization is applied. The tests that we apply should
+	 * avoid any noticeable disagreements between acquire_sample_rows()
+	 * and the accounting we perform in lazy_scan_prune().
 	 */
 	threshold = (double) vacrel->rel_pages * BYPASS_THRESHOLD_PAGES;
 	do_bypass_optimization =
@@ -2146,7 +2163,8 @@ lazy_vacuum(LVRelState *vacrel, bool onecall)
 	}
 
 	/*
-	 * Forget the now-vacuumed tuples -- just press on
+	 * Forget the LP_DEAD items that we just vacuumed (or just decided to not
+	 * vacuum)
 	 */
 	vacrel->dead_tuples->num_tuples = 0;
 }
@@ -3101,6 +3119,12 @@ lazy_cleanup_one_index(Relation indrel, IndexBulkDeleteResult *istat,
  *
  * Also don't attempt it if wraparound failsafe is in effect. It's hard to
  * predict how long lazy_truncate_heap will take. Don't take any chances.
+ * There is very little chance of truncation working out when the failsafe is
+ * in effect in any case. It is very likely that pages that lazy_scan_prune
+ * thought were !hastup before the failsafe was in effect won't be considered
+ * !hastup by count_nondeletable_pages now. Note that lazy_scan_prune makes
+ * the optimistic assumption that any LP_DEAD items it encounters will always
+ * be LP_UNUSED by the time we're called.
  *
  * Also don't attempt it if we are doing early pruning/vacuuming, because a
  * scan which cannot find a truncated heap page cannot determine that the
-- 
2.27.0