On Fri, Nov 12, 2021 at 10:45 AM Peter Geoghegan <p...@bowt.ie> wrote:
> Let's assume that somehow I have it wrong. Even then, why should we
> compensate like this for the stats collector, but not for VACUUM?
> There's certainly no corresponding code in vacuumlazy.c that does a
> similar transformation with ndeleted.

I think that I figured it out myself, but it's very confusing. Could
definitely do with a better explanatory comment. Here's what I think
is actually going on here:

We compensate here precisely because we are not running in VACUUM (it
has to be an opportunistic prune in practice).

If we're running in VACUUM, then we are justified in ignoring
newly-pruned LP_DEAD items, because:

1.) They're going to be turned into LP_UNUSED items in the same VACUUM
anyway (barring edge-cases).

2.) Newly-pruned LP_DEAD items are really no different to existing
LP_DEAD items that VACUUM found on the page before its own pruning
operation even began -- if VACUUM wants to count them or report on
them at all, then it had better count them itself, after pruning
(which it only began to do in Postgres 14).

If we're not running in VACUUM, and have to make a statistics
collector call, then we don't want to forget about DEAD tuples that
were pruned-away (i.e. no longer have tuple storage) when they still
have an LP_DEAD stub item. There is obviously no justification for
just ignoring LP_DEAD items there, because we don't know when VACUUM
is going to run next (since we are not VACUUM).

Does that sound correct?

--
Peter Geoghegan
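
The accounting rule described above can be sketched as a small toy model. This is only an illustration of the reasoning, not the actual pruning code: PruneResult, its fields, and dead_tuples_to_report() are hypothetical names, and the real code works on heap pages and reports to the stats collector rather than returning a number.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of one pruning operation on a single heap page. */
typedef struct PruneResult
{
    int ndeleted;    /* tuples whose storage was removed by this prune */
    int nnewlpdead;  /* items newly set to LP_DEAD (stubs left behind) */
} PruneResult;

/*
 * How many dead tuples the stats collector should be told have gone away.
 *
 * Assumption: in VACUUM nothing is reported from here, because VACUUM
 * does its own counting after pruning. In an opportunistic prune, items
 * that still have an LP_DEAD stub are subtracted, so that they keep
 * counting as dead until some later VACUUM makes them LP_UNUSED.
 */
static int
dead_tuples_to_report(PruneResult r, bool in_vacuum)
{
    assert(r.ndeleted >= 0 && r.nnewlpdead >= 0);

    if (in_vacuum)
        return 0;

    /* Never report a negative delta. */
    if (r.ndeleted > r.nnewlpdead)
        return r.ndeleted - r.nnewlpdead;

    return 0;
}
```

With ndeleted = 5 and nnewlpdead = 2, an opportunistic prune would report 3 under this model, and the 2 stub items stay on the books as dead. The same prune inside VACUUM reports nothing from this path.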