On Fri, Jul 22, 2022 at 2:11 PM Bruce Momjian <br...@momjian.us> wrote:
> I have improved the wording of the last paragraph in this patch.

I think that it would be worth prominently explaining where heap-only tuples get their name from: by definition, there are never any entries for a heap-only tuple in any index. Index scans are nevertheless capable of locating heap-only tuples, by dealing with a little additional indirection: they must traverse groups of related tuple versions, all for the same logical row that was HOT updated one or more times -- this group of related tuples is called a HOT chain.

This seems like a useful thing to emphasize because it places the emphasis on what *doesn't* happen -- mostly, what doesn't happen in indexes.

New item identifiers actually *are* needed for heap-only tuples (perhaps we could get away without them, but we don't). However, that doesn't really matter too much in practice. Heap-only tuples can still have their line pointers set to LP_UNUSED directly during pruning, without having to be set to LP_DEAD for a time first (a situation which VACUUM alone can correct by setting the LP_DEAD items to LP_UNUSED during its second heap pass). So heap-only tuples "skip the step" where they have to become LP_DEAD stubs/tombstones, which is possible precisely because indexes don't need to be considered (they're "heap-only").

I agree that pruning should be discussed here, though -- I wouldn't go as far as treating pruning as 100% unrelated to HOT. Perhaps something along the lines of this works:

"It is possible for opportunistic pruning to completely remove all bloat caused by HOT updates (bloat from HOT chains), without leaving any residual garbage that only VACUUM is capable of cleaning up. Pruning a page affected by non-HOT updates or deletes is somewhat less effective, though, because small tombstone items (dead item identifiers) must remain until such time as VACUUM can verify that no remaining index tuples reference the items."

Again, the emphasis is on what *doesn't* have to happen because indexes aren't making life hard for us.
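
The LP_UNUSED versus LP_DEAD distinction can be illustrated with a toy Python model of a single heap page. This is only a sketch, not PostgreSQL code: `Item`, `prune`, and `vacuum_second_pass` are hypothetical names, the `dead` flag stands in for real visibility checks, and dead versions are assumed to form a prefix of each chain. The treatment of the chain root (turning it into an LP_REDIRECT item when a later chain member is still live) is not described in the text above and is included here as a simplification of what heap pruning does.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class LP(Enum):
    UNUSED = 0    # slot is free for reuse
    NORMAL = 1    # points at a tuple stored on the page
    REDIRECT = 2  # forwards to a later member of a HOT chain
    DEAD = 3      # stub/tombstone: index entries may still point here


@dataclass
class Item:
    state: LP = LP.NORMAL
    heap_only: bool = False         # True => no index entry references this item
    dead: bool = False              # toy stand-in for "invisible to everyone"
    next_idx: Optional[int] = None  # next version in the same HOT chain


def prune(page: List[Item]) -> None:
    """Toy opportunistic pruning of one page; VACUUM is not involved."""
    for root_idx, root in enumerate(page):
        # Chains are only entered through items that indexes can point at.
        if root.state is not LP.NORMAL or root.heap_only:
            continue
        # Collect the chain: the root followed by its heap-only successors.
        members, idx = [], root_idx
        while idx is not None:
            members.append(idx)
            idx = page[idx].next_idx
        # Assumption: dead versions are a prefix of the chain.
        dead_prefix = []
        for idx in members:
            if not page[idx].dead:
                break
            dead_prefix.append(idx)
        if not dead_prefix:
            continue
        # Dead heap-only members have no index entries: free them directly.
        for idx in dead_prefix[1:]:
            page[idx].state = LP.UNUSED
            page[idx].next_idx = None
        if len(dead_prefix) < len(members):
            # A live version remains: the root just forwards to it.
            root.state = LP.REDIRECT
            root.next_idx = members[len(dead_prefix)]
        else:
            # Nothing live remains, but indexes may still reference the root.
            root.state = LP.DEAD
            root.next_idx = None


def vacuum_second_pass(page: List[Item]) -> None:
    """Toy VACUUM step; assumes matching index entries were already removed."""
    for item in page:
        if item.state is LP.DEAD:
            item.state = LP.UNUSED


# One row, HOT updated twice: only the newest version is still live.
hot_page = [
    Item(dead=True, next_idx=1),
    Item(heap_only=True, dead=True, next_idx=2),
    Item(heap_only=True),
]
# One row, non-HOT updated once: both versions have their own index entries.
non_hot_page = [Item(dead=True), Item()]

prune(hot_page)
prune(non_hot_page)
print([item.state.name for item in hot_page])      # ['REDIRECT', 'UNUSED', 'NORMAL']
print([item.state.name for item in non_hot_page])  # ['DEAD', 'NORMAL']
```

In this model the dead heap-only version is freed by pruning alone, while the dead version left by the non-HOT update stays behind as an LP_DEAD stub until `vacuum_second_pass` runs.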
From the point of view of indexes, ignorance is bliss.

The really important point about pruning and HOT is that it becomes possible (with care from the DBA and application) to practically eliminate the role of VACUUM. Under ideal conditions, we may not even require a little help from VACUUM.

--
Peter Geoghegan