Commit 2fd8685e7f simplified the checking of modified attributes that takes place within heap_update(). This included a micro-optimization affecting pages for which PageIsFull() returns true: when the target page is marked with PD_PAGE_FULL (which must have been set by a previous heap_update() call), don't even try to use HOT -- just assume that there is no chance of it working, in order to save the few cycles it takes to determine HOT safety.
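For reference, the test in question looks something like this. (This is just an abridged sketch from memory of the relevant heapam.c code -- the variable names match the real code, but plenty of surrounding detail is omitted, and the details may differ slightly.)

    /*
     * Sketch of heap_update()'s PD_PAGE_FULL micro-optimization.  If
     * the page was marked full by an earlier heap_update() call, don't
     * bother fetching the HOT-relevant attribute set at all, which
     * rules out a HOT update later on.  The comments here are what
     * claim that a pinned buffer cannot be cleaned up concurrently --
     * a claim I'll come back to below.
     */
    interesting_attrs = NULL;
    if (!PageIsFull(page))
    {
        /* Page not marked full: consider a HOT update as usual */
        interesting_attrs = bms_add_members(interesting_attrs, hot_attrs);
        hot_attrs_checked = true;
    }
    interesting_attrs = bms_add_members(interesting_attrs, key_attrs);
    interesting_attrs = bms_add_members(interesting_attrs, id_attrs);

    /* ... much later, when deciding whether the update can use HOT ... */
    if (newbuf == buffer)
    {
        /* HOT is only possible if we fetched the HOT attributes above */
        if (hot_attrs_checked && !bms_overlap(modified_attrs, hot_attrs))
            use_hot_update = true;
    }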
I doubt that this micro-optimization actually pays for itself, though. Plus heap_update() is already very complicated; do we really need to keep this special case? Its only benefit is that we occasionally avoid work that we do ~99% of the time anyway. The attached patch removes the micro-optimization entirely.

This isn't just refactoring work -- at least not to me. I'm also concerned that the test unnecessarily prevents HOT updates with certain workloads. There is a reasonable chance that the last updater couldn't fit a new version on the same heap page (and so marked the page PD_PAGE_FULL) at a point where the page *almost* had enough free space for *their* new tuple, just not quite. And so it's worth being open to the possibility that our own heap_update() call has a smaller new tuple than the first updater's, perhaps only by chance (or perhaps because the original updater couldn't use HOT precisely because their new tuple was unusually large). Not all tables naturally have equisized rows, obviously. And even when they do, we should still expect TOAST compression to create variation in the size of physical heap tuples (some toastable attributes have more entropy than others, making them less effective targets for compression, etc).

It's not just variability in the size of heap tuples, either. The comments describing the micro-optimization claim that there is no chance of cleanup happening concurrently, so that supposedly can't be a factor. But that's really not true anymore. It is still true that heap_update() holds a pin on the original page throughout, which blocks concurrent pruning (e.g., while it waits on a tuple heavyweight lock). But that in itself no longer means that nobody else can free up space on the page while heap_update() has dropped its buffer lock -- things have changed. Commit 8523492d4e taught VACUUM to set LP_DEAD line pointers to LP_UNUSED while holding only an exclusive lock (not a super-exclusive/cleanup lock) on the target heap page/buffer. That's enough to allow that kind of concurrent processing by VACUUM to go ahead (pruning excluded). And so PageGetHeapFreeSpace() can go from reporting that the page has 0 bytes of free space to reporting more than enough space for our new tuple, due only to concurrent activity by VACUUM (holding a pin won't prevent that anymore). This is not especially unlikely with a small table.

I think that it's possible that Tom would have found it easier to debug the issue that led to a PANIC inside heap_update() earlier this year (see commit 34f581c39e) without this micro-optimization in the mix. That bug was judged to be an old bug in heap_update(), but we only started to see PANICs once the aforementioned enhancement to VACUUM went in.

--
Peter Geoghegan
v1-0001-Don-t-avoid-applying-HOT-when-page-is-marked-full.patch
Description: Binary data