On Wed, Nov 9, 2022 at 5:46 PM Andres Freund <and...@anarazel.de> wrote:
> > Putting all 3 together: doesn't it seem quite likely that the way that
> > we compute OldestXmin is the factor that prevents "skewering" of an
> > update chain? What else could possibly be preventing corruption here?
> > (Theoretically it might never have been discovered, but that seems
> > pretty hard to believe.)
>
> I don't see how that follows. The existing code is just ok with that.

My remarks about "3 facts we agree on" were not intended to be a
watertight argument. More like: what else could it possibly be that
prevents problems in practice, if not *something* to do with how we
compute OldestXmin?

Leaving aside the specifics of how OldestXmin is computed for a
moment: what alternative explanation is even remotely plausible? There
just aren't that many moving parts involved here. The idea that we can
ever freeze the xmin of a successor tuple/version from an update chain
without also pruning away earlier versions of the same chain is wildly
implausible. It sounds totally contradictory.

> In fact
> we have explicit code trying to exploit this:
>
>     /*
>      * If the DEAD tuple is at the end of the chain, the entire chain is
>      * dead and the root line pointer can be marked dead.  Otherwise just
>      * redirect the root to the correct chain member.
>      */
>     if (i >= nchain)
>         heap_prune_record_dead(prstate, rootoffnum);
>     else
>         heap_prune_record_redirect(prstate, rootoffnum, chainitems[i]);

I don't see why this code is relevant.

--
Peter Geoghegan