On Mon, Nov 7, 2022 at 5:20 PM Peter Geoghegan <p...@bowt.ie> wrote:

> Hi Hussein,
>
> Apologies for the very delayed response. I'm aware that you've taken
> an interest in this subject as part of your YouTube channel. Thanks
> for publicizing the work!
>
> On Tue, Jul 12, 2022 at 7:14 PM PG Doc comments form
> <nore...@postgresql.org> wrote:
> > Would be nice to add a note: old tuple versions in the index referencing the
> > same logical row cannot be deleted by bottom up index deletion process when
> > older transactions that might require the old state the row are still
> > running
>
> It's really hard to write documentation for something like this,
> because it's difficult to decide what your audience really needs to
> know. I agree that it's important to get this specific point across,
> though. In fact I thought that I already conveyed the same idea at
> this point:
>
> "All indexes will need a successor physical index tuple that points to
> the latest version in the table. Each new tuple within each index will
> generally need to coexist with the original “updated” tuple for a
> short period of time (typically until shortly after the UPDATE
> transaction commits)."
>
> The implication is that we need the old version to coexist until after
> the updater transaction commits and is seen by every possible MVCC
> snapshot as having committed -- nobody sees the old version anymore.
> Maybe we could augment the existing sentences I have highlighted?
> Could it be more explicit?
>
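
For readers without much MVCC background, the rule implied by the quoted paragraph can be sketched roughly as follows. This is a toy model in Python, not PostgreSQL code: `TupleVersion`, `removable`, and the plain-integer transaction IDs are assumptions made for illustration, and the sketch ignores xid wraparound and subtransactions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class TupleVersion:
    xmin: int             # xid that created this version
    xmax: Optional[int]   # xid that replaced or deleted it; None if still live


def removable(version: TupleVersion,
              committed: Set[int],
              snapshot_xmins: Iterable[int]) -> bool:
    """Return True if no open snapshot can still need `version`.

    `snapshot_xmins` holds, for each open snapshot, the oldest xid that was
    still running when that snapshot was taken (hypothetical simplification).
    """
    # A version that was never replaced, or whose updater has not committed
    # (still running or aborted), must stay.
    if version.xmax is None or version.xmax not in committed:
        return False
    # Every open snapshot must already see the updater as committed.
    return all(version.xmax < xmin for xmin in snapshot_xmins)


old = TupleVersion(xmin=100, xmax=105)

# An older transaction's snapshot (xmin 103) may still need the old version.
print(removable(old, {100, 105}, [103]))        # False
# Once every open snapshot started after xid 105 finished, it can go.
print(removable(old, {100, 105}, [106, 110]))   # True
# If the updating transaction has not committed, the old version stays.
print(removable(old, {100}, [106]))             # False
```

The first call is the case raised in the original doc comment: the updater has committed, but an older transaction is still running, so the old version has to coexist with its successor.
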
I'm having trouble finding any major issues with the present wording. It does, though, seem to assume the reader has enough MVCC knowledge to understand the import of "until shortly after the UPDATE transaction commits". Maybe a bit more explicitness is in order.

On the point of "will generally need to coexist" - I don't see why we are being wishy-washy here. When updating a row where bottom-up deletion is chosen, the most recent tuple cannot be removed to make room for the new tuple; in particular, because the current update may not commit.

I also don't understand how the bottom-up pass can know a tuple is safe to remove based upon visibility information when that information is not present in the index AND it doesn't rely upon LP_DEAD.

A bit nit-picky, but I think relevant to the above confusion: "B-Tree indexes incrementally delete" - is it really the index self-modifying, or is it an active user session taking some time to perform each pass? It could be described as, say:

"The updating session will locate all the logically equivalent tuples (on the same page) via the index and check them for global visibility, removing those that it finds that are both older than the most recent tuple and no longer visible to any other session."

David J.