On Thu, Apr 16, 2020 at 11:27 AM Andres Freund <and...@anarazel.de> wrote: > Sure, there is some pre-existing wraparound danger for individual > pages. But it's a pretty narrow corner case before INDEX_CLEANUP > off. > > That comment says something about "shared-memory free space map", making > it sound like any crash would loose the page. But it's a normal FSM > these days. Vacuum will insert the deleted page into the free space > map. So either the FSM would need to be corrupted to not find the > inserted page anymore, or the index would need to grow slow enough to > not use a page before the wraparound. And then such wrapped around xids > would exist on individual pages. Not on all deleted pages, like with > INDEX_CLEANUP false.
Is that really that narrow, even without "INDEX_CLEANUP false"? It's not as if the index needs to grow very slowly to have only very few page splits hour to hour (it depends on whether the inserts are random or not, and so on). Especially if you had a bulk DELETE affecting many rows, which is hardly that uncommon. Fundamentally, btvacuumpage() doesn't freeze 32-bit XIDs (from bpto.xact) when it recycles deleted pages. It simply puts them in the FSM without changing anything about the page itself. This means surprisingly little in the context of nbtree: the _bt_page_recyclable() XID check that takes place in btvacuumpage() also takes place in _bt_getbuf(), at the point where the page actually gets recycled by the client. That's not great. It wouldn't be so unreasonable if btvacuumpage() actually did freeze the bpto.xact value at the point where it puts the page in the FSM. It doesn't need to be crash safe; it can work as a hint. Maybe "freezing" is the wrong word (too much baggage). More like we'd have VACUUM represent that "this deleted B-Tree page is definitely not considered to still be a part of the tree by any possible other backend" using a page flag hint -- btvacuumpage() would "mark the deleted page as recyclable" explicitly. Note that we still need to keep the original bpto.xact XID around for _bt_log_reuse_page() (also, do we need to worry _bt_log_reuse_page() with a wrapped-around XID?). -- Peter Geoghegan