On Wed, Apr 29, 2020 at 1:40 PM Peter Geoghegan <p...@bowt.ie> wrote:
> I'm not sure how low the costs would be, but at least we'd only have
> to do it once per already-deleted page (i.e. only the first time a
> VACUUM operation found _bt_page_eligible_for_recycling() returned true
> for the page and marked it recycled in a crash safe manner). That
> design would be quite a lot simpler, because it expresses the problem
> in terms that make sense to the nbtree code. _bt_getbuf() should not
> have to make a distinction between "possibly recycled" versus
> "definitely recycled".

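To make the quoted idea concrete, here is a rough standalone sketch. It
is not actual nbtree code: the SketchPage type, the state names, and
both functions are hypothetical, and the "eligible" argument merely
stands in for whatever _bt_page_eligible_for_recycling() would report.
The point is only that the mark is set at most once per deleted page,
and that the allocation side then tests a single unambiguous state.

```c
#include <stdbool.h>

/* Hypothetical page states; these are not the real nbtree page flags. */
typedef enum
{
    SKETCH_LIVE,
    SKETCH_DELETED,
    SKETCH_RECYCLED
} SketchPageState;

typedef struct
{
    SketchPageState state;
} SketchPage;

/*
 * What a VACUUM pass might do for one page. "eligible" is an assumed
 * stand-in for the result of _bt_page_eligible_for_recycling().
 * Returns true only when the page is newly marked, which can happen at
 * most once per already-deleted page.
 */
static bool
sketch_vacuum_visit(SketchPage *page, bool eligible)
{
    if (page->state == SKETCH_DELETED && eligible)
    {
        /* A real implementation would have to do this crash-safely. */
        page->state = SKETCH_RECYCLED;
        return true;
    }
    return false;
}

/*
 * Allocation side: a single test, with no "possibly recycled" versus
 * "definitely recycled" distinction left to make.
 */
static bool
sketch_can_reuse(const SketchPage *page)
{
    return page->state == SKETCH_RECYCLED;
}
```

A deleted page that is not yet eligible stays in SKETCH_DELETED and is
never handed out; only the explicit mark makes it reusable.
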
As a bonus, we now have an easy correctness cross-check: if some random
piece of nbtree code lands on a page (having followed a downlink or
sibling link) that is marked recycled, then clearly something is very
wrong -- throw a "can't happen" error. This would be especially useful
in places like _bt_readpage(), I suppose.

Think of extreme cases like cursors, which can have a scan that
remembers a block number of a leaf page, that only actually gets
accessed hours or days later (for whatever reason). If that code was
buggy in some way, we might have a hope of figuring it out at some
point with this cross-check.

-- 
Peter Geoghegan