On Tue, Mar 14, 2017 at 3:10 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> We already have BTPageOpaqueData.btpo, a union whose contained type
> varies based on the page being dead. We could just do the same with
> some other field in that struct, and then store epoch there. Clearly
> nobody really cares about most data that remains on the page. Index
> scans just need to be able to land on it to determine that it's dead,
> and VACUUM needs to be able to determine whether or not there could
> possibly be such an index scan at the time it considers recycling.

ISTM that we need all of the fields within BTPageOpaqueData even for
dead pages, actually. The left links and right links still need to be
sane, and the flag bits are needed. Plus, the field that already stores
an XID is clearly necessary. Even if they weren't needed, it would
probably still be a good idea to keep them around for forensic
purposes.

However, the page header field pd_prune_xid is currently unused for
indexes, and is the same width as CheckPoint.nextXidEpoch (the extra
thing we might want to store -- the epoch). Maybe you could store the
epoch within that field when B-Tree VACUUM deletes a page, and then
compare it within _bt_page_recyclable(). That comparison would come
before the existing XID comparison in that function.

One nice thing about this idea is that pd_prune_xid will be all-zero
for index pages from the current format, so there is no need to take
special care to make sure that databases that have undergone
pg_upgrade don't break.

--
Peter Geoghegan
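
To make the pd_prune_xid idea above a bit more concrete, here is a rough, untested sketch written as a standalone model rather than a patch against nbtree. Everything in it is hypothetical: ModelDeletedPage, model_page_recyclable() and the field names are made up for illustration, and it assumes the recycling horizon is available to the caller as an (epoch, xid) pair, which is not something the existing code provides as far as I know.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Hypothetical stand-in for the two values a deleted B-Tree page would
 * carry under this proposal.  Not a real PostgreSQL struct.
 */
typedef struct ModelDeletedPage
{
    uint32_t      deletion_epoch; /* models the epoch stored in pd_prune_xid */
    TransactionId deletion_xid;   /* models the XID already stored in btpo */
} ModelDeletedPage;

/*
 * Models the recyclability test: the epoch comparison comes first, and
 * the XID comparison only decides the outcome when the epochs match.
 * Within a single epoch there is no wraparound, so a plain unsigned
 * comparison is enough in this model.
 */
static bool
model_page_recyclable(const ModelDeletedPage *page,
                      uint32_t horizon_epoch,
                      TransactionId horizon_xid)
{
    if (page->deletion_epoch < horizon_epoch)
        return true;            /* deleted in an earlier epoch */
    if (page->deletion_epoch > horizon_epoch)
        return false;           /* horizon is behind the deletion */

    return page->deletion_xid < horizon_xid;
}
```

In this model, a page deleted with a large XID in epoch 0 is still seen as recyclable once the horizon has moved into epoch 1, even though the horizon's 32-bit XID is numerically smaller. A page whose stored epoch is zero compares as older than any horizon in a later epoch.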