Hi! > 4 апр. 2019 г., в 20:15, Heikki Linnakangas <hlinn...@iki.fi> написал(а): > > On 25/03/2019 15:20, Heikki Linnakangas wrote: >> On 24/03/2019 18:50, Andrey Borodin wrote: >>> I was working on new version of gist check in amcheck and understand one >>> more thing: >>> >>> /* Can this page be recycled yet? */ >>> bool >>> gistPageRecyclable(Page page) >>> { >>> return PageIsNew(page) || >>> (GistPageIsDeleted(page) && >>> TransactionIdPrecedes(GistPageGetDeleteXid(page), >>> RecentGlobalXmin)); >>> } >>> >>> Here RecentGlobalXmin can wraparound and page will become unrecyclable for >>> half of xid cycle. Can we prevent it by resetting PageDeleteXid to >>> InvalidTransactionId before doing RecordFreeIndexPage()? >>> (Seems like same applies to GIN...) >> True, and B-tree has the same issue. I thought I saw a comment somewhere >> in the B-tree code about that earlier, but now I can't find it. I >> must've imagined it. >> We could reset it, but that would require dirtying the page. That would >> be just extra I/O overhead, if the page gets reused before XID >> wraparound. We could avoid that if we stored the full XID+epoch, not >> just XID. I think we should do that in GiST, at least, where this is >> new. In the B-tree, it would require some extra code to deal with >> backwards-compatibility, but maybe it would be worth it even there. > > I suggest that we do the attached. It fixes this for GiST. The patch changes > expands the "deletion XID" to 64-bits, and changes where it's stored. Instead > of storing it pd_prune_xid, it's stored in the page contents. Luckily, a > deleted page has no real content.
So, we store full xid right after page header? +static inline void +GistPageSetDeleteXid(Page page, FullTransactionId deletexid) +{ + Assert(PageIsEmpty(page)); + ((PageHeader) page)->pd_lower = MAXALIGN(SizeOfPageHeaderData) + sizeof(FullTransactionId); + + *((FullTransactionId *) PageGetContents(page)) = deletexid; +} Usually we leave one ItemId (located at invalid offset number) untouched. I do not know is it done for a reason or not.... Also, I did not understand this optimization: + /* + * We can skip this if the page was deleted so long ago, that no scan can possibly + * still see it, even in a standby. One measure might be anything older than the + * table's frozen-xid, but we don't have that at hand here. But anything older than + * 2 billion, from the next XID, is surely old enough, because you would hit XID + * wraparound at that point. + */ + nextxid = ReadNextFullTransactionId(); + diff = U64FromFullTransactionId(nextxid) - U64FromFullTransactionId(latestRemovedXid); + if (diff < 0x7fffffff) + return; Standby can be lagging months from primary, and, theoretically, close the gap in one sudden WAL leap... Also, I think, that comparison sign should be >, not <. Best regards, Andrey Borodin.