Hello hackers,

Here's a set of ideas that I think could get rid of wraparound freezes
from the traditional heap, using undo log technology.  They're inspired
by things that Heikki has said in various threads, adapted to our
proposed undo infrastructure.

1.  Don't freeze committed xids by brute force search.  Keep using the
same tuple header as today, but add a pair of 64 bit FullTransactionIds
to the page header (low fxid, high fxid) so that xids are not ambiguous
even after they wrap around.  If you ever find yourself updating high
fxid to a value that is too far ahead of low fxid, you need to do a
micro-freeze of the page, but you were already writing to the page so
that's cool.

2.  Get rid of aborted xids eagerly, instead of relying on brute force
scans to move the horizon.  Remove the xid references at rollback time
with the undo machinery we've built for zheap.  While zheap uses undo
records to roll back the effects of a transaction (reversing in-place
updates etc), these would be very simple undo records that just remove
item pointers relating to aborted transactions, so their xids vanish
from the heap.  Now the horizon for the oldest aborted xid that you can
find anywhere in the system is oldest-xid-having-undo, which is tracked
by the undo machinery.  You don't need to keep more clog than that
AFAIK, other than to support the txid_status() function.

3.  Don't freeze multixacts by brute force search.  Instead, invent 64
bit multixacts and track (low fmxid, high fmxid) and do micro-freezing
on the page when the range would be too wide, as we did in point 1 for
xids.

4.  Get rid of multixacts eagerly.  Move the contents of
pg_multixact/members into undo logs, using the new UNDO_SHARED records
that we invented at PGCon[1] for essentially the same purpose in zheap.
This is storage that is automatically cleaned up by a "discard worker"
when every member of a set of xids is no longer running (and it's a bit
like the "TED" storage that Heikki talked about a few years back[2]).
Keep pg_multixact/offsets, but change it to contain undo record
pointers that point to UNDO_SHARED records holding the members.
It is a map of multixact ID -> undo holding the members, and it needs
to exist only to preserve the 32 bit size of multixact IDs; it'd be
nicer to use the undo rec ptr directly, but the goal in this thought
experiment is to make minimal format changes to kill freezing (if you
want more drastic changes, see zheap).  Now you just have to figure out
how to trim pg_multixact/offsets, and I think that could be done
periodically by testing the oldest multixact it holds: has the undo
record it points to been discarded?  If so, we can trim this multixact.

Finding room for four 64 bit values in the page header is of course
tricky and incompatible with pg_upgrade, and hard to support
incrementally.  I also don't know exactly at which point you'd consider
high fxid in visibility computations, considering that in places where
you have a tuple pointer, you can't easily find the high fxid you need.
One cute but scary idea is that when you're scanning the heap you'd
non-durably clobber xmin and xmax with FrozenTransactionId if
appropriate.

[1] https://www.postgresql.org/message-id/ca+hukgkni7eeu4ft71vzccwpeagb2pqoekofjqjavknd577...@mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/55511D1F.7050902%40iki.fi

--
Thomas Munro
https://enterprisedb.com
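
P.S.  To make the arithmetic in point 1 a bit more concrete, here is a
minimal standalone sketch, not PostgreSQL code.  Every name in it
(PageFxidRange, widen_xid, needs_micro_freeze, MAX_PAGE_FXID_SPAN) is
hypothetical, the 2^31 limit is just an assumed threshold for "too far
ahead", and it ignores the special xid values (invalid, bootstrap,
frozen) that a real implementation would have to skip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the 32 bit tuple xid and the 64 bit fxid. */
typedef uint32_t Xid32;
typedef uint64_t FullXid;

/* Assumed limit on (high fxid - low fxid) before a micro-freeze is needed. */
#define MAX_PAGE_FXID_SPAN ((uint64_t) 1 << 31)

/* Hypothetical pair of values that point 1 would add to the page header. */
typedef struct PageFxidRange
{
	FullXid		low_fxid;
	FullXid		high_fxid;
} PageFxidRange;

/*
 * Widen a 32 bit xid found in a tuple header to 64 bits, using the page's
 * low fxid as the reference point.  This is only unambiguous if every xid
 * on the page lies in [low_fxid, low_fxid + 2^32), which the span limit
 * above is meant to guarantee.
 */
static FullXid
widen_xid(const PageFxidRange *range, Xid32 xid)
{
	/* Distance from the low 32 bits of low_fxid, computed modulo 2^32. */
	uint32_t	delta = xid - (Xid32) range->low_fxid;

	assert(range->low_fxid <= range->high_fxid);
	return range->low_fxid + delta;
}

/*
 * Would storing new_fxid on this page push the range past the assumed
 * limit?  If so, the writer must micro-freeze the page first (replace old
 * committed xids with the frozen marker and advance low_fxid).
 */
static bool
needs_micro_freeze(const PageFxidRange *range, FullXid new_fxid)
{
	return new_fxid > range->low_fxid &&
		new_fxid - range->low_fxid >= MAX_PAGE_FXID_SPAN;
}
```

For example, with low fxid = 0x1FFFFFFF0, a tuple xid of 0x10 has
wrapped past the 32 bit boundary and widens to 0x200000010.  The
writer would call needs_micro_freeze() before raising high fxid, since
it is already dirtying the page at that point.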