On Thu, Jul 8, 2021 at 09:51:47AM -0400, Bruce Momjian wrote: > On Thu, Jul 8, 2021 at 08:11:14AM -0500, Justin Pryzby wrote: > > Also, the pg_upgrade status message still seems to be misplaced: > > > > In 20210706190612.gm22...@telsasoft.com, Justin Pryzby wrote: > > > I re-arranged the pg_upgrade output of that patch: it was in the middle > > > of the > > > two halves: "Setting next transaction ID and epoch for new cluster" > > > > +++ b/src/bin/pg_upgrade/pg_upgrade.c > > @@ -473,6 +473,12 @@ copy_xact_xlog_xid(void) > > "\"%s/pg_resetwal\" -f -x %u \"%s\"", > > new_cluster.bindir, > > old_cluster.controldata.chkpnt_nxtxid, > > new_cluster.pgdata); > > + check_ok(); > > + prep_status("Setting oldest XID for new cluster"); > > + exec_prog(UTILITY_LOG_FILE, NULL, true, true, > > + "\"%s/pg_resetwal\" -f -u %u \"%s\"", > > + new_cluster.bindir, > > old_cluster.controldata.chkpnt_oldstxid, > > + new_cluster.pgdata); > > exec_prog(UTILITY_LOG_FILE, NULL, true, true, > > "\"%s/pg_resetwal\" -f -e %u \"%s\"", > > new_cluster.bindir, > > old_cluster.controldata.chkpnt_nxtepoch, > > Wow, you are 100% correct. Updated patch attached.
OK, I have the patch ready to apply to all supported Postgres versions, and it passes all my cross-version pg_upgrade tests. However, I am now stuck on the commit message text, and I think this is the point Peter Geoghegan was trying to make earlier --- while we know that preserving the oldest xid in pg_control is the right thing to do, and that setting it to the current xid - 2 billion (the old behavior) causes vacuum freeze to run on all tables, but what else does this patch affect? As far as I know, seeing a very low oldest xid causes autovacuum to check all objects and make sure their relfrozenxid is less then autovacuum_freeze_max_age, but isn't that just a check? Would that cause any table scans? I would think not. And would this cause incorrect truncation of pg_xact or fsm or vm files? I would think not too. Even if the old and new cluster had mismatched autovacuum_freeze_max_age values, I don't see how that would cause any corruption either. I could perhaps see corruption happening if pg_control's oldest xid value was closer to the current xid value than it should be, but I can't see how having it 2-billion away could cause harm, unless perhaps pg_upgrade itself used enough xids to cause the counter to wrap more than 2^31 away from the oldest xid recorded in pg_control. What I am basically asking is how to document this and what it fixes. -- Bruce Momjian <br...@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.