Re: visibility map corruption

Bruce Momjian Fri, 23 Jul 2021 17:09:12 -0700

On Thu, Jul  8, 2021 at 09:51:47AM -0400, Bruce Momjian wrote:
> On Thu, Jul  8, 2021 at 08:11:14AM -0500, Justin Pryzby wrote:
> > Also, the pg_upgrade status message still seems to be misplaced:
> > 
> > In [email protected], Justin Pryzby wrote:
> > > I re-arranged the pg_upgrade output of that patch: it was in the middle 
> > > of the
> > > two halves: "Setting next transaction ID and epoch for new cluster"
> > 
> > +++ b/src/bin/pg_upgrade/pg_upgrade.c
> > @@ -473,6 +473,12 @@ copy_xact_xlog_xid(void)
> >                           "\"%s/pg_resetwal\" -f -x %u \"%s\"",
> >                           new_cluster.bindir, 
> > old_cluster.controldata.chkpnt_nxtxid,
> >                           new_cluster.pgdata);
> > +       check_ok();
> > +       prep_status("Setting oldest XID for new cluster");
> > +       exec_prog(UTILITY_LOG_FILE, NULL, true, true,
> > +                         "\"%s/pg_resetwal\" -f -u %u \"%s\"",
> > +                         new_cluster.bindir, 
> > old_cluster.controldata.chkpnt_oldstxid,
> > +                         new_cluster.pgdata);
> >         exec_prog(UTILITY_LOG_FILE, NULL, true, true,
> >                           "\"%s/pg_resetwal\" -f -e %u \"%s\"",
> >                           new_cluster.bindir, 
> > old_cluster.controldata.chkpnt_nxtepoch,
> 
> Wow, you are 100% correct.  Updated patch attached.


OK, I have the patch ready to apply to all supported Postgres versions,
and it passes all my cross-version pg_upgrade tests.

However, I am now stuck on the commit message text, and I think this is
the point Peter Geoghegan was trying to make earlier --- while we know
that preserving the oldest xid in pg_control is the right thing to do,
and that setting it to the current xid - 2 billion (the old behavior)
causes vacuum freeze to run on all tables, but what else does this patch
affect?

As far as I know, seeing a very low oldest xid causes autovacuum to
check all objects and make sure their relfrozenxid is less then
autovacuum_freeze_max_age, but isn't that just a check?  Would that
cause any table scans?  I would think not.  And would this cause
incorrect truncation of pg_xact or fsm or vm files?  I would think not
too.

Even if the old and new cluster had mismatched autovacuum_freeze_max_age
values, I don't see how that would cause any corruption either.

I could perhaps see corruption happening if pg_control's oldest xid
value was closer to the current xid value than it should be, but I can't
see how having it 2-billion away could cause harm, unless perhaps
pg_upgrade itself used enough xids to cause the counter to wrap more
than 2^31 away from the oldest xid recorded in pg_control.

What I am basically asking is how to document this and what it fixes.

-- 
  Bruce Momjian  <[email protected]>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Re: visibility map corruption

Reply via email to