Re: storing an explicit nonce

Bruce Momjian Tue, 01 Jun 2021 07:10:14 -0700

On Mon, May 31, 2021 at 04:16:52PM -0400, Stephen Frost wrote:
> Greetings,
> 
> * Andres Freund (and...@anarazel.de) wrote:
> > On 2021-05-27 17:00:23 -0400, Bruce Momjian wrote:
> > > If you go in that direction, you should make sure pg_upgrade preserves
> > > what you use (it does not preserve relfilenode, just pg_class.oid)
> > 
> > Is there a reason for pg_upgrade not to maintain relfilenode, aside from
> > implementation simplicity (which is a good reason!). The fact that the old 
> > and
> > new clusters have different relfilenodes does make inspecting some things a
> > bit harder.
> 
> This was discussed for a bit during the Unconference (though it was
> related to backups and major upgrades which involves replicas) and the
> general consensus seemed to be that, no, it wasn't for any specific
> reason beyond that pg_upgrade didn't need to preserve relfilenode and
> therefore didn't.


Yes, David Steele wanted it so incremental backups after pg_upgrade were
smaller, which makes sense.

> There was a discussion around if there were possibly any pitfalls that
> we might run into, should we try to have pg_upgrade preserve
> relfilenodes but I don't *think* there were any actual show stoppers
> that came up.  The simplest approach, I would think, would be to have it
> do the same thing that it does for OIDs today- basically have pg_dump in
> binary mode emit a function call to inform the backend of what
> relfilenode to use for the next CREATE statement.  We would need to also
> pass into that function if the table should have a TOAST table and what
> the relfilenode for that should be too, for the base table.  We'd need
> to also handle indexes, mat views, etc, of course.

Yes, exactly.  The pg_upgrade.c paragraph says:

         *  We control all assignments of pg_class.oid (and relfilenode) so 
toast
         *  oids are the same between old and new clusters.  This is important
         *  because toast oids are stored as toast pointers in user tables.
         *
         *  While pg_class.oid and pg_class.relfilenode are initially the same
         *  in a cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM
         *  FULL.  In the new cluster, pg_class.oid and pg_class.relfilenode 
will
         *  be the same and will match the old pg_class.oid value.  Because of
         *  this, old/new pg_class.relfilenode values will not match if CLUSTER,
         *  REINDEX, or VACUUM FULL have been performed in the old cluster.

One tricky case is pg_largeobject, which is copied from the old to new
cluster since it has user data.  To preserve that relfilenode, you would
need to have pg_upgrade perform cluster surgery in each database to
renumber its relfilenode to match since it is created by initdb.  I
can't think of a case where pg_upgrade already does something like that.

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Re: storing an explicit nonce

Reply via email to