Re: [HACKERS] pg_upgrade and rsync

Stephen Frost Tue, 27 Jan 2015 07:53:57 -0800

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Robert Haas <robertmh...@gmail.com> writes:
> > On Tue, Jan 27, 2015 at 9:50 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> That's certainly impossible for the system catalogs, which means you
> >> have to be able to deal with relfilenode discrepancies for them, which
> >> means that maintaining the same relfilenodes for user tables is of
> >> dubious value.
> 
> > Why is that impossible for the system catalogs?
> 
> New versions aren't guaranteed to have the same system catalogs, let alone
> the same relfilenodes for them.


Indeed, new versions almost certainly have wholly new system catalogs.

While there might be a reason to keep the relfilenodes the same, it
doesn't actually help with the pg_upgrade use-case we're currently
discussing (at least, not without additional help).  The problem is that
we certainly must transfer all the new catalogs, but how would rsync
know that those catalog files have to be transferred but not the user
relations?  Using --size-only would mean that system catalogs whose
sizes happen to match after the upgrade wouldn't be transferred and that
would certainly lead to a corrupt situation.
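
To make the --size-only concern concrete, here is a rough Python sketch
of the skip decision.  It only approximates rsync's quick check (size
plus modification time, truncated here to whole seconds), and
needs_transfer is a hypothetical name, not anything from rsync or
pg_upgrade.

```python
import os


def needs_transfer(src, dst, size_only=False):
    """Approximate rsync's decision to send src over dst.

    Assumption: the default quick check compares size and mtime, and
    --size-only compares size alone. Real rsync handles more cases.
    """
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    if s.st_size != d.st_size:
        return True
    if size_only:
        # Same size: skipped, even if the contents differ.
        return False
    # Whole-second comparison is a simplification for this sketch.
    return int(s.st_mtime) != int(d.st_mtime)
```

A rewritten catalog file that happens to keep its old size returns False
with size_only=True, which is exactly the case that would leave the
remote side with stale catalogs.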

Andres proposed a helper script which would walk the entire tree on the
remote side and set all the timestamps there to match those on the
local side (prior to the pg_upgrade).  If all the relfilenodes remained
the same and the timestamps on the catalog tables all changed, then it
might work to do (without using --size-only):

stop-cluster
set-timestamp-script
pg_upgrade
rsync new_data_dir -> remote:existing_cluster

This would mean that any other files which happened to be changed by
pg_upgrade beyond the catalog tables would also get copied across.  The
issue that I see with that is that if the pg_upgrade process does touch
anything outside of the system catalogs, then its documented revert
mechanism (rename the control file and start the old cluster back up,
prior to having started the new cluster) wouldn't be valid.  Requiring
an extra script which runs around changing timestamps on files is a bit
awkward too, though I suppose possible, and then we'd also have to
document that this process only works with $version of pg_upgrade that
does the preservation of the relfilenodes.
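
For illustration, the timestamp helper could be sketched roughly as
below.  This is not Andres's script; collect_mtimes and apply_mtimes are
hypothetical names, and getting the mapping from the local host to the
remote one (and matching paths across the two clusters) is left out.

```python
import os


def collect_mtimes(root):
    """Map each file's path (relative to root) to its mtime in ns."""
    mtimes = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            mtimes[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns
    return mtimes


def apply_mtimes(root, mtimes):
    """Set mtimes under root from the mapping; return paths not found."""
    missing = []
    for rel, mtime_ns in mtimes.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full):
            missing.append(rel)
            continue
        atime_ns = os.stat(full).st_atime_ns  # leave the access time alone
        os.utime(full, ns=(atime_ns, mtime_ns))
    return missing
```

The idea would be to run collect_mtimes on the local data directory
before pg_upgrade and apply_mtimes on the remote copy, so that only
files pg_upgrade later touches differ in timestamp.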

I suppose there's also technically a race condition to consider: if the
whole thing is scripted and pg_upgrade manages to change an existing
file in the same second that the old cluster last modified it, then
rsync wouldn't recognize that file as having been updated.  That's not
too hard to address, though: just wait a second somewhere in there.
Still, I'm not sure that this approach gains us much over the approach
that Bruce is proposing.
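
The "wait a second" guard could be as small as the following sketch.
It assumes timestamps are compared at one-second granularity (a
filesystem with coarser timestamps would need a longer wait), and
wait_for_next_second is a hypothetical name.

```python
import time


def wait_for_next_second(after):
    """Sleep until the clock's whole second is past that of `after`."""
    while int(time.time()) <= int(after):
        time.sleep(0.05)
```

Calling it with the time the old cluster was stopped, just before
running pg_upgrade, would ensure that anything pg_upgrade writes gets a
later whole-second timestamp.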

        Thanks,

                Stephen
