On Sun, Nov 17, 2024 at 01:50:53PM -0500, Greg Sabino Mullane wrote: > On Wed, Nov 6, 2024 at 5:07 PM Nathan Bossart <nathandboss...@gmail.com> > wrote: >> Therefore, it can be much faster to instead move the entire data directory >> from the old cluster >> to the new cluster and to then swap the catalog relation files. > > Thank you for breaking this up so clearly into separate commits. I think it > is a very interesting idea, and anything to speed up pg_upgrade is always > welcome. Some minor thoughts:
Thank you for reviewing! >> .. we don't really expect there to be directories within database > directories, >> so perhaps it would be better to either unconditionally rename or to fail. > Failure seems the best option here, so we can cleanly handle any future > cases in which we decide to put dirs in this directory. Good point. >> if (RelFileNumberIsValid(rfn)) >> { >> FileNameMap key; >> >> key.relfilenumber = (RelFileNumber) rfn; >> if (bsearch(&key, context->maps, context->size, >> sizeof(FileNameMap), FileNameMapCmp)) >> return 0; >> } >> >> snprintf(dst, sizeof(dst), "%s/%s", context->target, filename); >> if (rename(fname, dst) != 0) > > I'm not quite clear what we are doing here with falling through > for InvalidOid entries, could you explain? The idea is that if it looks like a data file that we might want to transfer (i.e., it starts with a RelFileNumber), we should consult our map to determine whether to move it. Otherwise, we want to unconditionally transfer it so that we always use the files generated during pg_restore in the new cluster (e.g., PG_VERSION and pg_filenode.map). In theory, this should result in the same end state as what --link mode does today (for the new cluster, at least). >> .. vm_must_add_frozenbit isn't handled yet. We could either disallow >> using catalog-swap mode if the upgrade involves versions older than v9.6 > > Yes, this. No need for more code to handle super old versions when other > options exist. I'm inclined to agree. >> with this problem is to introduce a special mode for "initdb --sync-only" >> that calls fsync() for everything _except_ the actual data files. If we >> fsync() the new catalog files as we move them into place, and if we assume >> that the old catalog files will have been properly synchronized before >> upgrading, there's no reason to synchronize them again at the end. > > Very cool approach! :) -- nathan