Re: optimize file transfer in pg_upgrade

Nathan Bossart Tue, 19 Nov 2024 10:20:20 -0800

On Sun, Nov 17, 2024 at 01:50:53PM -0500, Greg Sabino Mullane wrote:
> On Wed, Nov 6, 2024 at 5:07 PM Nathan Bossart <[email protected]>
> wrote:
>> Therefore, it can be much faster to instead move the entire data directory
>> from the old cluster
>> to the new cluster and to then swap the catalog relation files.
> 
> Thank you for breaking this up so clearly into separate commits. I think it
> is a very interesting idea, and anything to speed up pg_upgrade is always
> welcome. Some minor thoughts:


Thank you for reviewing!

>> .. we don't really expect there to be directories within database
> directories,
>> so perhaps it would be better to either unconditionally rename or to fail.
> Failure seems the best option here, so we can cleanly handle any future
> cases in which we decide to put dirs in this directory.

Good point.

>> if (RelFileNumberIsValid(rfn))
>> {
>>   FileNameMap key;
>>
>>   key.relfilenumber = (RelFileNumber) rfn;
>>   if (bsearch(&key, context->maps, context->size,
>>      sizeof(FileNameMap), FileNameMapCmp))
>>    return 0;
>>  }
>>
>>  snprintf(dst, sizeof(dst), "%s/%s", context->target, filename);
>>  if (rename(fname, dst) != 0)
> 
> I'm not quite clear what we are doing here with falling through
> for InvalidOid entries, could you explain?

The idea is that if it looks like a data file that we might want to
transfer (i.e., it starts with a RelFileNumber), we should consult our map
to determine whether to move it.  Otherwise, we want to unconditionally
transfer it so that we always use the files generated during pg_restore in
the new cluster (e.g., PG_VERSION and pg_filenode.map).  In theory, this
should result in the same end state as what --link mode does today (for the
new cluster, at least).

>> .. vm_must_add_frozenbit isn't handled yet.  We could either disallow
>> using catalog-swap mode if the upgrade involves versions older than v9.6
> 
> Yes, this. No need for more code to handle super old versions when other
> options exist.

I'm inclined to agree.

>> with this problem is to introduce a special mode for "initdb --sync-only"
>> that calls fsync() for everything _except_ the actual data files.  If we
>> fsync() the new catalog files as we move them into place, and if we assume
>> that the old catalog files will have been properly synchronized before
>> upgrading, there's no reason to synchronize them again at the end.
> 
>  Very cool approach!

:)

-- 
nathan

Re: optimize file transfer in pg_upgrade

Reply via email to