On Tue, Aug 08, 2023 at 01:06:06PM -0700, Nathan Bossart wrote: > I ran a couple of tests for pg_upgrade with 100k tables (created using the > script here [0]) in order to demonstrate the potential benefits of this > patch.
That shows some nice numbers with many files, indeed. How does the size of each file influence the difference in time? + else + { + while (errno = 0, (de = readdir(dir)) != NULL) + { + char subpath[MAXPGPATH * 2]; + + if (strcmp(de->d_name, ".") == 0 || + strcmp(de->d_name, "..") == 0) + continue; It seems to me that there is no need to do that for in-place tablespaces. There are relative paths in pg_tblspc/, so they would be taken care of by the syncfs() done on the main data folder. This does not really check if the mount points of each tablespace is different, as well. For example, imagine that you have two tablespaces within the same disk, syncfs() twice. Perhaps, the current logic is OK anyway as long as the behavior is optional, but it should be explained in the docs, at least. I'm finding a bit confusing that fsync_pgdata() is coded in such a way that it does a silent fallback to the cascading syncs through walkdir() when syncfs is specified but not available in the build. Perhaps an error is more helpful because one would then know that they are trying something that's not around? + pg_log_error("could not synchronize file system for file \"%s\": %m", path); + (void) close(fd); + exit(EXIT_FAILURE); walkdir() reports errors and does not consider these fatal. Why the early exit()? I am a bit concerned about the amount of duplication this patch introduces in the docs. Perhaps this had better be moved into a new section of the docs to explain the tradeoffs, with each tool linking to it? Do we actually need --no-sync at all if --sync-method is around? We could have an extra --sync-method=none at option level with --no-sync still around mainly for compatibility? Or perhaps that's just over-designing things? -- Michael
signature.asc
Description: PGP signature