On Mon, Nov 12, 2012 at 10:29:39AM -0800, Jeff Janes wrote: > On Mon, Nov 5, 2012 at 12:08 PM, Bruce Momjian <br...@momjian.us> wrote: > > Magnus reported that a customer with a million tables was finding > > pg_upgrade slow. I had never considered many table to be a problem, but > > decided to test it. I created a database with 2k tables like this: > > > > CREATE TABLE test1990 (x SERIAL); > > > > Running the git version of pg_upgrade on that took 203 seconds. Using > > synchronous_commit=off dropped the time to 78 seconds. This was tested > > on magnetic disks with a write-through cache. (No change on an SSD with > > a super-capacitor.) > > > > I don't see anything unsafe about having pg_upgrade use > > synchronous_commit=off. I could set it just for the pg_dump reload, but > > it seems safe to just use it always. We don't write to the old cluster, > > and if pg_upgrade fails, you have to re-initdb the new cluster anyway. > > > > Patch attached. I think it should be applied to 9.2 as well. > > Is turning off synchronous_commit enough? What about turning off fsync?
I did some testing with the attached patch on a magnetic disk with no BBU that turns off fsync; I got these results: sync_com=off fsync=off 1 15.90 13.51 1000 26.09 24.56 2000 33.41 31.20 4000 57.39 57.74 8000 102.84 116.28 16000 189.43 207.84 It shows fsync faster for < 4k, and slower for > 4k. Not sure why this is the cause but perhaps the buffering of the fsync is actually faster than doing a no-op fsync. I don't think fsync=off makes sense, except for testing; let me know if I should test something else. > When I'm doing a pg_upgrade with thousands of tables, the shutdown > checkpoint after restoring the dump to the new cluster takes a very > long time, as the writer drains its operation table by opening and > individually fsync-ing thousands of files. This takes about 40 ms per > file, which I assume is a combination of slow lap-top disk drive, and > a strange deal with ext4 which makes fsyncing a recently created file > very slow. But even with faster hdd, this would still be a problem > if it works the same way, with every file needing 4 rotations to be > fsynced and this happens in serial. Is this with the current code that does synchronous_commit=off? If not, can you test to see if this is still a problem? > Worse, the shutdown only waits for the default of 60 seconds for the > shutdown to take place before it throws an error and the entire > pg_upgrade gives up. It seems to me that either the -t setting should > be increased, or should be an option to pg_upgrade. > > My work around was to invoke a system-wide "sync" a couple seconds > after the 'pg_ctl stop' is initiated. Flushing the files wholesale > seems to work to make the checkpoint writer rapidly find it has > nothing to do when it tries to flush them retail. > > Anyway, the reason I think turning fsync off might be reasonable is > that as soon as the new cluster is shut down, pg_upgrade starts > overwriting most of those just-fsynced file with other files from the > old cluster, and AFAICT makes no effort to fsync them. So until there > is a system-wide sync after the pg_upgrade finishes, your new cluster > is already in mortal danger anyway. pg_upgrade does a cluster shutdown before overwriting those files. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/server.c b/contrib/pg_upgrade/server.c new file mode 100644 index 49d4c8f..01e0dd3 *** a/contrib/pg_upgrade/server.c --- b/contrib/pg_upgrade/server.c *************** start_postmaster(ClusterInfo *cluster) *** 219,225 **** (cluster->controldata.cat_ver >= BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" : " -c autovacuum=off -c autovacuum_freeze_max_age=2000000000", ! (cluster == &new_cluster) ? " -c synchronous_commit=off" : "", cluster->pgopts ? cluster->pgopts : "", socket_string); /* --- 219,225 ---- (cluster->controldata.cat_ver >= BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" : " -c autovacuum=off -c autovacuum_freeze_max_age=2000000000", ! (cluster == &new_cluster) ? " -c fsync=off" : "", cluster->pgopts ? cluster->pgopts : "", socket_string); /*
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers