Bruce, Stephen, etc.:

So, I did a test trial of this and it seems like it didn't solve the issue of huge rsyncs.
That is, the only reason to do this whole business via rsync, instead of doing a new basebackup of each replica, is to cut down on data transfer time by not resyncing the data from the old base directory. But in practice, the majority of the database files seem to get transmitted anyway. Maybe I'm misreading the rsync output?

Here's the setup:

3 Ubuntu 14.04 servers on AWS (tiny instance)
Running PostgreSQL 9.3.5
Set up in cascading replication

    108 --> 107 --> 109

The goal was to test this with cascading, but I didn't get that far.

I set up a pgbench workload, read-write on the master and read-only on the two replicas, to simulate a load-balanced workload. I was *not* logging hint bits.

I then followed this sequence:

1) Install 9.4 packages on all servers.
2) Shut down the master.
3) pg_upgrade the master using --link
4) Shut down replica 107
5) rsync the master's $PGDATA from the replica:

    rsync -aHv --size-only -e ssh --itemize-changes 172.31.4.108:/var/lib/postgresql/ /var/lib/postgresql/

... and got:

    .d..t...... 9.4/main/pg_xlog/
    >f+++++++++ 9.4/main/pg_xlog/0000000700000001000000CB
    .d..t...... 9.4/main/pg_xlog/archive_status/

    sent 126892 bytes  received 408645000 bytes  7640596.11 bytes/sec
    total size is 671135675  speedup is 1.64

So that's 390MB of data transfer. If I look at the original directory:

    postgres@paul: du --max-depth=1 -h
    4.0K    ./.cache
    20K     ./.ssh
    424M    ./9.3
    4.0K    ./.emacs.d
    51M     ./9.4
    56K     ./bench
    474M    .

So 390MB were transferred out of a possible 474MB. That certainly seems like we're still transferring the majority of the data, even though I verified that the hard links are being sent as hard links. No?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
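
For anyone who wants to double-check the arithmetic behind the figures above, here is a minimal sketch in Python. It only re-derives numbers from the quoted rsync summary and du output. The variable names are made up for illustration, and it assumes that `du -h` reports binary units (MiB) and that rsync's "speedup" is the total size divided by the bytes sent plus received, which is how the rsync documentation describes it.

```python
# Figures copied from the rsync summary and the du output quoted in the post.
sent_bytes = 126_892
received_bytes = 408_645_000
total_size_bytes = 671_135_675
du_total_mib = 474  # "474M ." from du -h (assumed to be binary units)

MIB = 1024 * 1024

# Convert the received byte count to MiB so it is comparable with du -h.
received_mib = received_bytes / MIB

# rsync's reported speedup: total file size over bytes actually transferred.
speedup = total_size_bytes / (sent_bytes + received_bytes)

# Share of the on-disk data (per du) that crossed the wire.
fraction_of_du = received_mib / du_total_mib

print(f"received: {received_mib:.1f} MiB")
print(f"speedup: {speedup:.2f}")
print(f"share of du total: {fraction_of_du:.0%}")
```

The recomputed speedup agrees with the 1.64 that rsync printed, and the received data comes to a bit over four fifths of the 474M that du reports, so the "390MB out of 474MB" reading of the output holds up.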