On 1/26/20 10:44 PM, Andreas Joseph Krogh wrote:
På mandag 27. januar 2020 kl. 03:26:59, skrev Ron <ronljohnso...@gmail.com
<mailto:ronljohnso...@gmail.com>>:
[..]
I ran *uncompressed* pg_dump on multiple TB+ sized databases from v8.4
servers across the LAN using 9.6 binaries on the remote server. It
was quite fast. Threading was key.
According to the manual: https://www.postgresql.org/docs/12/app-pgdump.html
the "directory format" is the only format which supports parallel dumps,
if I'm not reading it wrong.
How did threading solve "between database" dump/restore for you? Did you
dump to "directory format" first, then restore?
Yes.
If so, then that requires quite a bit of temp-space...
Correct. The databases are mostly compressed TIFF and PDF images in bytea
fields, so having Postgres try and compress them again was *slow* and used a
*lot* of CPU. Thus, I did uncompressed backups, and that took a *lot* of
scratch disk space.
(We were not only upgrading Postgres 8.4 to 9.6, but also RHEL 5.10 to 6.10,
and moving to a geographically distant data center. Thus, I deemed
pg_upgrade to be impractical.)
We spun up some VMs with 10 total TB in the same DC as the source (physical)
servers, and I installed Pg 9.6 on these "intermediate servers", and did
remote pg_dumps of the 8.4 servers. Then I installed 9.6 on the VMs in the
new DC, and NFS mounted the intermediate servers' volumes and ran
multi-threaded pg_restore on the new servers. They pulled the data across
the WAN.
--
Angular momentum makes the world go 'round.