On 1/26/20 10:44 PM, Andreas Joseph Krogh wrote:
On Monday, 27 January 2020 at 03:26:59, Ron <ronljohnso...@gmail.com> wrote:

    [..]
    I ran *uncompressed* pg_dump on multiple TB+ sized databases from v8.4
    servers across the LAN using 9.6 binaries on the remote server.  It
    was quite fast.  Threading was key.

According to the manual (https://www.postgresql.org/docs/12/app-pgdump.html), the "directory" format is the only format that supports parallel dumps, if I'm not reading it wrong. How did threading solve the "between databases" dump/restore for you? Did you dump to directory format first, then restore?

Yes.

If so, then that requires quite a bit of temp-space...

Correct. The databases are mostly compressed TIFF and PDF images in bytea fields, so having Postgres try to compress them again was *slow* and used a *lot* of CPU.  Thus, I did uncompressed backups, and that took a *lot* of scratch disk space.
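
In sketch form, the dump side looked something like this (hostnames, paths, and job counts here are placeholders, not the real ones; note also that a 9.6 pg_dump running in parallel against a pre-9.2 server needs --no-synchronized-snapshots):

    # Parallel, uncompressed, directory-format dump of the remote 8.4 server,
    # run with the 9.6 binaries on an intermediate server in the source DC.
    pg_dump -h old-db.example.com -U postgres \
            -Fd -j 8 -Z 0 \
            --no-synchronized-snapshots \
            -f /scratch/mydb.dir \
            mydb

-Fd -j parallelizes the dump roughly one table per worker, and -Z 0 skips the per-file gzip that the directory format applies by default, which is what avoided re-compressing the already-compressed TIFF/PDF data.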

(We were not only upgrading Postgres 8.4 to 9.6, but also RHEL 5.10 to 6.10, and moving to a geographically distant data center.  Thus, I deemed pg_upgrade to be impractical.)

We spun up some VMs with 10 TB total in the same DC as the source (physical) servers, installed Pg 9.6 on these "intermediate servers", and did remote pg_dumps of the 8.4 servers.  Then I installed 9.6 on the VMs in the new DC, NFS-mounted the intermediate servers' volumes, and ran multi-threaded pg_restore on the new servers.  They pulled the data across the WAN.
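
The restore side, again in sketch form with placeholder names:

    # Parallel restore on a new 9.6 server, reading the directory-format
    # dump from the NFS-mounted intermediate volume across the WAN.
    createdb -U postgres mydb
    pg_restore -U postgres -d mydb -j 8 /mnt/intermediate/mydb.dir

-j on pg_restore gives the same per-table parallelism on the load side, which is the other half of why threading was key.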

--
Angular momentum makes the world go 'round.
