On Thu, Dec 5, 2019 at 9:48 AM Craig Jackson <craig.jack...@broadcom.com>
wrote:

> Hi,
>
> We are in the process of migrating an oracle database to postgres in
> Google Cloud and are investigating backup/recovery tools. The database
> size is > 20TB. We have an SLA that requires us to be able to complete a
> full restore of the database within 24 hours. We have been testing
> pgbackrest, barman, and GCP snapshots but wanted to see if there are any
> other recommendations we should consider.
>
> *Desirable features*
> - Parallel backup/recovery
> - Incremental backups
> - Backup directly to a GCP bucket
> - Deduplication/Compression
>

For your 24-hour-restore requirement, there's an additional feature you
might consider: incremental restore, or what you might call "recovery in
place"; that is, the ability to keep a more-or-less up-to-date copy, and
then in an emergency only restore the diffs on the file system. pgbackrest
uses a built-in rsync-like mechanism, plus a client-server architecture,
that allows it to quickly determine which disk blocks need to be updated.
Checksums are computed on each side, and data are only transferred if
checksums differ. It's very efficient. I assume that a 20 TB database is
mostly static, with only a small fraction of the data updated in any month.
I believe the checksums are precomputed and stored in the pgbackrest
repository, so you can even do this from an Amazon S3 (or Google Cloud's
low-cost storage equivalent) backup with just modest bandwidth
usage.
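
To make the idea concrete, here's a toy sketch (Python) of that rsync-like
step: compute per-block checksums on both sides and rewrite only the blocks
that differ. This is purely illustrative; the function names and block size
are mine, and pgbackrest's actual delta restore works against its own
repository manifest and checksums, not anything resembling this code.

# Toy illustration of checksum-based delta restore: rewrite only the
# blocks of the live file whose checksums differ from the backup copy.
import hashlib
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB blocks, arbitrary for this sketch


def block_checksums(path, block_size=BLOCK_SIZE):
    """Return one SHA-1 digest per block of the file."""
    sums = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            sums.append(hashlib.sha1(block).hexdigest())
    return sums


def delta_restore_file(backup_path, live_path, block_size=BLOCK_SIZE):
    """Copy only the blocks of backup_path whose checksums differ from live_path."""
    backup_sums = block_checksums(backup_path, block_size)
    live_sums = block_checksums(live_path, block_size) if os.path.exists(live_path) else []
    copied = 0
    mode = "r+b" if live_sums else "wb"
    with open(backup_path, "rb") as src, open(live_path, mode) as dst:
        for i, checksum in enumerate(backup_sums):
            if i >= len(live_sums) or live_sums[i] != checksum:
                src.seek(i * block_size)
                dst.seek(i * block_size)
                dst.write(src.read(block_size))
                copied += 1
    # Trim anything past the backup's length (e.g. the live file grew).
    with open(live_path, "r+b") as dst:
        dst.truncate(os.path.getsize(backup_path))
    return copied

In pgbackrest itself this is roughly what the --delta option to the restore
command gives you: files already present in the data directory are
checksummed, and only those that differ are pulled from the repository.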

In a cloud environment, you can keep this running on modestly priced
hardware (a few CPUs, modest memory). In the event of a failover, unmount
your backup disk, spin up a big server, mount the disk there, do the
incremental restore, and you're in business.
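
Very roughly, that failover sequence could be scripted along these lines;
the disk device, data directory, and stanza name below are placeholders
I've made up, not anything specific to your environment.

# Hypothetical failover runbook for the sequence described above.
import subprocess

PGDATA = "/var/lib/postgresql/12/main"   # assumed data directory
DISK = "/dev/sdb"                        # assumed attached data disk
STANZA = "main"                          # assumed pgbackrest stanza name


def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


def failover_restore():
    # 1. Mount the data disk on the newly provisioned (big) server.
    run(["mount", DISK, "/var/lib/postgresql"])
    # 2. Delta restore: only files whose checksums differ are rewritten.
    run(["pgbackrest", "--stanza=" + STANZA, "--delta", "restore"])
    # 3. Start PostgreSQL; it replays WAL to reach a consistent state.
    run(["pg_ctl", "-D", PGDATA, "start"])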

Craig (James)


> Any suggestions would be appreciated.
>
> Craig Jackson
>
