On Sun, Jan 1, 2012 at 23:09, Daniel Farina <dan...@heroku.com> wrote:
> On Sun, Jan 1, 2012 at 6:13 AM, Magnus Hagander <mag...@hagander.net> wrote:
>> It also doesn't affect backups taken through pg_basebackup - but I
>> guess you have good reasons for not being able to use that?
>
> Parallel archiving/de-archiving and segmentation of the backup into
> pieces and rate limiting are the most clear gaps.  I don't know if
> there are performance implications either, but I do pass all my bytes
> through unoptimized Python right now -- not exactly a speed demon.
>
> The approach I use is:
>
> * Scan the directory tree immediately after pg_start_backup, taking
>   notes of existent files and sizes
> * Split those files into volumes, none of which can exceed 1.5GB.
>   These volumes are all disjoint
> * When creating the tar file, set the header for a tar member to have
>   as many bytes as recorded in the first pass.  If the file has been
>   truncated, pad with zeros (this is also the behavior of GNU Tar).  If
>   it grew, only read the number of bytes recorded.
> * Generate and compress these tar files in parallel
> * All the while, the rate of reading files is subject to optional rate
>   limiting
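
For readers following along, the first three steps quoted above can be sketched roughly as follows. This is only an illustration, not Daniel's actual code: the names `scan`, `split_into_volumes`, `FixedSizeReader` and `write_volume` are made up here, gzip is assumed as the compressor, and 1.5GB is assumed to mean 1.5 * 1024^3 bytes. Parallel generation and rate limiting are left out.

```python
import os
import tarfile

# Assumption: "1.5GB" is read as 1.5 * 1024**3 bytes.
VOLUME_LIMIT = 1536 * 1024 * 1024


def scan(root):
    """First pass: record (path, size) for every file under root."""
    manifest = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                manifest.append((path, os.path.getsize(path)))
            except FileNotFoundError:
                continue  # file vanished between listing and stat
    return manifest


def split_into_volumes(manifest, limit=VOLUME_LIMIT):
    """Greedily pack files into disjoint volumes of at most `limit` bytes.

    A single file larger than `limit` would still get a volume of its own;
    the sketch relies on the data files staying below the limit.
    """
    volumes, current, used = [], [], 0
    for path, size in manifest:
        if current and used + size > limit:
            volumes.append(current)
            current, used = [], 0
        current.append((path, size))
        used += size
    if current:
        volumes.append(current)
    return volumes


class FixedSizeReader:
    """Serve exactly `size` bytes from a binary file object.

    A file that shrank since the scan is padded with zero bytes; a file
    that grew is cut off at the recorded size.
    """

    def __init__(self, fileobj, size):
        self._f = fileobj
        self._remaining = size

    def read(self, n=-1):
        if n is None or n < 0 or n > self._remaining:
            n = self._remaining
        data = self._f.read(n)
        data += b"\0" * (n - len(data))  # zero-pad on early EOF
        self._remaining -= n
        return data


def write_volume(members, out_path, root):
    """Write one gzip-compressed tar volume using the recorded sizes."""
    with tarfile.open(out_path, "w:gz") as tar:
        for path, size in members:
            info = tarfile.TarInfo(name=os.path.relpath(path, root))
            info.size = size  # header carries the size from the first pass
            with open(path, "rb") as f:
                tar.addfile(info, FixedSizeReader(f, size))
```

Because the volumes share no files, each one could be handed to a separate worker (for example via `concurrent.futures`) without coordination, which appears to be what makes both the parallel archiving and the parallel restore possible.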

Well, that certainly goes into enough detail to agree that no, that
can't be done with only minor modifications to pg_basebackup. Nor
could it be done by having your Python program talk directly to the
walsender backend to get around it that way. But you probably already
considered that :D

> As important is the fact that each volume can be downloaded and
> decompressed in a pipeline (no on-disk transformations to de-archive)
> with a tunable amount of concurrency, as all the tar files do not
> overlap for any file, and no file needs to span two tar files thanks
> to Postgres's refusal to deal in files too large for old platforms.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/