Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup

Marco Nenciarini Mon, 06 Oct 2014 09:19:49 -0700

Il 06/10/14 17:50, Robert Haas ha scritto:
> On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini
> <[email protected]> wrote:
>>> 2. Take a differential backup.  In the backup label file, note the LSN
>>> of the fullback to which the differential backup is relative, and the
>>> newest LSN guaranteed to be present in the differential backup.  The
>>> actual backup can consist of a series of 20-byte buffer tags, those
>>> being the exact set of blocks newer than the base-backup's
>>> latest-guaranteed-to-be-present LSN.  Each buffer tag is followed by
>>> an 8kB block of data.  If a relfilenode is truncated or removed, you
>>> need some way to indicate that in the backup; e.g. include a buffertag
>>> with forknum = -(forknum + 1) and blocknum = the new number of blocks,
>>> or InvalidBlockNumber if removed entirely.
>>
>> To have a working backup you need to ship each block which is newer than
>> latest-guaranteed-to-be-present in full backup and not newer than
>> latest-guaranteed-to-be-present in the current backup. Also, as a
>> further optimization, you can think about not sending the empty space in
>> the middle of each page.
> 
> Right.  Or compressing the data.


If we want to introduce compression on server side, I think that
compressing the whole tar stream would be more effective.

> 
>> My main concern here is about how postgres can remember that a
>> relfilenode has been deleted, in order to send the appropriate "deletion
>> tag".
> 
> You also need to handle truncation.

Yes, of course. The current backup profile contains the file size, and
it can be used to truncate the file to the right size.

>> IMHO the easiest way is to send the full list of files along the backup
>> and let to the client the task to delete unneeded files. The backup
>> profile has this purpose.
>>
>> Moreover, I do not like the idea of using only a stream of block as the
>> actual differential backup, for the following reasons:
>>
>> * AFAIK, with the current infrastructure, you cannot do a backup with a
>> block stream only. To have a valid backup you need many files for which
>> the concept of LSN doesn't apply.
>>
>> * I don't like to have all the data from the various
>> tablespace/db/whatever all mixed in the same stream. I'd prefer to have
>> the blocks saved on a per file basis.
> 
> OK, that makes sense.  But you still only need the file list when
> sending a differential backup, not when sending a full backup.  So
> maybe a differential backup looks like this:
> 
> - Ship a table-of-contents file with a list relation files currently
> present and the length of each in blocks.

Having the size in bytes allow you to use the same format for non-block
files. Am I missing any advantage of having the size in blocks over
having the size in bytes?

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
[email protected] | www.2ndQuadrant.it

signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup

Reply via email to