On 06/10/14 17:50, Robert Haas wrote:
> On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini
> <marco.nenciar...@2ndquadrant.it> wrote:
>>> 2. Take a differential backup. In the backup label file, note the LSN
>>> of the full backup to which the differential backup is relative, and the
>>> newest LSN guaranteed to be present in the differential backup. The
>>> actual backup can consist of a series of 20-byte buffer tags, those
>>> being the exact set of blocks newer than the base-backup's
>>> latest-guaranteed-to-be-present LSN. Each buffer tag is followed by
>>> an 8kB block of data. If a relfilenode is truncated or removed, you
>>> need some way to indicate that in the backup; e.g. include a buffertag
>>> with forknum = -(forknum + 1) and blocknum = the new number of blocks,
>>> or InvalidBlockNumber if removed entirely.
>>
>> To have a working backup you need to ship each block which is newer than
>> latest-guaranteed-to-be-present in full backup and not newer than
>> latest-guaranteed-to-be-present in the current backup. Also, as a
>> further optimization, you can think about not sending the empty space in
>> the middle of each page.
>
> Right. Or compressing the data.

If we want to introduce compression on the server side, I think that
compressing the whole tar stream would be more effective.

>
>> My main concern here is about how postgres can remember that a
>> relfilenode has been deleted, in order to send the appropriate "deletion
>> tag".
>
> You also need to handle truncation.

Yes, of course. The current backup profile contains the file size, and
it can be used to truncate the file to the right size.

>> IMHO the easiest way is to send the full list of files along with the
>> backup and leave to the client the task of deleting unneeded files. The
>> backup profile has this purpose.
>>
>> Moreover, I do not like the idea of using only a stream of blocks as the
>> actual differential backup, for the following reasons:
>>
>> * AFAIK, with the current infrastructure, you cannot do a backup with a
>> block stream only. To have a valid backup you need many files for which
>> the concept of LSN doesn't apply.
>>
>> * I don't like to have all the data from the various
>> tablespace/db/whatever all mixed in the same stream. I'd prefer to have
>> the blocks saved on a per file basis.
>
> OK, that makes sense. But you still only need the file list when
> sending a differential backup, not when sending a full backup. So
> maybe a differential backup looks like this:
>
> - Ship a table-of-contents file with a list of relation files currently
> present and the length of each in blocks.

Having the size in bytes allows you to use the same format for non-block
files. Am I missing any advantage of having the size in blocks over
having the size in bytes?

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
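
To make the profile-based cleanup discussed above a bit more concrete, here is a minimal Python sketch, not the actual patch. It assumes the backup profile can be reduced to a mapping from relative path to size in bytes; the function name apply_profile and that dictionary layout are hypothetical, chosen only for illustration. Files missing from the profile are deleted, and files longer than the recorded size are truncated. Because the size is in bytes, the same rule covers non-block files as well.

```python
import os
from typing import Dict, List, Tuple


def apply_profile(root: str, profile: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Make the tree under `root` consistent with `profile`.

    `profile` maps a relative path (with '/' separators) to the file size
    in bytes. This layout is an assumption made for the sketch only.

    Returns (removed, truncated): sorted lists of relative paths.
    """
    removed: List[str] = []
    truncated: List[str] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            if rel not in profile:
                # Not listed in the profile: the file was dropped on the
                # server after the previous backup, so remove it here too.
                os.remove(full)
                removed.append(rel)
            elif os.path.getsize(full) > profile[rel]:
                # Listed, but longer than recorded: the file was truncated
                # on the server, so cut it down to the recorded size.
                os.truncate(full, profile[rel])
                truncated.append(rel)
    return sorted(removed), sorted(truncated)
```

A client would call something like apply_profile(restore_dir, profile) after laying the differential backup over a copy of the full one. Files that are shorter than the recorded size are deliberately left alone here, since in that case the missing blocks have to come from the backup data itself.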