Hi Anastasia, On Wed, Jul 10, 2019 at 11:47 PM Anastasia Lubennikova < a.lubennik...@postgrespro.ru> wrote:
> 23.04.2019 14:08, Anastasia Lubennikova wrote: > > I'm volunteering to write a draft patch or, more likely, set of > > patches, which > > will allow us to discuss the subject in more detail. > > And to do that I wish we agree on the API and data format (at least > > broadly). > > Looking forward to hearing your thoughts. > > Though the previous discussion stalled, > I still hope that we could agree on basic points such as a map file > format and protocol extension, > which is necessary to start implementing the feature. > It's great that you too come up with the PoC patch. I didn't look at your changes in much details but we at EnterpriseDB too working on this feature and started implementing it. Attached series of patches I had so far... (which needed further optimization and adjustments though) Here is the overall design (as proposed by Robert) we are trying to implement: 1. Extend the BASE_BACKUP command that can be used with replication connections. Add a new [ LSN 'lsn' ] option. 2. Extend pg_basebackup with a new --lsn=LSN option that causes it to send the option added to the server in #1. Here are the implementation details when we have a valid LSN sendFile() in basebackup.c is the function which mostly does the thing for us. If the filename looks like a relation file, then we'll need to consider sending only a partial file. The way to do that is probably: A. Read the whole file into memory. B. Check the LSN of each block. Build a bitmap indicating which blocks have an LSN greater than or equal to the threshold LSN. C. If more than 90% of the bits in the bitmap are set, send the whole file just as if this were a full backup. This 90% is a constant now; we might make it a GUC later. D. Otherwise, send a file with .partial added to the name. The .partial file contains an indication of which blocks were changed at the beginning, followed by the data blocks. It also includes a checksum/CRC. Currently, a .partial file format looks like: - start with a 4-byte magic number - then store a 4-byte CRC covering the header - then a 4-byte count of the number of blocks included in the file - then the block numbers, each as a 4-byte quantity - then the data blocks We are also working on combining these incremental back-ups with the full backup and for that, we are planning to add a new utility called pg_combinebackup. Will post the details on that later once we have on the same page for taking backup. Thanks -- Jeevan Chalke Technical Architect, Product Development EnterpriseDB Corporation
IncrBackup.tar.gz
Description: application/gzip