On Tue, Aug 12, 2014 at 8:26 PM, Stephen Frost wrote:
> * Claudio Freire (klaussfre...@gmail.com) wrote:
>> I'm not talking about malicious attacks; with big enough data sets,
>> checksum collisions are much more likely to happen than with smaller
>> ones, and incremental backups are supposed to work for the big sets.
Claudio,
* Claudio Freire (klaussfre...@gmail.com) wrote:
> I'm not talking about malicious attacks; with big enough data sets,
> checksum collisions are much more likely to happen than with smaller
> ones, and incremental backups are supposed to work for the big sets.
This is an issue when you'r
On Wed, Aug 13, 2014 at 12:58 AM, Robert Haas wrote:
> On Tue, Aug 12, 2014 at 10:30 AM, Andres Freund
> wrote:
>>> Still not safe. Checksum collisions do happen, especially in big data sets.
>>
>> If you use an appropriate algorithm for appropriate amounts of data
>> that's not a relevant concern.
On Tue, Aug 12, 2014 at 10:30 AM, Andres Freund wrote:
>> Still not safe. Checksum collisions do happen, especially in big data sets.
>
> If you use an appropriate algorithm for appropriate amounts of data
> that's not a relevant concern. You can easily do different checksums for
> every 1GB segment.
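Andres's point can be illustrated with a small sketch: hash each 1 GB segment of a file separately with a strong cryptographic digest, so accidental collisions stop being a practical concern even for very large data sets. This is an editorial illustration, not code from the thread; SHA-256 and the segment size (mirroring PostgreSQL's default relation segment size) are assumptions.

```python
import hashlib

SEGMENT_SIZE = 1 << 30  # 1 GB, matching PostgreSQL's default segment size


def segment_checksums(path, segment_size=SEGMENT_SIZE):
    """Return one SHA-256 digest per segment of the file.

    With a 256-bit cryptographic hash, accidental collisions are not a
    practical concern, no matter how large the data set grows.
    """
    digests = []
    with open(path, "rb") as f:
        while True:
            h = hashlib.sha256()
            remaining = segment_size
            while remaining > 0:
                chunk = f.read(min(remaining, 1 << 20))
                if not chunk:
                    break
                h.update(chunk)
                remaining -= len(chunk)
            if remaining == segment_size:  # nothing read: end of file
                break
            digests.append(h.hexdigest())
    return digests
```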
On Tue, Aug 12, 2014 at 11:17 AM, Gabriele Bartolini
wrote:
>
> 2014-08-12 15:25 GMT+02:00 Claudio Freire :
>> Still not safe. Checksum collisions do happen, especially in big data sets.
>
> Can I ask you what you are currently using for backing up large data
> sets with Postgres?
Currently, a ti
On 2014-08-12 10:25:21 -0300, Claudio Freire wrote:
> On Tue, Aug 12, 2014 at 6:41 AM, Marco Nenciarini
> wrote:
> > To declare two files identical they must have the same size,
> > same mtime and the same *checksum*.
>
> Still not safe. Checksum collisions do happen, especially in big data sets.
If yo
On 12/08/14 15:25, Claudio Freire wrote:
> On Tue, Aug 12, 2014 at 6:41 AM, Marco Nenciarini
> wrote:
>> To declare two files identical they must have the same size,
>> same mtime and the same *checksum*.
>
> Still not safe. Checksum collisions do happen, especially in big data sets.
>
IMHO it is
Hi Claudio,
2014-08-12 15:25 GMT+02:00 Claudio Freire :
> Still not safe. Checksum collisions do happen, especially in big data sets.
Can I ask you what you are currently using for backing up large data
sets with Postgres?
Thanks,
Gabriele
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
On Tue, Aug 12, 2014 at 6:41 AM, Marco Nenciarini
wrote:
> To declare two files identical they must have the same size,
> same mtime and the same *checksum*.
Still not safe. Checksum collisions do happen, especially in big data sets.
As I already stated, timestamps will only be used for early detection of
changed files. To declare two files identical they must have the same size,
same mtime and the same *checksum*.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant
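Marco's rule (size and mtime as a cheap early-out, checksum as the final word) could be sketched roughly as follows. SHA-256 and the comparison order are illustrative assumptions, not the patch's actual implementation:

```python
import hashlib
import os


def files_identical(path_a, path_b, chunk_size=1 << 20):
    """Declare two files identical only if size, mtime AND checksum match.

    Size and mtime act purely as an early-out to detect changed files
    cheaply; a matching checksum is still required before the files are
    treated as identical (illustrative sketch, not the patch's code).
    """
    sa, sb = os.stat(path_a), os.stat(path_b)
    if sa.st_size != sb.st_size or sa.st_mtime_ns != sb.st_mtime_ns:
        return False  # cheap metadata check already proves a change

    def digest(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk_size), b""):
                h.update(block)
        return h.digest()

    return digest(path_a) == digest(path_b)
```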
On Mon, Aug 11, 2014 at 12:27 PM, Robert Haas wrote:
>
>> As Marco says, that can be optimized using filesystem timestamps instead.
>
> The idea of using filesystem timestamps gives me the creeps. Those
> aren't always very granular, and I don't know that (for example) they
> are crash-safe. Doe
On Tue, Aug 5, 2014 at 8:04 PM, Simon Riggs wrote:
> To decide whether we need to re-copy the file, you read the file until
> we find a block with a later LSN. If we read the whole file without
> finding a later LSN then we don't need to re-copy. That means we read
> each file twice, which is slow
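Simon's read-until-newer-LSN check might look roughly like the following. This is a hedged sketch: it assumes 8 kB blocks, little-endian storage, and that the first 8 bytes of each page hold the page LSN (pd_lsn), which matches PostgreSQL's standard page layout but is not code from the thread.

```python
import struct

BLCKSZ = 8192  # assumed PostgreSQL block size


def file_needs_copy(path, backup_start_lsn):
    """Scan a relation file page by page, stopping at the first page
    whose LSN is newer than the reference LSN.

    Returns True as soon as a later LSN is found (file must be
    re-copied); False if the whole file is read without finding one.
    """
    with open(path, "rb") as f:
        while True:
            page = f.read(BLCKSZ)
            if len(page) < 8:
                return False  # end of file: nothing newer was found
            # pd_lsn occupies the first 8 bytes of the page header.
            xlogid, xrecoff = struct.unpack_from("<II", page, 0)
            page_lsn = (xlogid << 32) | xrecoff
            if page_lsn > backup_start_lsn:
                return True
```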
On Thu, Aug 7, 2014 at 6:29 PM, Gabriele Bartolini <
gabriele.bartol...@2ndquadrant.it> wrote:
> Hi Marco,
>
> > With the current full backup procedure they are backed up, so I think
> > that having them backed up with an rsync-like algorithm is what a user
> > would expect for an incremental backup
Hi Marco,
> With the current full backup procedure they are backed up, so I think
> that having them backed up with an rsync-like algorithm is what a user
> would expect for an incremental backup.
Exactly. I think a simple, flexible and robust method for file based
incremental backup is all we ne
On 07/08/14 17:25, Bruce Momjian wrote:
> On Thu, Aug 7, 2014 at 08:35:53PM +0900, Michael Paquier wrote:
>> On Thu, Aug 7, 2014 at 8:11 PM, Fujii Masao wrote:
>>> There are some data which don't have LSN, for example, postgresql.conf.
>>> When such data has been modified since last backup,
On 07/08/14 17:29, Bruce Momjian wrote:
> I am a little worried that many users will not realize this until they
> try it and are disappointed, e.g. "Why is PG writing to my static data
> so often?" --- then we get beaten up about our hint bits and freezing
> behavior. :-(
>
> I am just tryi
On Thu, Aug 7, 2014 at 11:03:40AM +0100, Simon Riggs wrote:
> Well, there is a huge difference between file-level and block-level backup.
>
> Designing, writing and verifying block-level backup to the point that
> it is acceptable is a huge effort. (Plus, I don't think accumulating
> block number
On Thu, Aug 7, 2014 at 08:35:53PM +0900, Michael Paquier wrote:
> On Thu, Aug 7, 2014 at 8:11 PM, Fujii Masao wrote:
> > There are some data which don't have LSN, for example, postgresql.conf.
> > When such data has been modified since last backup, they also need to
> > be included in incremental
On Thu, Aug 7, 2014 at 8:11 PM, Fujii Masao wrote:
> There are some data which don't have LSN, for example, postgresql.conf.
> When such data has been modified since last backup, they also need to
> be included in incremental backup? Probably yes.
Definitely yes. That's also the case for paths l
On Thu, Aug 7, 2014 at 12:20 AM, Bruce Momjian wrote:
> On Wed, Aug 6, 2014 at 06:48:55AM +0100, Simon Riggs wrote:
>> On 6 August 2014 03:16, Bruce Momjian wrote:
>> > On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
>> >> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
>> >
On 6 August 2014 17:27, Bruce Momjian wrote:
> On Wed, Aug 6, 2014 at 01:15:32PM -0300, Claudio Freire wrote:
>> On Wed, Aug 6, 2014 at 12:20 PM, Bruce Momjian wrote:
>> >
>> > Well, for file-level backups we have:
>> >
>> > 1) use file modtime (possibly inaccurate)
>> > 2) use f
On Wed, Aug 6, 2014 at 01:15:32PM -0300, Claudio Freire wrote:
> On Wed, Aug 6, 2014 at 12:20 PM, Bruce Momjian wrote:
> >
> > Well, for file-level backups we have:
> >
> > 1) use file modtime (possibly inaccurate)
> > 2) use file modtime and checksums (heavy read load)
> >
> > Fo
On Wed, Aug 6, 2014 at 12:20 PM, Bruce Momjian wrote:
>
> Well, for file-level backups we have:
>
> 1) use file modtime (possibly inaccurate)
> 2) use file modtime and checksums (heavy read load)
>
> For block-level backups we have:
>
> 3) accumulate block numbers as WAL is
On Wed, Aug 6, 2014 at 06:48:55AM +0100, Simon Riggs wrote:
> On 6 August 2014 03:16, Bruce Momjian wrote:
> > On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
> >> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
> >> >
> >> > On 5 August 2014 22:38, Claudio Freire wrote:
> >
On 6 August 2014 03:16, Bruce Momjian wrote:
> On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
>> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
>> >
>> > On 5 August 2014 22:38, Claudio Freire wrote:
>> > Thinking some more, it seems like this whole store-multiple-LSNs
>
On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
> >
> > On 5 August 2014 22:38, Claudio Freire wrote:
> > Thinking some more, it seems like this whole store-multiple-LSNs
> > thing is too much. We can still do block-level in
On Tue, Aug 5, 2014 at 9:04 PM, Simon Riggs wrote:
> On 5 August 2014 22:38, Claudio Freire wrote:
>
>>> * When we take an incremental backup we need the WAL from the backup
>>> start LSN through to the backup stop LSN. We do not need the WAL
>>> between the last backup stop LSN and the new incre
On Tue, Aug 5, 2014 at 9:17 PM, Michael Paquier
wrote:
> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
>>
>> On 5 August 2014 22:38, Claudio Freire wrote:
>> Thinking some more, it seems like this whole store-multiple-LSNs
>> thing is too much. We can still do block-level incrementals ju
On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
>
> On 5 August 2014 22:38, Claudio Freire wrote:
> Thinking some more, it seems like this whole store-multiple-LSNs
> thing is too much. We can still do block-level incrementals just by
> using a single LSN as the reference point. We'd still
On 5 August 2014 22:38, Claudio Freire wrote:
>> * When we take an incremental backup we need the WAL from the backup
>> start LSN through to the backup stop LSN. We do not need the WAL
>> between the last backup stop LSN and the new incremental start LSN.
>> That is a huge amount of WAL in many
On Tue, Aug 5, 2014 at 3:23 PM, Simon Riggs wrote:
> On 4 August 2014 19:30, Claudio Freire wrote:
>> On Mon, Aug 4, 2014 at 5:15 AM, Gabriele Bartolini
>> wrote:
>>> I really like the proposal of working on a block level incremental
>>> backup feature and the idea of considering LSN. However
On 4 August 2014 19:30, Claudio Freire wrote:
> On Mon, Aug 4, 2014 at 5:15 AM, Gabriele Bartolini
> wrote:
>> I really like the proposal of working on a block level incremental
>> backup feature and the idea of considering LSN. However, I'd suggest
>> to see block level as a second step and a
Hi Claudio,
I think there has been a misunderstanding. I agree with you (and I
think also Marco) that LSN is definitely a component to consider in
this process. We will come up with an alternate proposal which
considers LSNs either today or tomorrow. ;)
Thanks,
Gabriele
--
Gabriele Bartolini -
On Mon, Aug 4, 2014 at 5:15 AM, Gabriele Bartolini
wrote:
> I really like the proposal of working on a block level incremental
> backup feature and the idea of considering LSN. However, I'd suggest
> to see block level as a second step and a goal to keep in mind while
> working on the first ste
Hi guys,
sorry if I jump in the middle of the conversation. I have been
reading with much interest all that's been said above. However, the
goal of this patch is to give users another possibility while
performing backups. Especially when large databases are in use.
I really like the proposal
On Fri, Aug 1, 2014 at 1:43 PM, desmodemone wrote:
>
>
>
> 2014-08-01 18:20 GMT+02:00 Claudio Freire :
>
>> On Fri, Aug 1, 2014 at 12:35 AM, Amit Kapila
>> wrote:
>> >> c) the map is not crash safe by design, because it is needed only for
>> >> incremental backup to track which blocks need to be backed up
2014-08-01 18:20 GMT+02:00 Claudio Freire :
> On Fri, Aug 1, 2014 at 12:35 AM, Amit Kapila
> wrote:
> >> c) the map is not crash safe by design, because it is needed only for
> >> incremental backup to track which blocks need to be backed up, not for
> >> consistency or recovery of the whole cluster,
On Fri, Aug 1, 2014 at 12:35 AM, Amit Kapila wrote:
>> c) the map is not crash safe by design, because it is needed only for
>> incremental backup to track which blocks need to be backed up, not for
>> consistency or recovery of the whole cluster, so it's not a heavy cost for
>> the whole cluster to m
On Thu, Jul 31, 2014 at 1:56 PM, desmodemone wrote:
>
> Hi Amit, thank you for your comments.
> However, about the drawbacks:
> a) It's not clear to me why the method needs checksums enabled. I mean, if
> the bgwriter or another process flushes a dirty buffer, it only has to
> signal in the map that th
On Thu, Jul 31, 2014 at 5:26 AM, desmodemone wrote:
> b) yes, the backends need to update the map, but it's in memory and, as I
> showed, it could be very small if we use chunks of blocks. Even if we do
> not compress the map, I do not think it could be a bottleneck.
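desmodemone's chunked in-memory map might be sketched like this (an editorial illustration; the chunk size and data structure are assumptions). Note, as discussed downthread, that a purely in-memory map is not crash-safe, so after a crash the next backup would have to fall back to a full copy or rebuild the map from WAL:

```python
class BlockChangeMap:
    """In-memory map of modified block ranges, one entry per chunk of blocks.

    Each process that dirties a block marks the chunk containing it, so
    the map stays small even for big relations. Being purely in memory it
    is NOT crash-safe: after a crash the map is simply gone.
    """

    def __init__(self, blocks_per_chunk=64):
        self.blocks_per_chunk = blocks_per_chunk
        self.dirty = {}  # relation id -> set of dirty chunk numbers

    def mark_dirty(self, rel, blockno):
        """Record that a block was modified (called on buffer flush)."""
        self.dirty.setdefault(rel, set()).add(blockno // self.blocks_per_chunk)

    def chunks_to_backup(self, rel):
        """Chunk numbers the next incremental backup must copy."""
        return sorted(self.dirty.get(rel, set()))

    def reset(self):
        """Called after a successful incremental backup."""
        self.dirty.clear()
```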
If it's in memory, it's not crash-safe. For somethi
On Thu, Jul 31, 2014 at 2:00 AM, Amit Kapila wrote:
> On Wed, Jul 30, 2014 at 11:32 PM, Robert Haas wrote:
>> IMV, the way to eventually make this efficient is to have a background
>> process that reads the WAL and figures out which data blocks have been
>> modified, and tracks that someplace.
>
On Thu, Jul 31, 2014 at 11:30:52AM +0530, Amit Kapila wrote:
> On Wed, Jul 30, 2014 at 11:32 PM, Robert Haas wrote:
> >
> > IMV, the way to eventually make this efficient is to have a background
> > process that reads the WAL and figures out which data blocks have been
> > modified, and tracks tha
2014-07-31 8:26 GMT+02:00 Amit Kapila :
> On Wed, Jul 30, 2014 at 7:00 PM, desmodemone
> wrote:
> > Hello,
> > I think an incremental/differential backup method is very useful;
> > however, the method has two drawbacks:
> > 1) In a database normally, even if the percent of modi
On Thu, Jul 31, 2014 at 3:00 PM, Amit Kapila wrote:
> One more thing, what will happen for unlogged tables with such a
> mechanism?
I imagine that you can safely bypass them as they are not accessible
during recovery and will start with empty relation files once recovery
ends. The same applies to
On Wed, Jul 30, 2014 at 7:00 PM, desmodemone wrote:
> Hello,
> I think an incremental/differential backup method is very useful;
> however, the method has two drawbacks:
> 1) In a database normally, even if the percentage of modified rows is small
> compared to total rows, the probabilit
On Wed, Jul 30, 2014 at 11:32 PM, Robert Haas wrote:
>
> IMV, the way to eventually make this efficient is to have a background
> process that reads the WAL and figures out which data blocks have been
> modified, and tracks that someplace.
Nice idea; however, I think that to make this happen we need to
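Robert's background-process idea can be approximated in userland by parsing the output of a WAL-dumping tool such as pg_xlogdump and collecting the block references it prints. The `blkref ... rel X/Y/Z ... blk N` line format assumed here is an approximation of that tool's output and may differ across versions:

```python
import re

# Assumed shape of a block-reference line in pg_xlogdump output, e.g.
#   blkref #0: rel 1663/16384/16385 fork main blk 2
BLKREF = re.compile(r"rel (\d+/\d+/\d+)(?: fork \w+)? blk (\d+)")


def modified_blocks(waldump_output):
    """Return the set of (relation, block) pairs referenced by the WAL
    records in the given dump output -- the candidate blocks an
    incremental backup would need to copy."""
    return {
        (m.group(1), int(m.group(2)))
        for m in BLKREF.finditer(waldump_output)
    }
```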
On Tue, Jul 29, 2014 at 12:35 PM, Marco Nenciarini
wrote:
>> I agree with much of that. However, I'd question whether we can
>> really seriously expect to rely on file modification times for
>> critical data-integrity operations. I wouldn't like it if somebody
>> ran ntpdate to fix the time whil
2014-07-29 18:35 GMT+02:00 Marco Nenciarini :
> On 25/07/14 20:44, Robert Haas wrote:
> > On Fri, Jul 25, 2014 at 2:21 PM, Claudio Freire
> wrote:
> >> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
> >> wrote:
> >>> 1. Proposal
> >>> =
> >>> Our proposal
On Wed, Jul 30, 2014 at 1:11 AM, Marco Nenciarini
wrote:
> "differential backup" is widely used to refer to a backup that is always
> based on a "full backup". An "incremental backup" can be based either on
> a "full backup" or on a previous "incremental backup". We picked that
> name to emphasize
On 25/07/14 20:44, Robert Haas wrote:
> On Fri, Jul 25, 2014 at 2:21 PM, Claudio Freire
> wrote:
>> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
>> wrote:
>>> 1. Proposal
>>> =
>>> Our proposal is to introduce the concept of a backup profile. The backup
On Tue, Jul 29, 2014 at 1:24 PM, Marco Nenciarini
wrote:
>> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
>> wrote:
>>> 1. Proposal
>>> =
>>> Our proposal is to introduce the concept of a backup profile. The backup
>>> profile consists of a file with one line
On 25/07/14 20:21, Claudio Freire wrote:
> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
> wrote:
>> 1. Proposal
>> =
>> Our proposal is to introduce the concept of a backup profile. The backup
>> profile consists of a file with one line per file detailing
On 25/07/14 16:15, Michael Paquier wrote:
> On Fri, Jul 25, 2014 at 10:14 PM, Marco Nenciarini
> wrote:
>> 0. Introduction:
>> =
>> This is a proposal for adding incremental backup support to streaming
>> protocol and hence to pg_basebackup command.
> Not sure
On Fri, Jul 25, 2014 at 7:38 PM, Josh Berkus wrote:
> On 07/25/2014 11:49 AM, Claudio Freire wrote:
>>> I agree with much of that. However, I'd question whether we can
>>> > really seriously expect to rely on file modification times for
>>> > critical data-integrity operations. I wouldn't like i
On 07/25/2014 11:49 AM, Claudio Freire wrote:
>> I agree with much of that. However, I'd question whether we can
>> > really seriously expect to rely on file modification times for
>> > critical data-integrity operations. I wouldn't like it if somebody
>> > ran ntpdate to fix the time while the b
On Fri, Jul 25, 2014 at 3:44 PM, Robert Haas wrote:
> On Fri, Jul 25, 2014 at 2:21 PM, Claudio Freire
> wrote:
>> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
>> wrote:
>>> 1. Proposal
>>> =
>>> Our proposal is to introduce the concept of a backup profile.
On Fri, Jul 25, 2014 at 2:21 PM, Claudio Freire wrote:
> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
> wrote:
>> 1. Proposal
>> =
>> Our proposal is to introduce the concept of a backup profile. The backup
>> profile consists of a file with one line per file
On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
wrote:
> 1. Proposal
> =
> Our proposal is to introduce the concept of a backup profile. The backup
> profile consists of a file with one line per file detailing tablespace,
> path, modification time, size and check
On Fri, Jul 25, 2014 at 10:14 PM, Marco Nenciarini
wrote:
> 0. Introduction:
> =
> This is a proposal for adding incremental backup support to streaming
> protocol and hence to pg_basebackup command.
Not sure that incremental is the right word as the existing backup
m
0. Introduction:
=========
This is a proposal for adding incremental backup support to streaming
protocol and hence to pg_basebackup command.
1. Proposal
=========
Our proposal is to introduce the concept of a backup profile. The backup
profile consi
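A single line of the proposed backup profile might be produced like this. The tab separator, field order, and SHA-256 digest are purely illustrative assumptions; the proposal only states that each line carries tablespace, path, modification time, size and checksum:

```python
import hashlib
import os


def profile_line(tablespace, path):
    """Build one backup-profile line: tablespace, path, mtime, size and
    checksum. Field order and separator are illustrative assumptions."""
    st = os.stat(path)
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    fields = [tablespace, path, str(st.st_mtime), str(st.st_size), h.hexdigest()]
    return "\t".join(fields)
```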