On Tue, Aug 12, 2014 at 8:26 PM, Stephen Frost wrote:
> * Claudio Freire (klaussfre...@gmail.com) wrote:
>> I'm not talking about malicious attacks; with big enough data sets,
>> checksum collisions are much more likely to happen than with smaller
>> ones, and incremental backups are supposed to work for the big sets.
Claudio,
* Claudio Freire (klaussfre...@gmail.com) wrote:
> I'm not talking about malicious attacks; with big enough data sets,
> checksum collisions are much more likely to happen than with smaller
> ones, and incremental backups are supposed to work for the big sets.
This is an issue when you'r
On Wed, Aug 13, 2014 at 12:58 AM, Robert Haas wrote:
> On Tue, Aug 12, 2014 at 10:30 AM, Andres Freund
> wrote:
>>> Still not safe. Checksum collisions do happen, especially in big data sets.
>>
>> If you use an appropriate algorithm for appropriate amounts of data
>> that's not a relevant concern.
On Tue, Aug 12, 2014 at 10:30 AM, Andres Freund wrote:
>> Still not safe. Checksum collisions do happen, especially in big data sets.
>
> If you use an appropriate algorithm for appropriate amounts of data
> that's not a relevant concern. You can easily do different checksums for
> every 1GB segment.
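Andres's point can be illustrated with a small sketch: hash each 1 GB segment of a file separately with a strong cryptographic digest, so accidental collisions stop being a practical concern even for very large data sets. This is an editorial illustration, not code from the thread; SHA-256 and the segment size (mirroring PostgreSQL's default relation segment size) are assumptions.

```python
import hashlib

SEGMENT_SIZE = 1 << 30  # 1 GB, matching PostgreSQL's default segment size


def segment_checksums(path, segment_size=SEGMENT_SIZE):
    """Return one SHA-256 digest per segment of the file.

    With a 256-bit cryptographic hash, accidental collisions are not a
    practical concern, no matter how large the data set grows.
    """
    digests = []
    with open(path, "rb") as f:
        while True:
            h = hashlib.sha256()
            remaining = segment_size
            while remaining > 0:
                chunk = f.read(min(remaining, 1 << 20))
                if not chunk:
                    break
                h.update(chunk)
                remaining -= len(chunk)
            if remaining == segment_size:  # nothing read: end of file
                break
            digests.append(h.hexdigest())
    return digests
```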
On Tue, Aug 12, 2014 at 11:17 AM, Gabriele Bartolini
wrote:
>
> 2014-08-12 15:25 GMT+02:00 Claudio Freire :
>> Still not safe. Checksum collisions do happen, especially in big data sets.
>
> Can I ask you what you are currently using for backing up large data
> sets with Postgres?
Currently, a ti
On 2014-08-12 10:25:21 -0300, Claudio Freire wrote:
> On Tue, Aug 12, 2014 at 6:41 AM, Marco Nenciarini
> wrote:
> > To declare two files identical they must have the same size,
> > same mtime and the same *checksum*.
>
> Still not safe. Checksum collisions do happen, especially in big data sets.
If yo
On 12/08/14 15:25, Claudio Freire wrote:
> On Tue, Aug 12, 2014 at 6:41 AM, Marco Nenciarini
> wrote:
>> To declare two files identical they must have the same size,
>> same mtime and the same *checksum*.
>
> Still not safe. Checksum collisions do happen, especially in big data sets.
>
IMHO it is
Hi Claudio,
2014-08-12 15:25 GMT+02:00 Claudio Freire :
> Still not safe. Checksum collisions do happen, especially in big data sets.
Can I ask you what you are currently using for backing up large data
sets with Postgres?
Thanks,
Gabriele
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
On Tue, Aug 12, 2014 at 6:41 AM, Marco Nenciarini
wrote:
> To declare two files identical they must have the same size,
> same mtime and the same *checksum*.
Still not safe. Checksum collisions do happen, especially in big data sets.
As I already stated, timestamps will only be used for early detection of
changed files. To declare two files identical they must have the same size,
same mtime and the same *checksum*.
Regards,
Marco
--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant
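Marco's rule (size and mtime as a cheap early-out, checksum as the final word) could be sketched roughly as follows. SHA-256 and the comparison order are illustrative assumptions, not the patch's actual implementation:

```python
import hashlib
import os


def files_identical(path_a, path_b, chunk_size=1 << 20):
    """Declare two files identical only if size, mtime AND checksum match.

    Size and mtime act purely as an early-out to detect changed files
    cheaply; a matching checksum is still required before the files are
    treated as identical (illustrative sketch, not the patch's code).
    """
    sa, sb = os.stat(path_a), os.stat(path_b)
    if sa.st_size != sb.st_size or sa.st_mtime_ns != sb.st_mtime_ns:
        return False  # cheap metadata check already proves a change

    def digest(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk_size), b""):
                h.update(block)
        return h.digest()

    return digest(path_a) == digest(path_b)
```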
On Mon, Aug 11, 2014 at 12:27 PM, Robert Haas wrote:
>
>> As Marco says, that can be optimized using filesystem timestamps instead.
>
> The idea of using filesystem timestamps gives me the creeps. Those
> aren't always very granular, and I don't know that (for example) they
> are crash-safe. Doe
On Tue, Aug 5, 2014 at 8:04 PM, Simon Riggs wrote:
> To decide whether we need to re-copy the file, you read the file until
> we find a block with a later LSN. If we read the whole file without
> finding a later LSN then we don't need to re-copy. That means we read
> each file twice, which is slow
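Simon's read-until-newer-LSN check might look roughly like the following. This is a hedged sketch: it assumes 8 kB blocks, little-endian storage, and that the first 8 bytes of each page hold the page LSN (pd_lsn), which matches PostgreSQL's standard page layout but is not code from the thread.

```python
import struct

BLCKSZ = 8192  # assumed PostgreSQL block size


def file_needs_copy(path, backup_start_lsn):
    """Scan a relation file page by page, stopping at the first page
    whose LSN is newer than the reference LSN.

    Returns True as soon as a later LSN is found (file must be
    re-copied); False if the whole file is read without finding one.
    """
    with open(path, "rb") as f:
        while True:
            page = f.read(BLCKSZ)
            if len(page) < 8:
                return False  # end of file: nothing newer was found
            # pd_lsn occupies the first 8 bytes of the page header.
            xlogid, xrecoff = struct.unpack_from("<II", page, 0)
            page_lsn = (xlogid << 32) | xrecoff
            if page_lsn > backup_start_lsn:
                return True
```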
On Thu, Aug 7, 2014 at 6:29 PM, Gabriele Bartolini <
gabriele.bartol...@2ndquadrant.it> wrote:
> Hi Marco,
>
> > With the current full backup procedure they are backed up, so I think
> > that having them backed up with an rsync-like algorithm is what a user
> > would expect for an incremental backup
Hi Marco,
> With the current full backup procedure they are backed up, so I think
> that having them backed up with an rsync-like algorithm is what a user
> would expect for an incremental backup.
Exactly. I think a simple, flexible and robust method for file based
incremental backup is all we ne
On 07/08/14 17:25, Bruce Momjian wrote:
> On Thu, Aug 7, 2014 at 08:35:53PM +0900, Michael Paquier wrote:
>> On Thu, Aug 7, 2014 at 8:11 PM, Fujii Masao wrote:
>>> There are some data which don't have LSN, for example, postgresql.conf.
>>> When such data has been modified since last backup,
On 07/08/14 17:29, Bruce Momjian wrote:
> I am a little worried that many users will not realize this until they
> try it and are disappointed, e.g. "Why is PG writing to my static data
> so often?" --- then we get beaten up about our hint bits and freezing
> behavior. :-(
>
> I am just tryi
On Thu, Aug 7, 2014 at 11:03:40AM +0100, Simon Riggs wrote:
> Well, there is a huge difference between file-level and block-level backup.
>
> Designing, writing and verifying block-level backup to the point that
> it is acceptable is a huge effort. (Plus, I don't think accumulating
> block number
On Thu, Aug 7, 2014 at 08:35:53PM +0900, Michael Paquier wrote:
> On Thu, Aug 7, 2014 at 8:11 PM, Fujii Masao wrote:
> > There are some data which don't have LSN, for example, postgresql.conf.
> > When such data has been modified since last backup, they also need to
> > be included in incremental
On Thu, Aug 7, 2014 at 8:11 PM, Fujii Masao wrote:
> There are some data which don't have LSN, for example, postgresql.conf.
> When such data has been modified since last backup, they also need to
> be included in incremental backup? Probably yes.
Definitely yes. That's also the case for paths l
On Thu, Aug 7, 2014 at 12:20 AM, Bruce Momjian wrote:
> On Wed, Aug 6, 2014 at 06:48:55AM +0100, Simon Riggs wrote:
>> On 6 August 2014 03:16, Bruce Momjian wrote:
>> > On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
>> >> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
>> >
On 6 August 2014 17:27, Bruce Momjian wrote:
> On Wed, Aug 6, 2014 at 01:15:32PM -0300, Claudio Freire wrote:
>> On Wed, Aug 6, 2014 at 12:20 PM, Bruce Momjian wrote:
>> >
>> > Well, for file-level backups we have:
>> >
>> > 1) use file modtime (possibly inaccurate)
>> > 2) use f
On Wed, Aug 6, 2014 at 01:15:32PM -0300, Claudio Freire wrote:
> On Wed, Aug 6, 2014 at 12:20 PM, Bruce Momjian wrote:
> >
> > Well, for file-level backups we have:
> >
> > 1) use file modtime (possibly inaccurate)
> > 2) use file modtime and checksums (heavy read load)
> >
> > Fo
On Wed, Aug 6, 2014 at 12:20 PM, Bruce Momjian wrote:
>
> Well, for file-level backups we have:
>
> 1) use file modtime (possibly inaccurate)
> 2) use file modtime and checksums (heavy read load)
>
> For block-level backups we have:
>
> 3) accumulate block numbers as WAL is
On Wed, Aug 6, 2014 at 06:48:55AM +0100, Simon Riggs wrote:
> On 6 August 2014 03:16, Bruce Momjian wrote:
> > On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
> >> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
> >> >
> >> > On 5 August 2014 22:38, Claudio Freire wrote:
> >
On 6 August 2014 03:16, Bruce Momjian wrote:
> On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
>> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
>> >
>> > On 5 August 2014 22:38, Claudio Freire wrote:
>> > Thinking some more, it seems like this whole store-multiple-LSNs
>
On Wed, Aug 6, 2014 at 09:17:35AM +0900, Michael Paquier wrote:
> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
> >
> > On 5 August 2014 22:38, Claudio Freire wrote:
> > Thinking some more, it seems like this whole store-multiple-LSNs
> > thing is too much. We can still do block-level in
On Tue, Aug 5, 2014 at 9:04 PM, Simon Riggs wrote:
> On 5 August 2014 22:38, Claudio Freire wrote:
>
>>> * When we take an incremental backup we need the WAL from the backup
>>> start LSN through to the backup stop LSN. We do not need the WAL
>>> between the last backup stop LSN and the new incre
On Tue, Aug 5, 2014 at 9:17 PM, Michael Paquier
wrote:
> On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
>>
>> On 5 August 2014 22:38, Claudio Freire wrote:
>> Thinking some more, it seems like this whole store-multiple-LSNs
>> thing is too much. We can still do block-level incrementals ju
On Wed, Aug 6, 2014 at 9:04 AM, Simon Riggs wrote:
>
> On 5 August 2014 22:38, Claudio Freire wrote:
> Thinking some more, it seems like this whole store-multiple-LSNs
> thing is too much. We can still do block-level incrementals just by
> using a single LSN as the reference point. We'd still
On 5 August 2014 22:38, Claudio Freire wrote:
>> * When we take an incremental backup we need the WAL from the backup
>> start LSN through to the backup stop LSN. We do not need the WAL
>> between the last backup stop LSN and the new incremental start LSN.
>> That is a huge amount of WAL in many
On Tue, Aug 5, 2014 at 3:23 PM, Simon Riggs wrote:
> On 4 August 2014 19:30, Claudio Freire wrote:
>> On Mon, Aug 4, 2014 at 5:15 AM, Gabriele Bartolini
>> wrote:
>>> I really like the proposal of working on a block level incremental
>>> backup feature and the idea of considering LSN. However
On 4 August 2014 19:30, Claudio Freire wrote:
> On Mon, Aug 4, 2014 at 5:15 AM, Gabriele Bartolini
> wrote:
>> I really like the proposal of working on a block level incremental
>> backup feature and the idea of considering LSN. However, I'd suggest
>> to see block level as a second step and a
Hi Claudio,
I think there has been a misunderstanding. I agree with you (and I
think also Marco) that LSN is definitely a component to consider in
this process. We will come up with an alternate proposal which
considers LSNs either today or tomorrow. ;)
Thanks,
Gabriele
--
Gabriele Bartolini -
On Mon, Aug 4, 2014 at 5:15 AM, Gabriele Bartolini
wrote:
> I really like the proposal of working on a block level incremental
> backup feature and the idea of considering LSN. However, I'd suggest
> to see block level as a second step and a goal to keep in mind while
> working on the first ste
Hi guys,
sorry if I jump in the middle of the conversation. I have been
reading with much interest all that's been said above. However, the
goal of this patch is to give users another possibility while
performing backups. Especially when large databases are in use.
I really like the proposal
On Fri, Aug 1, 2014 at 1:43 PM, desmodemone wrote:
>
>
>
> 2014-08-01 18:20 GMT+02:00 Claudio Freire :
>
>> On Fri, Aug 1, 2014 at 12:35 AM, Amit Kapila
>> wrote:
>> >> c) the map is not crash safe by design, because it is needed only for
>> >> incremental backup to track which blocks need to be backed up
2014-08-01 18:20 GMT+02:00 Claudio Freire :
> On Fri, Aug 1, 2014 at 12:35 AM, Amit Kapila
> wrote:
> >> c) the map is not crash safe by design, because it is needed only for
> >> incremental backup to track which blocks need to be backed up, not for
> >> consistency or recovery of the whole cluster,
On Fri, Aug 1, 2014 at 12:35 AM, Amit Kapila wrote:
>> c) the map is not crash safe by design, because it is needed only for
>> incremental backup to track which blocks need to be backed up, not for
>> consistency or recovery of the whole cluster, so it's not a heavy cost for
>> the whole cluster to m
On Thu, Jul 31, 2014 at 1:56 PM, desmodemone wrote:
>
> Hi Amit, thank you for your comments.
> However, about the drawbacks:
> a) It's not clear to me why the method needs checksums enabled. I mean, if
> the bgwriter or another process flushes a dirty buffer, it only has to
> signal in the map that th
On Thu, Jul 31, 2014 at 5:26 AM, desmodemone wrote:
> b) yes, the backends need to update the map, but it's in memory and, as I
> showed, it could be very small if we use chunks of blocks. Even if we do
> not compress the map, I do not think it could be a bottleneck.
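desmodemone's chunked in-memory map might be sketched like this (an editorial illustration; the chunk size and data structure are assumptions). Note, as discussed downthread, that a purely in-memory map is not crash-safe, so after a crash the next backup would have to fall back to a full copy or rebuild the map from WAL:

```python
class BlockChangeMap:
    """In-memory map of modified block ranges, one entry per chunk of blocks.

    Each process that dirties a block marks the chunk containing it, so
    the map stays small even for big relations. Being purely in memory it
    is NOT crash-safe: after a crash the map is simply gone.
    """

    def __init__(self, blocks_per_chunk=64):
        self.blocks_per_chunk = blocks_per_chunk
        self.dirty = {}  # relation id -> set of dirty chunk numbers

    def mark_dirty(self, rel, blockno):
        """Record that a block was modified (called on buffer flush)."""
        self.dirty.setdefault(rel, set()).add(blockno // self.blocks_per_chunk)

    def chunks_to_backup(self, rel):
        """Chunk numbers the next incremental backup must copy."""
        return sorted(self.dirty.get(rel, set()))

    def reset(self):
        """Called after a successful incremental backup."""
        self.dirty.clear()
```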
If it's in memory, it's not crash-safe. For somethi
On Thu, Jul 31, 2014 at 2:00 AM, Amit Kapila wrote:
> On Wed, Jul 30, 2014 at 11:32 PM, Robert Haas wrote:
>> IMV, the way to eventually make this efficient is to have a background
>> process that reads the WAL and figures out which data blocks have been
>> modified, and tracks that someplace.
>
On Thu, Jul 31, 2014 at 11:30:52AM +0530, Amit Kapila wrote:
> On Wed, Jul 30, 2014 at 11:32 PM, Robert Haas wrote:
> >
> > IMV, the way to eventually make this efficient is to have a background
> > process that reads the WAL and figures out which data blocks have been
> > modified, and tracks tha
2014-07-31 8:26 GMT+02:00 Amit Kapila :
> On Wed, Jul 30, 2014 at 7:00 PM, desmodemone
> wrote:
> > Hello,
> > I think an incremental/differential backup method is very useful;
> > however, the method has two drawbacks:
> > 1) In a database normally, even if the percent of modi
On Thu, Jul 31, 2014 at 3:00 PM, Amit Kapila wrote:
> One more thing, what will happen for unlogged tables with such a
> mechanism?
I imagine that you can safely bypass them as they are not accessible
during recovery and will start with empty relation files once recovery
ends. The same applies to
On Wed, Jul 30, 2014 at 7:00 PM, desmodemone wrote:
> Hello,
> I think an incremental/differential backup method is very useful;
> however, the method has two drawbacks:
> 1) In a database normally, even if the percentage of modified rows is small
> compared to total rows, the probabilit
On Wed, Jul 30, 2014 at 11:32 PM, Robert Haas wrote:
>
> IMV, the way to eventually make this efficient is to have a background
> process that reads the WAL and figures out which data blocks have been
> modified, and tracks that someplace.
Nice idea; however, I think that to make this happen we need to
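Robert's background-process idea can be approximated in userland by parsing the output of a WAL-dumping tool such as pg_xlogdump and collecting the block references it prints. The `blkref ... rel X/Y/Z ... blk N` line format assumed here is an approximation of that tool's output and may differ across versions:

```python
import re

# Assumed shape of a block-reference line in pg_xlogdump output, e.g.
#   blkref #0: rel 1663/16384/16385 fork main blk 2
BLKREF = re.compile(r"rel (\d+/\d+/\d+)(?: fork \w+)? blk (\d+)")


def modified_blocks(waldump_output):
    """Return the set of (relation, block) pairs referenced by the WAL
    records in the given dump output -- the candidate blocks an
    incremental backup would need to copy."""
    return {
        (m.group(1), int(m.group(2)))
        for m in BLKREF.finditer(waldump_output)
    }
```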
On Tue, Jul 29, 2014 at 12:35 PM, Marco Nenciarini
wrote:
>> I agree with much of that. However, I'd question whether we can
>> really seriously expect to rely on file modification times for
>> critical data-integrity operations. I wouldn't like it if somebody
>> ran ntpdate to fix the time whil
2014-07-29 18:35 GMT+02:00 Marco Nenciarini :
> On 25/07/14 20:44, Robert Haas wrote:
> > On Fri, Jul 25, 2014 at 2:21 PM, Claudio Freire
> wrote:
> >> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
> >> wrote:
> >>> 1. Proposal
> >>> =
> >>> Our proposal
On Wed, Jul 30, 2014 at 1:11 AM, Marco Nenciarini
wrote:
> "differential backup" is widely used to refer to a backup that is always
> based on a "full backup". An "incremental backup" can be based either on
> a "full backup" or on a previous "incremental backup". We picked that
> name to emphasize
On 25/07/14 20:44, Robert Haas wrote:
> On Fri, Jul 25, 2014 at 2:21 PM, Claudio Freire
> wrote:
>> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
>> wrote:
>>> 1. Proposal
>>> =
>>> Our proposal is to introduce the concept of a backup profile. The backup
On Tue, Jul 29, 2014 at 1:24 PM, Marco Nenciarini
wrote:
>> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
>> wrote:
>>> 1. Proposal
>>> =
>>> Our proposal is to introduce the concept of a backup profile. The backup
>>> profile consists of a file with one line
On 25/07/14 20:21, Claudio Freire wrote:
> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
> wrote:
>> 1. Proposal
>> =
>> Our proposal is to introduce the concept of a backup profile. The backup
>> profile consists of a file with one line per file detailing
On 25/07/14 16:15, Michael Paquier wrote:
> On Fri, Jul 25, 2014 at 10:14 PM, Marco Nenciarini
> wrote:
>> 0. Introduction:
>> =
>> This is a proposal for adding incremental backup support to streaming
>> protocol and hence to pg_basebackup command.
> Not sure
On Fri, Jul 25, 2014 at 7:38 PM, Josh Berkus wrote:
> On 07/25/2014 11:49 AM, Claudio Freire wrote:
>>> I agree with much of that. However, I'd question whether we can
>>> > really seriously expect to rely on file modification times for
>>> > critical data-integrity operations. I wouldn't like i
On 07/25/2014 11:49 AM, Claudio Freire wrote:
>> I agree with much of that. However, I'd question whether we can
>> > really seriously expect to rely on file modification times for
>> > critical data-integrity operations. I wouldn't like it if somebody
>> > ran ntpdate to fix the time while the b
On Fri, Jul 25, 2014 at 3:44 PM, Robert Haas wrote:
> On Fri, Jul 25, 2014 at 2:21 PM, Claudio Freire
> wrote:
>> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
>> wrote:
>>> 1. Proposal
>>> =
>>> Our proposal is to introduce the concept of a backup profile.
On Fri, Jul 25, 2014 at 2:21 PM, Claudio Freire wrote:
> On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
> wrote:
>> 1. Proposal
>> =
>> Our proposal is to introduce the concept of a backup profile. The backup
>> profile consists of a file with one line per file
On Fri, Jul 25, 2014 at 10:14 AM, Marco Nenciarini
wrote:
> 1. Proposal
> =
> Our proposal is to introduce the concept of a backup profile. The backup
> profile consists of a file with one line per file detailing tablespace,
> path, modification time, size and check
On Fri, Jul 25, 2014 at 10:14 PM, Marco Nenciarini
wrote:
> 0. Introduction:
> =
> This is a proposal for adding incremental backup support to streaming
> protocol and hence to pg_basebackup command.
Not sure that incremental is the right word as the existing backup
m
0. Introduction:
=========
This is a proposal for adding incremental backup support to streaming
protocol and hence to pg_basebackup command.
1. Proposal
=========
Our proposal is to introduce the concept of a backup profile. The backup
profile consi
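A single line of the proposed backup profile might be produced like this. The tab separator, field order, and SHA-256 digest are purely illustrative assumptions; the proposal only states that each line carries tablespace, path, modification time, size and checksum:

```python
import hashlib
import os


def profile_line(tablespace, path):
    """Build one backup-profile line: tablespace, path, mtime, size and
    checksum. Field order and separator are illustrative assumptions."""
    st = os.stat(path)
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    fields = [tablespace, path, str(st.st_mtime), str(st.st_size), h.hexdigest()]
    return "\t".join(fields)
```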