Re: [HACKERS] Point in Time Recovery

2004-07-30 Thread Mark Kirkwood
Ok - that is a much better way of doing it! regards Mark Tom Lane wrote: "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes: If you use a readable file you will also need a feature for restore (or a tool) to create an appropriate pg_control file, or are you intending to still require that pg

Re: [HACKERS] Point in Time Recovery

2004-07-30 Thread Tom Lane
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes: > If you use a readable file you will also need a feature for restore > (or a tool) to create an appropriate pg_control file, or are you > intending to still require that pg_control be the first file backed > up. No, the entire point of this

Re: [HACKERS] Point in Time Recovery

2004-07-30 Thread Bruce Momjian
Zeugswetter Andreas SB SD wrote: > > > > I was wondering about this point - might it not be just as reasonable > > > for the copied file to *be* an exact image of pg_control? Then a very > > > simple variant of pg_controldata (or maybe even just adding switches to > > > pg_controldata itself)

Re: [HACKERS] Point in Time Recovery

2004-07-30 Thread Zeugswetter Andreas SB SD
> > I was wondering about this point - might it not be just as reasonable > > for the copied file to *be* an exact image of pg_control? Then a very > > simple variant of pg_controldata (or maybe even just adding switches to > > pg_controldata itself) would enable the relevant info to be extrac

Re: [HACKERS] Point in Time Recovery

2004-07-29 Thread markir
Quoting Bruce Momjian <[EMAIL PROTECTED]>: > Mark Kirkwood wrote: > > I was wondering about this point - might it not be just as reasonable > > for the copied file to *be* an exact image of pg_control? Then a very > > simple variant of pg_controldata (or maybe even just adding switches to > > pg_

Re: [HACKERS] Point in Time Recovery

2004-07-29 Thread Bruce Momjian
Mark Kirkwood wrote: > I was wondering about this point - might it not be just as reasonable > for the copied file to *be* an exact image of pg_control? Then a very > simple variant of pg_controldata (or maybe even just adding switches to > pg_controldata itself) would enable the relevant info

Re: [HACKERS] Point in Time Recovery

2004-07-29 Thread Mark Kirkwood
I was wondering about this point - might it not be just as reasonable for the copied file to *be* an exact image of pg_control? Then a very simple variant of pg_controldata (or maybe even just adding switches to pg_controldata itself) would enable the relevant info to be extracted P.s : would

Re: [ADMIN] [HACKERS] Point in Time Recovery

2004-07-28 Thread Bruce Momjian
[ Sorry, sent to hackers now.] Here is another open PITR issue that I think will have to be addressed in 7.6. If you do a critical transaction, but do nothing else for eight hours, that critical transaction hasn't been archived yet. It is still sitting in pg_xlog until the WAL file fills. I th

Re: [HACKERS] Point in Time Recovery

2004-07-28 Thread Bruce Momjian
Oh, here is something else we need to add --- a GUC to control whether pg_xlog is clean on recovery start. --- Tom Lane wrote: > Bruce and I had another phone chat about the problems that can ensue > if you restore a tar bac

Re: [HACKERS] Point in Time Recovery

2004-07-28 Thread Bruce Momjian
We need someone to code two backend functions to complete PITR. The function would be called at start/stop of backup of the data directory. The functions would be checked during restore to make sure the requested xid is not between the start/stop xids of the backup. They would also contain times
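
The restore-time check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `BackupLabel` and `recovery_target_allowed` are hypothetical names, xids are compared as plain integers (wraparound is ignored), and a target equal to the stop xid is treated as not yet safe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupLabel:
    """Hypothetical record of what the start/stop backup functions captured."""
    start_xid: int  # xid recorded when the backup of the data directory began
    stop_xid: int   # xid recorded when the backup finished


def recovery_target_allowed(label: BackupLabel, target_xid: int) -> bool:
    """Return True only if the requested stop point lies after the backup.

    A target between start_xid and stop_xid falls inside the backup window,
    and a target before start_xid cannot be reached from this backup at all,
    so both are rejected. Plain integer comparison is an assumption.
    """
    return target_xid > label.stop_xid
```

A restore tool could call `recovery_target_allowed(label, requested_xid)` before starting rollforward and refuse to continue when it returns `False`.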

Re: [PATCHES] [HACKERS] Point in Time Recovery

2004-07-27 Thread markw
On 26 Jul, To: [EMAIL PROTECTED] wrote: > Sorry I wasn't clearer. I think I have a better idea about what's going > on now. With the archiving enabled, it looks like the database is able > to complete 1 transaction per database connection, but doesn't complete > any subsequent transactions. I'm n

Re: [HACKERS] Point in Time Recovery

2004-07-20 Thread Bruce Momjian
Simon Riggs wrote: > On Sat, 2004-07-17 at 00:57, Bruce Momjian wrote: > > OK, I think I have some solid ideas and reasons for them. > > > > Sorry for taking so long to reply... > > > First, I think we need server-side functions to call when we start/stop > > the backup. The advantage of these

Re: [HACKERS] Point in Time Recovery

2004-07-20 Thread Zeugswetter Andreas SB SD
> > Hang on, are you supposed to MOVE or COPY away WAL segments? > > Copy. pg will delete them once they are archived. Copy. pg will recycle them once they are archived. Andreas

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Christopher Kings-Lynne
I don't think so, but it seems like a much less robust way to do things. What happens if you have a failure partway through? For instance archive machine dies and loses recent data right after you've rm'd the source file. The recommended COPY procedure at least provides some breathing room betwee

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Tom Lane
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes: >>> Hang on, are you supposed to MOVE or COPY away WAL segments? >> >> COPY. The checkpoint code will then delete or recycle the segment file, >> as appropriate. > So what happens if you just move it? Postgres breaks? I don't think so, but it
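
A minimal sketch of the "copy, never move" advice, assuming the archive step is a small script rather than a bare `cp`. The function name `archive_segment` and the `.tmp` suffix are illustrative choices, not anything the patch defines. The source segment is left untouched so that the server can delete or recycle it itself.

```python
import os
import shutil


def archive_segment(segment_path: str, archive_dir: str) -> str:
    """Copy one WAL segment into archive_dir and return the archived path.

    The source file is deliberately left in place: removing or recycling it
    is the server's job once archiving has been reported as successful.
    """
    final_path = os.path.join(archive_dir, os.path.basename(segment_path))
    temp_path = final_path + ".tmp"  # hypothetical temporary name

    if os.path.exists(final_path):
        raise FileExistsError(f"{final_path} has already been archived")

    # Write under a temporary name and flush to disk, so a crash partway
    # through never leaves a short file under the final name.
    with open(segment_path, "rb") as src, open(temp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
        dst.flush()
        os.fsync(dst.fileno())

    os.replace(temp_path, final_path)  # atomic rename within one filesystem
    return final_path
```

If the archive machine dies during the copy, the original segment is still in `pg_xlog`, which is the breathing room the copy procedure is meant to provide.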

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Christopher Kings-Lynne
Hang on, are you supposed to MOVE or COPY away WAL segments? COPY. The checkpoint code will then delete or recycle the segment file, as appropriate. So what happens if you just move it? Postgres breaks? Chris

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Bruce Momjian
Christopher Kings-Lynne wrote: > > If you keep falling further and further behind, eventually your pg_xlog > > directory will fill the space available on its disk, and I think at that > > point PG will panic and shut down because it can't create any more xlog > > segments. > > Hang on, are you sup

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Tom Lane
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes: >> If you keep falling further and further behind, eventually your pg_xlog >> directory will fill the space available on its disk, and I think at that >> point PG will panic and shut down because it can't create any more xlog >> segments. > Hang

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Christopher Kings-Lynne
If you keep falling further and further behind, eventually your pg_xlog directory will fill the space available on its disk, and I think at that point PG will panic and shut down because it can't create any more xlog segments. Hang on, are you supposed to MOVE or COPY away WAL segments? Chris -

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Tom Lane
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes: > I've got a PITR set up here that's happily scp'ing WAL files across to > another machine. However, the NIC in the machine is currently stuffed, > so it gets like 50k/s :) What happens in general if you are generating > WAL file bytes faster

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Christopher Kings-Lynne
I've got a PITR set up here that's happily scp'ing WAL files across to another machine. However, the NIC in the machine is currently stuffed, so it gets like 50k/s :) What happens in general if you are generating WAL file bytes faster always than they can be copied off? Also, does the archive

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Tom Lane
Bruce and I had another phone chat about the problems that can ensue if you restore a tar backup that contains old (incompletely filled) versions of WAL segment files. While the current code will ignore them during the recovery-from-archive run, leaving them laying around seems awfully dangerous.

Re: [HACKERS] Point in Time Recovery

2004-07-19 Thread Simon Riggs
On Sat, 2004-07-17 at 00:57, Bruce Momjian wrote: > OK, I think I have some solid ideas and reasons for them. > Sorry for taking so long to reply... > First, I think we need server-side functions to call when we start/stop > the backup. The advantage of these server-side functions is that they

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Bruce Momjian
Let me address your concerns about PITR getting into 7.5. I think a few people spoke last week expressing concern about our release process and wanting to take drastic action. However, looking at the release status report I am about to post, you will see we are on track for an August 1 beta. PITR

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Bruce Momjian
OK, I think I have some solid ideas and reasons for them. First, I think we need server-side functions to call when we start/stop the backup. The advantage of these server-side functions is that they will do the required work of recording the pg_control values and creating needed files with litt

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 16:47, Tom Lane wrote: > As far as the business about copying pg_control first goes: there is > another way to think about it, which is to copy pg_control to another > place that will be included in your backup. For example the standard > backup procedure could be > > 1. [so

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 19:30, Bruce Momjian wrote: > Simon Riggs wrote: > > On Fri, 2004-07-16 at 16:58, Zeugswetter Andreas SB SD wrote: > > > > >> Do we need a checkpoint after the archiving > > > > >> starts but before the backup begins? > > > > > > > > > No. > > > > > > > > Actually yes. > > >

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Bruce Momjian
Simon Riggs wrote: > On Fri, 2004-07-16 at 16:58, Zeugswetter Andreas SB SD wrote: > > > >> Do we need a checkpoint after the archiving > > > >> starts but before the backup begins? > > > > > > > No. > > > > > > Actually yes. > > > > Sorry, I did incorrectly not connect 'archiving' with the back

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 16:25, Zeugswetter Andreas SB SD wrote: > I think the filename 'recovery.conf' is misleading, since it is not a > static configuration file, but a command file for one recovery. > How about 'recovery.command' then 'recovery.inprogress', and on recovery > completion it shoul

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 15:27, Bruce Momjian wrote: > Also, when you are in recovery mode, how do you get out of recovery > mode, meaning if you have a power failure, how do you prevent the system > from doing another recovery? Do you remove the recovery.conf file? That was the whole point of the

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 16:58, Zeugswetter Andreas SB SD wrote: > > >> Do we need a checkpoint after the archiving > > >> starts but before the backup begins? > > > > > No. > > > > Actually yes. > > Sorry, I did incorrectly not connect 'archiving' with the backed up xlogs :-( > So yes, you need on

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD
> >> Do we need a checkpoint after the archiving > >> starts but before the backup begins? > > > No. > > Actually yes. Sorry, I did incorrectly not connect 'archiving' with the backed up xlogs :-( So yes, you need one checkpoint after archiving starts. Imho turning on xlog archiving should issu

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Tom Lane
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes: >> Do we need a checkpoint after the archiving >> starts but before the backup begins? > No. Actually yes. You have to start at a checkpoint record when replaying the log, so if no checkpoint occurred between starting to archive WAL and sta

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD
> then on restore once all the files are restored move the > pg_control.backup to its original name. That gives us the checkpoint > wal/offset but how do we get the start/stop information. Is that not > required? The checkpoint wal/offset is in pg_control, that is sufficient start information.

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes: > Also, when you are in recovery mode, how do you get out of recovery > mode, meaning if you have a power failure, how do you prevent the system > from doing another recovery? Do you remove the recovery.conf file? I do not care for the idea of a recovery.

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Bruce Momjian
> "Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes: > > We only need to tell people to backup pg_control first. The rest was only > > intended to enforce > > 1. that pg_control is the first file backed up > > 2. the dba uses a large enough PIT (or xid) for restore > > Right, but I think Br

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Tom Lane
"Zeugswetter Andreas SB SD" <[EMAIL PROTECTED]> writes: > We only need to tell people to backup pg_control first. The rest was only > intended to enforce > 1. that pg_control is the first file backed up > 2. the dba uses a large enough PIT (or xid) for restore Right, but I think Bruce's point is

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD
> > I'm aiming for the minimum feature set - which means we do need to take > > care over whether that set is insufficient and also to pull any part > > that doesn't stand up to close scrutiny over the next few days. > > As you can see, we are still chewing on NT. What PITR features are > missin

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Simon Riggs
On Fri, 2004-07-16 at 04:49, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > On Fri, 2004-07-16 at 00:01, Alvaro Herrera wrote: > >> My manpage for signal(2) says that you shouldn't assign SIG_IGN to > >> SIGCHLD, according to POSIX. > > > So - I should be setting this to SIG_DFL and

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Christopher Kings-Lynne
Thanks for that. My comments were heartfelt, but not useful right now. Hi Simon, I'm sorry if I gave the impression that I thought your work wasn't worthwhile, it is :( I'm badly overdrawn already on my time budget, though that is my concern alone. There is more to do than I have time for. Prag

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > On Fri, 2004-07-16 at 00:01, Alvaro Herrera wrote: >> My manpage for signal(2) says that you shouldn't assign SIG_IGN to >> SIGCHLD, according to POSIX. > So - I should be setting this to SIG_DFL and thats good for everyone? Yeah, we learned the same less
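
The SIG_IGN versus SIG_DFL point can be tried outside the backend with a small sketch. It uses Python's `os.system`, which wraps the C `system(3)` call on Unix; the helper name `run_with_sigchld` is made up for illustration, and it must be called from the main thread. With SIGCHLD left at `SIG_DFL` the child's exit status can be collected. With `SIG_IGN`, many POSIX systems reap the child automatically, so the status is lost and the call may report -1, which matches the "return code=-1" seen in the archiver log.

```python
import os
import signal


def run_with_sigchld(disposition, command: str) -> int:
    """Run `command` through os.system with SIGCHLD set to `disposition`.

    Returns the raw wait status from os.system. The previous SIGCHLD
    disposition is restored afterwards.
    """
    previous = signal.signal(signal.SIGCHLD, disposition)
    if previous is None:
        # Handler was not installed from Python; fall back to the default.
        previous = signal.SIG_DFL
    try:
        return os.system(command)
    finally:
        signal.signal(signal.SIGCHLD, previous)
```

For example, `os.WEXITSTATUS(run_with_sigchld(signal.SIG_DFL, "exit 3"))` recovers the child's exit code, while passing `signal.SIG_IGN` instead lets you observe how the local platform behaves.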

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Simon Riggs
On Fri, 2004-07-16 at 00:46, Mark Kirkwood wrote: > > By way of contrast, using the *same* procedure (1-11), but generating 2 > logs worth of INSERTS/UPDATES using 10 concurrent process *works fine* - > e.g : > Great...at least we have shown that something works (or can work) and have begun t

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Simon Riggs
On Fri, 2004-07-16 at 00:01, Alvaro Herrera wrote: > On Thu, Jul 15, 2004 at 11:44:02PM +0100, Simon Riggs wrote: > > On Thu, 2004-07-15 at 13:16, HISADAMasaki wrote: > > > > -- line 236 --- > > > - pgsignal(SIGCHLD, SIG_IGN); > > > > > > -- line 236 --- > > > + pgsignal(SIGCHLD, SIG_DFL); > > >

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Mark Kirkwood
Simon Riggs wrote: So far: I've tried to re-create the problem as exactly as I can, but it works for me. This is clearly an important case to chase down. I assume that this is the very first time you tried recovery? Second and subsequent recoveries using the same set have a potential loophole, wh

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Mark Kirkwood
Couldn't agree more. Maybe we should have made more noise :-) Glen Parker wrote: Simon Riggs wrote: On Thu, 2004-07-15 at 23:18, Devrim GUNDUZ wrote: Thanks for the vote of confidence, on or off list. too many people spend a lot of money for proprietary databases, just for som

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Glen Parker
> Simon Riggs wrote: > > > On Thu, 2004-07-15 at 23:18, Devrim GUNDUZ wrote: > > > > Thanks for the vote of confidence, on or off list. > > > > > too many people spend a lot of > > > money for proprietary databases, just for some missing features in > > > PostgreSQL > > > > Agreed - PITR isn't ai

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Bruce Momjian
Simon Riggs wrote: > On Thu, 2004-07-15 at 15:57, Bruce Momjian wrote: > > > We will get there --- it just seems dark at this time. > > Thanks for that. My comments were heartfelt, but not useful right now. > > I'm badly overdrawn already on my time budget, though that is my concern > alone. Th

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Mark Kirkwood
Simon Riggs wrote: First, thanks for sticking with it to test this. I've not received such a message myself - this is interesting. Is it possible to copy that directory to one side and re-run the test? Add another parameter in postgresql.conf called "archive_debug = true" Does it happen identicall

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Bruce Momjian
Simon Riggs wrote: > > On Thu, 2004-07-15 at 23:18, Devrim GUNDUZ wrote: > > Thanks for the vote of confidence, on or off list. > > > too many people spend a lot of > > money for proprietary databases, just for some missing features in > > PostgreSQL > > Agreed - PITR isn't aimed at existin

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Alvaro Herrera
On Thu, Jul 15, 2004 at 11:44:02PM +0100, Simon Riggs wrote: > On Thu, 2004-07-15 at 13:16, HISADAMasaki wrote: > > -- line 236 --- > > - pgsignal(SIGCHLD, SIG_IGN); > > > > -- line 236 --- > > + pgsignal(SIGCHLD, SIG_DFL); > > I'm not sure I understand why it's returned -1, though I'll take you

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Simon Riggs
> On Thu, 2004-07-15 at 23:18, Devrim GUNDUZ wrote: Thanks for the vote of confidence, on or off list. > too many people spend a lot of > money for proprietary databases, just for some missing features in > PostgreSQL Agreed - PITR isn't aimed at existing users of PostgreSQL. If you use it

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Simon Riggs
On Thu, 2004-07-15 at 13:16, HISADAMasaki wrote: > Dear Simon, > > I've just tested pitr_v5_2.patch and got an error message > during archiving process as follows. > > -- begin > LOG: archive command="cp /usr/local/pgsql/data/pg_xlog/ > /tmp",return code=-1 > -- end > > The co

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Devrim GUNDUZ
Simon, On Thu, 15 Jul 2004, Simon Riggs wrote: > > We will get there --- it just seems dark at this time. > > Thanks for that. My comments were heartfelt, but not useful right now. > > I'm badly overdrawn already on my time budget, though that is

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Simon Riggs
On Thu, 2004-07-15 at 15:57, Bruce Momjian wrote: > We will get there --- it just seems dark at this time. Thanks for that. My comments were heartfelt, but not useful right now. I'm badly overdrawn already on my time budget, though that is my concern alone. There is more to do than I have time

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Simon Riggs
On Thu, 2004-07-15 at 10:47, Mark Kirkwood wrote: > I tried what I thought was a straightforward scenario, and seem to have > broken it :-( > > Here is the little tale > > 1) initdb > 2) set archive_mode and archive_dest in postgresql.conf > 3) startup > 4) create database called 'test' > 5) con

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Bruce Momjian
Simon Riggs wrote: > On Wed, 2004-07-14 at 10:57, Zeugswetter Andreas SB SD wrote: > > > The recovery mechanism doesn't rely upon you knowing 1 or 3. The > > > recovery reads pg_control (from the backup) and then attempts to > > > de-archive the appropriate xlog segment file and then starts > > >

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Bruce Momjian
Simon Riggs wrote: > On Thu, 2004-07-15 at 03:02, Bruce Momjian wrote: > > I talked to Tom on the phone today and I think we have a procedure > > for doing backup/restore in a fairly foolproof way. > > > > As outlined below, we need to record the start/stop and checkpoint WAL > > file names an

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Zeugswetter Andreas SB SD
Sorry for the stupid question, but how do I get this patch if I do not receive the patches mails ? The web interface html'ifies it, thus making it unusable. Thanks Andreas

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread HISADAMasaki
Dear Simon, I've just tested pitr_v5_2.patch and got an error message during archiving process as follows. -- begin LOG: archive command="cp /usr/local/pgsql/data/pg_xlog/ /tmp",return code=-1 -- end The command called in system(3) works, but it returns -1. system(3) can not g

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Zeugswetter Andreas SB SD
> > Other db's have commands for: > > start/end external backup I see that the analogy to external backup was not good, since you are correct that dba's would expect that to stop all writes, so they can safely split their mirror or some such. Usually the expected time from start until end extern

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Mark Kirkwood
I tried what I thought was a straightforward scenario, and seem to have broken it :-( Here is the little tale 1) initdb 2) set archive_mode and archive_dest in postgresql.conf 3) startup 4) create database called 'test' 5) connect to 'test' and type 'checkpoint' 6) backup PGDATA using 'tar -zcvf'

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Simon Riggs
On Thu, 2004-07-15 at 03:02, Bruce Momjian wrote: > I talked to Tom on the phone today and I think we have a procedure > for doing backup/restore in a fairly foolproof way. > > As outlined below, we need to record the start/stop and checkpoint WAL > file names and offsets, and somehow pass tho

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Simon Riggs
On Thu, 2004-07-15 at 02:43, Mark Kirkwood wrote: > I noticed that compiling with 5_1 patch applied fails due to > XLOG_archive_dir being removed from xlog.c , but > src/backend/commands/tablecmds.c still uses it. > > I did the following to tablecmds.c : > > 5408c5408 > < extern c

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread Bruce Momjian
I talked to Tom on the phone today and I think we have a procedure for doing backup/restore in a fairly foolproof way. As outlined below, we need to record the start/stop and checkpoint WAL file names and offsets, and somehow pass those on to restore. I think any system that requires users t

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread SAKATA Tetsuo
Hi, folks. My colleagues and I are planning to test PITR after the 7.5 beta release. We are now designing test items, but some of the specifications are not clear enough (to us). For example, it is not clear to us in which order the resource managers store log records. - some access methods (like B-tree) require to log i

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread Mark Kirkwood
I noticed that compiling with 5_1 patch applied fails due to XLOG_archive_dir being removed from xlog.c , but src/backend/commands/tablecmds.c still uses it. I did the following to tablecmds.c : 5408c5408 < extern char XLOG_archive_dir[]; --- > extern char *XLogArchiv

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread Simon Riggs
On Wed, 2004-07-14 at 10:57, Zeugswetter Andreas SB SD wrote: > > The recovery mechanism doesn't rely upon you knowing 1 or 3. The > > recovery reads pg_control (from the backup) and then attempts to > > de-archive the appropriate xlog segment file and then starts > > rollforward > > Unfortunatel

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread Simon Riggs
On Wed, 2004-07-14 at 16:55, [EMAIL PROTECTED] wrote: > On 14 Jul, Simon Riggs wrote: > > PITR Patch v5_1 just posted has Point in Time Recovery working > > > > Still some rough edges... but we really need some testers now to give > > this a try and let me know what you think. > > > > Klaus N

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread markw
On 14 Jul, Simon Riggs wrote: > PITR Patch v5_1 just posted has Point in Time Recovery working > > Still some rough edges... but we really need some testers now to give > this a try and let me know what you think. > > Klaus Naumann and Mark Wong are the only [non-committers] to have tried > t

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > I've not done power off tests, yet. They need to be done just to > check...actually you don't need to do this to test PITR... I agree, power off is not really the point here. What we need to check into is (a) the mechanics of archiving WAL segments and (b

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread Zeugswetter Andreas SB SD
> The recovery mechanism doesn't rely upon you knowing 1 or 3. The > recovery reads pg_control (from the backup) and then attempts to > de-archive the appropriate xlog segment file and then starts > rollforward Unfortunately this only works if pg_control was the first file to be backed up (or b

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread Simon Riggs
On Wed, 2004-07-14 at 03:31, Christopher Kings-Lynne wrote: > Can you give us some suggestions of what kind of stuff to test? Is > there a way we can artificially kill the backend in all sorts of nasty > spots to see if recovery works? Does kill -9 simulate a 'power off'? > I was hoping some

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Christopher Kings-Lynne
Can you give us some suggestions of what kind of stuff to test? Is there a way we can artificially kill the backend in all sorts of nasty spots to see if recovery works? Does kill -9 simulate a 'power off'? Chris Simon Riggs wrote: PITR Patch v5_1 just posted has Point in Time Recovery working

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Simon Riggs
On Wed, 2004-07-14 at 00:01, Bruce Momjian wrote: > Tom Lane wrote: > > Simon Riggs <[EMAIL PROTECTED]> writes: > > > So the situation is: > > > - You must only stop recovery at a point in time (in the logs) after the > > > backup had completed. > > > > Right. > > > > > No way to enforce that cu

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Bruce Momjian
Simon Riggs wrote: > On Tue, 2004-07-13 at 22:19, Tom Lane wrote: > > > To have a consistent recovery at all, you must replay the log starting > > from a checkpoint before the backup began and extending to the time that > > the backup finished. You only get to decide where to stop after that > >

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Simon Riggs
PITR Patch v5_1 just posted has Point in Time Recovery working Still some rough edges... but we really need some testers now to give this a try and let me know what you think. Klaus Naumann and Mark Wong are the only [non-committers] to have tried to run the code (and let me know about it), s

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Simon Riggs
On Wed, 2004-07-14 at 00:28, Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > OK, but procedurally, how do you correlate the start/stop time of the > > tar backup with the WAL numeric file names? > > Ideally the procedure for making a backup would go something like: > > 1. Inquire

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes: > OK, but procedurally, how do you correlate the start/stop time of the > tar backup with the WAL numeric file names? Ideally the procedure for making a backup would go something like: 1. Inquire of the server its current time and the WAL position of the

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Bruce Momjian
Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > So the situation is: > > - You must only stop recovery at a point in time (in the logs) after the > > backup had completed. > > Right. > > > No way to enforce that currently, apart from procedurally. Not exactly > > frequent, so I thi

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Simon Riggs
On Tue, 2004-07-13 at 23:42, Bruce Momjian wrote: > Simon Riggs wrote: > > On Tue, 2004-07-13 at 22:19, Tom Lane wrote: > > > > > To have a consistent recovery at all, you must replay the log starting > > > from a checkpoint before the backup began and extending to the time that > > > the backup f

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > So the situation is: > - You must only stop recovery at a point in time (in the logs) after the > backup had completed. Right. > No way to enforce that currently, apart from procedurally. Not exactly > frequent, so I think I just document that and move o

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Simon Riggs
On Tue, 2004-07-13 at 22:19, Tom Lane wrote: > To have a consistent recovery at all, you must replay the log starting > from a checkpoint before the backup began and extending to the time that > the backup finished. You only get to decide where to stop after that > point. > So the situation is:

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > I'm getting carried away with the improbable... but this is the rather > strange, but possible scenario I foresee: > A sequence of times... > 1. We start archiving xlogs > 2. We take a checkpoint > 3. we commit an important transaction > 4. We take a backu

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Simon Riggs
On Tue, 2004-07-13 at 15:29, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > Please tell me that we can ignore the state of the clog, > > We can. > In general, you are of course correct. > The reason that keeping track of timelines is interesting for xlog is > simply to take pity

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > Please tell me that we can ignore the state of the clog, We can. The reason that keeping track of timelines is interesting for xlog is simply to take pity on the poor DBA who needs to distinguish the various archived xlog files he's got laying about, and

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Simon Riggs
On Tue, 2004-07-13 at 13:18, Zeugswetter Andreas SB SD wrote: > > The starting a new timeline thought works for xlogs, but not for clogs. > > No matter how far you go into the future, there is a small (yet > > vanishing) possibility that there is a yet undiscovered committed > > transaction in the

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Zeugswetter Andreas SB SD
> The starting a new timeline thought works for xlogs, but not for clogs. > No matter how far you go into the future, there is a small (yet > vanishing) possibility that there is a yet undiscovered committed > transaction in the future. (Because transactions are ordered in the clog > because xids

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Simon Riggs
On Tue, 2004-07-06 at 22:40, Simon Riggs wrote: > On Mon, 2004-07-05 at 22:46, Tom Lane wrote: > > Simon Riggs <[EMAIL PROTECTED]> writes: > > > > - when we stop, keep reading records until EOF, just don't apply them. > > > When we write a checkpoint at end of recovery, the unapplied > > > transac

Re: [HACKERS] Point in Time Recovery

2004-07-10 Thread Simon Riggs
On Sat, 2004-07-10 at 15:17, Jan Wieck wrote: > On 7/6/2004 3:58 PM, Simon Riggs wrote: > > > On Tue, 2004-07-06 at 08:38, Zeugswetter Andreas SB SD wrote: > >> > - by time - but the time stamp on each xlog record only specifies to the > >> > second, which could easily be 10 or more commits (we h

Re: [HACKERS] Point in Time Recovery

2004-07-10 Thread Jan Wieck
On 7/6/2004 3:58 PM, Simon Riggs wrote: On Tue, 2004-07-06 at 08:38, Zeugswetter Andreas SB SD wrote: > - by time - but the time stamp on each xlog record only specifies to the > second, which could easily be 10 or more commits (we hope) > > Should we use a different datatype than time_t for

Re: [HACKERS] Point in Time Recovery

2004-07-09 Thread spock
On Thu, 8 Jul 2004, Simon Riggs wrote: > We don't need to mention timelines in the docs, nor do we need to alter > pg_controldata to display it...just a comment in the code to explain why > we add a large number to the LogId after each recovery completes. I'd disagree on that. Knowing what exactl

Re: [HACKERS] Point in Time Recovery

2004-07-09 Thread spock
On Tue, 6 Jul 2004, Zeugswetter Andreas SB SD wrote: > > Should we use a different datatype than time_t for the commit timestamp, > > one that offers more fine grained differentiation between checkpoints? > > Imho seconds is really sufficient. If you know a more precise position > you will probabl

Re: [HACKERS] Point in Time Recovery

2004-07-08 Thread Simon Riggs
On Thu, 2004-07-08 at 07:57, [EMAIL PROTECTED] wrote: > On Thu, 8 Jul 2004, Simon Riggs wrote: > > > We don't need to mention timelines in the docs, nor do we need to alter > > pg_controldata to display it...just a comment in the code to explain why > > we add a large number to the LogId after eac

Re: [HACKERS] Point in Time Recovery

2004-07-07 Thread Simon Riggs
On Wed, 2004-07-07 at 14:17, Zeugswetter Andreas SB SD wrote: > > Well, Tom does seem to have something with regard to StartUpIds. I feel > > it is easier to force a new timeline by adding a very large number to > > the LogId IF, and only if, we have performed an archive recovery. That > > way, we

Re: [HACKERS] Point in Time Recovery

2004-07-07 Thread Zeugswetter Andreas SB SD
> Well, Tom does seem to have something with regard to StartUpIds. I feel > it is easier to force a new timeline by adding a very large number to > the LogId IF, and only if, we have performed an archive recovery. That > way, we do not change at all the behaviour of the system for people that > ch

Re: [HACKERS] Point in Time Recovery

2004-07-06 Thread Simon Riggs
On Tue, 2004-07-06 at 20:00, Richard Huxton wrote: > Simon Riggs wrote: > > On Mon, 2004-07-05 at 22:46, Tom Lane wrote: > > > >>Simon Riggs <[EMAIL PROTECTED]> writes: > >> > >>>Should we use a different datatype than time_t for the commit timestamp, > >>>one that offers more fine grained differe

Re: [HACKERS] Point in Time Recovery

2004-07-06 Thread Simon Riggs
On Mon, 2004-07-05 at 22:46, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > - when we stop, keep reading records until EOF, just don't apply them. > > When we write a checkpoint at end of recovery, the unapplied > > transactions are buried alive, never to return. > > - stop where we

Re: [HACKERS] Point in Time Recovery

2004-07-06 Thread Simon Riggs
On Tue, 2004-07-06 at 08:38, Zeugswetter Andreas SB SD wrote: > > - by time - but the time stamp on each xlog record only specifies to the > > second, which could easily be 10 or more commits (we hope) > > > > Should we use a different datatype than time_t for the commit timestamp, > > one th

Re: [HACKERS] Point in Time Recovery

2004-07-06 Thread Richard Huxton
Simon Riggs wrote: On Mon, 2004-07-05 at 22:46, Tom Lane wrote: Simon Riggs <[EMAIL PROTECTED]> writes: Should we use a different datatype than time_t for the commit timestamp, one that offers more fine grained differentiation between checkpoints? Pretty much everybody supports gettimeofday() (time

Re: [HACKERS] Point in Time Recovery

2004-07-06 Thread Zeugswetter Andreas SB SD
> - by time - but the time stamp on each xlog record only specifies to the > second, which could easily be 10 or more commits (we hope) > > Should we use a different datatype than time_t for the commit timestamp, > one that offers more fine grained differentiation between checkpoints? Imho
