Re: [PERFORM] Recovery will take 10 hours

2006-04-24 Thread Simon Riggs
On Sun, 2006-04-23 at 22:46 -0600, Brendan Duddridge wrote: > So how do you overlap the restore process with the retrieving of files? The restore command can be *anything*. You just write a script... > Our restore command is: > > restore_command = 'gunzip %p' > > If I change it to: > > restor

Re: [PERFORM] Recovery will take 10 hours

2006-04-24 Thread Markus Schaber
Hi, Brendan, Brendan Duddridge wrote: > So how do you overlap the restore process with the retrieving of files? You need a shell script as restore command that does both uncompressing the current file, and starting a background decompress of the next file(s). It also has to check whether the cur
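The approach Markus describes could be sketched roughly as below. This is only a minimal illustration, not the script anyone in the thread actually used. The function name `restore_wal`, its argument order, and the separate cache directory are assumptions made for the example. It also assumes that archived files are named `<WAL name>.gz` and that the next file to prefetch is simply the next name in sorted order within the archive directory.

```shell
#!/bin/sh
# Hypothetical helper: restore_wal ARCHIVE_DIR CACHE_DIR WAL_NAME DEST
# Restores one gzip-compressed WAL file to DEST, then starts decompressing
# the next archived file in the background so that it is ready in CACHE_DIR
# by the time it is requested.
restore_wal() {
    archive_dir=$1
    cache_dir=$2
    wal_name=$3
    dest=$4

    mkdir -p "$cache_dir" || return 1

    if [ -f "$cache_dir/$wal_name" ]; then
        # An earlier call already finished decompressing this file.
        mv "$cache_dir/$wal_name" "$dest" || return 1
    elif [ -f "$archive_dir/$wal_name.gz" ]; then
        # Not prefetched (or prefetch still running): decompress it now.
        gunzip -c "$archive_dir/$wal_name.gz" > "$dest" || {
            rm -f "$dest"
            return 1
        }
    else
        # A nonzero status reports that the file is not available.
        return 1
    fi

    # Find the archive entry that sorts immediately after the current one.
    next=$(ls "$archive_dir" | grep '\.gz$' | LC_ALL=C sort |
        awk -v cur="$wal_name.gz" 'found { print; exit } $0 == cur { found = 1 }')

    if [ -n "$next" ]; then
        next_name=${next%.gz}
        if [ ! -f "$cache_dir/$next_name" ]; then
            # Write to a temporary name and rename only on success, so a
            # half-written file is never mistaken for a complete one.
            (
                gunzip -c "$archive_dir/$next" > "$cache_dir/$next_name.tmp" &&
                    mv "$cache_dir/$next_name.tmp" "$cache_dir/$next_name"
            ) </dev/null >/dev/null 2>&1 &
        fi
    fi
    return 0
}
```

A wrapper script that calls `restore_wal` with `"$1"` and `"$2"` could then be used as the restore command, with PostgreSQL's `%f` (file name) and `%p` (destination path) passed as its two arguments. If the background job is still running when the next request arrives, this sketch decompresses the file again directly rather than waiting, which costs duplicate work but keeps the logic simple; it also never cleans up a cached copy left behind by that race.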

Re: [PERFORM] Recovery will take 10 hours

2006-04-23 Thread Brendan Duddridge
Hi Simon, The backup with 3120 WAL files was a 2 day old base backup. We've moved to a 1 day base backup now, but that'll still be 1600 WALs or so a day. That will probably take 5 hours to restore I suspect. Unless we move to 2 or more base backups per day. That seems crazy though. So how do you

Re: [PERFORM] Recovery will take 10 hours

2006-04-23 Thread Simon Riggs
On Thu, 2006-04-20 at 13:29 -0600, Brendan Duddridge wrote: > We had a database issue today that caused us to have to restore to > our most recent backup. We are using PITR so we have 3120 WAL files > that need to be applied to the database. How often are you taking base backups? > After 45

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Gavin Sherry
On Thu, 20 Apr 2006, Brendan Duddridge wrote: > Hi Tomas, > > Hmm... ktrace -p PID -c returns immediately without doing anything > unless I've previously done a ktrace -p PID. > > According to the man page for ktrace's -c flag: > -c Clear the trace points associated with the specified file

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Hi Tomas, Hmm... ktrace -p PID -c returns immediately without doing anything unless I've previously done a ktrace -p PID. According to the man page for ktrace's -c flag: -c Clear the trace points associated with the specified file or processes. When I run ktrace on OS X Server 10.4

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Hi Tom, Well, we started the restore back up with the WAL archives copied to our local disk. It's going at about the same pace as with the restore over NFS. So I tried ktrace -p PID and it created a really big file. I had to do 'ktrace -p PID -c' to get it to stop. The ktrace.out file is

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Thanks Tom, We are storing only the WAL archives on the NFS volume. It must have been a hiccup in the NFS mount. Jeff Frost asked if we were using hard or soft mounts. We were using soft mounts, so that may be where the problem lies with the PANIC. Is it better to use the boot volume of t

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Tom Lane
Brendan Duddridge <[EMAIL PROTECTED]> writes: > Oops... forgot to mention that both files that postgres said were > missing are in fact there: Please place the blame where it should fall: it's your archive restore command that's telling postgres that. > There didn't seem to be any issues with t

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Well our restore command is pretty basic: restore_command = 'gunzip %p' I'm not sure why that would succeed then fail. Brendan Duddridge | CTO | 403-277-5591 x24 | [EMAIL PROTECTED] ClickSpace Interactive Inc. Suite L100, 23

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Jeff Frost
Brendan, Is your NFS share mounted hard or soft? Do you have space to copy the files locally? I suspect you're seeing NFS slowness in your restore since you aren't using much in the way of disk IO or CPU. -Jeff On Thu, 20 Apr 2006, Brendan Duddridge wrote: Oops... forgot to mention that

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Tom Lane
Brendan Duddridge <[EMAIL PROTECTED]> writes: > However, as I just finished posting to the list, the process died > with a PANIC error: > [2006-04-20 16:41:28 MDT] LOG: restored log file > "0001018F0034" from archive > [2006-04-20 16:41:35 MDT] LOG: restored log file > "00010

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Oops... forgot to mention that both files that postgres said were missing are in fact there: A partial listing from our wal_archive directory: -rw--- 1 postgres staff 4971129 Apr 19 20:08 0001018F0036.gz -rw--- 1 postgres staff 4378284 Apr 19 20:09 0001018F0

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Hi Tom, I found it... it's called ktrace on OS X Server. However, as I just finished posting to the list, the process died with a PANIC error: [2006-04-20 16:41:28 MDT] LOG: restored log file "0001018F0034" from archive [2006-04-20 16:41:35 MDT] LOG: restored log file "00

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Hi Jeff, The WAL files are stored on a separate server and accessed through an NFS mount located at /wal_archive. However, the restore failed about 5 hours in after we got this error: [2006-04-20 16:41:28 MDT] LOG: restored log file "0001018F0034" from archive [2006-04-20 16:41

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Jeff Frost
On Thu, 20 Apr 2006, Brendan Duddridge wrote: Hi, We had a database issue today that caused us to have to restore to our most recent backup. We are using PITR so we have 3120 WAL files that need to be applied to the database. After 45 minutes, it has restored only 230 WAL files. At this rat

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Luke Lonergan
Brendan, strace -p -c Then do a "CTRL-C" after a minute to get the stats of system calls. - Luke On 4/20/06 2:13 PM, "Brendan Duddridge" <[EMAIL PROTECTED]> wrote: Hi Tom, Do you mean do a kill -QUIT on the postgres pro

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Tom Lane
Brendan Duddridge <[EMAIL PROTECTED]> writes: > Do you mean do a kill -QUIT on the postgres process in order to > generate a stack trace? Not at all! I'm talking about tracing the kernel calls it's making. Depending on your platform, the tool for this is called strace, ktrace, truss, or maybe e

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Hi Tom, Do you mean do a kill -QUIT on the postgres process in order to generate a stack trace? Will that affect the currently running process in any bad way? And where would the output go? stdout? Thanks, Brendan Duddr

Re: [PERFORM] Recovery will take 10 hours

2006-04-20 Thread Tom Lane
Brendan Duddridge <[EMAIL PROTECTED]> writes: > We had a database issue today that caused us to have to restore to > our most recent backup. We are using PITR so we have 3120 WAL files > that need to be applied to the database. > After 45 minutes, it has restored only 230 WAL files. At this rat

[PERFORM] Recovery will take 10 hours

2006-04-20 Thread Brendan Duddridge
Hi, We had a database issue today that caused us to have to restore to our most recent backup. We are using PITR so we have 3120 WAL files that need to be applied to the database. After 45 minutes, it has restored only 230 WAL files. At this rate, it's going to take about 10 hours to rest
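The 10-hour figure follows from a simple linear extrapolation of the rate quoted above (230 files in 45 minutes). A small sketch of that arithmetic, assuming the rate stays constant for the whole restore; the function name `estimate_minutes` is made up for this example:

```shell
#!/bin/sh
# estimate_minutes TOTAL_FILES FILES_DONE MINUTES_ELAPSED
# Prints the estimated total restore time in whole minutes, assuming the
# observed rate (FILES_DONE per MINUTES_ELAPSED) holds throughout.
estimate_minutes() {
    echo $(( $1 * $3 / $2 ))
}

estimate_minutes 3120 230 45   # prints 610, a little over 10 hours
estimate_minutes 1600 230 45   # prints 313, a little over 5 hours
```

The second call uses the roughly 1600 WAL files per day mentioned later in the thread and lands on the same ballpark as the 5-hour guess given there.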