Simon Riggs wrote:
On Thu, 2009-02-26 at 20:38 +0200, Heikki Linnakangas wrote:
I think we should simply remove the signal handler for SIGQUIT from
pg_standby.
If you do this, please make it release dependent so pg_standby behaves
correctly for the release it is being used with.
Hmm, I don'
On Thu, 2009-02-26 at 20:38 +0200, Heikki Linnakangas wrote:
> I think we should simply remove the signal handler for SIGQUIT from
> pg_standby.
If you do this, please make it release dependent so pg_standby behaves
correctly for the release it is being used with.
--
Simon Riggs ww
Hi,
On Fri, Feb 27, 2009 at 3:38 AM, Heikki Linnakangas
wrote:
> I think the real problem here is that pg_standby traps SIGQUIT. The startup
> process doesn't receive the SIGQUIT because it's in system(), and pg_standby
> doesn't propagate it to the startup process either because it traps it.
Ye
Fujii Masao wrote:
On Fri, Jan 30, 2009 at 7:47 PM, Simon Riggs wrote:
That whole area was something I was leaving until last, since immediate
shutdown doesn't work either, even in HEAD. (Fujii-san and I discussed
this before Christmas, briefly).
This problem remains in current HEAD. I mean,
Hi,
On Fri, Jan 30, 2009 at 7:47 PM, Simon Riggs wrote:
> That whole area was something I was leaving until last, since immediate
> shutdown doesn't work either, even in HEAD. (Fujii-san and I discussed
> this before Christmas, briefly).
This problem remains in current HEAD. I mean, immediate sh
On Wed, 2009-02-18 at 18:01 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Wed, 2009-02-18 at 14:26 +0200, Heikki Linnakangas wrote:
> >
> >> The outer "if" should ensure that it isn't printed repeatedly on an idle
> >> system.
> >
> > Regrettably not.
>
> Ok, committed.
Cool.
Simon Riggs wrote:
On Wed, 2009-02-18 at 14:26 +0200, Heikki Linnakangas wrote:
The outer "if" should ensure that it isn't printed repeatedly on an idle
system.
Regrettably not.
Ok, committed. I fixed that and some comment changes. I also renamed
IsRecoveryProcessingMode() to RecoveryInPr
On Wed, 2009-02-18 at 14:26 +0200, Heikki Linnakangas wrote:
> The outer "if" should ensure that it isn't printed repeatedly on an idle
> system.
Regrettably not.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Simon Riggs wrote:
On Mon, 2009-02-09 at 17:13 +0200, Heikki Linnakangas wrote:
Attached is an updated patch that does that, and I've fixed all the
other outstanding issues I listed earlier as well. Now I'm feeling
again that this is in pretty good shape.
UpdateMinRecoveryPoint() issues a DEB
On Mon, 2009-02-09 at 17:13 +0200, Heikki Linnakangas wrote:
> Attached is an updated patch that does that, and I've fixed all the
> other outstanding issues I listed earlier as well. Now I'm feeling
> again that this is in pretty good shape.
UpdateMinRecoveryPoint() issues a DEBUG2 message eve
On Fri, 2009-02-06 at 10:06 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote:
> >> - If you perform a fast shutdown while startup process is waiting for
> >> the restore command, startup process sometimes throws a FATAL error
>
Simon Riggs wrote:
On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote:
- If you perform a fast shutdown while startup process is waiting for
the restore command, startup process sometimes throws a FATAL error
which escalates into an immediate shutdown. That leads to
different me
On Thu, 2009-02-05 at 21:54 +0200, Heikki Linnakangas wrote:
> - If bgwriter is performing a restartpoint when recovery ends, the
> startup checkpoint will be queued up behind the restartpoint. And since
> it uses the same smoothing logic as checkpoints, it can take quite some
> time for that
On Thu, 2009-02-05 at 14:18 +0200, Heikki Linnakangas wrote:
> when the control file is updated in XLogFlush, it's
> typically the bgwriter doing it as it cleans buffers ahead of the
> clock hand, not the startup process
That is the key point. Let's do it your way.
--
Simon Riggs w
Simon Riggs wrote:
On Thu, 2009-02-05 at 13:18 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
So we might end up flushing more often *and* we will be doing it
potentially in the code path of other users.
For
On Thu, 2009-02-05 at 13:18 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >
> >>> So we might end up flushing more often *and* we will be doing it
> >>> potentially in the code path of other users.
Simon Riggs wrote:
On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
So we might end up flushing more often *and* we will be doing it
potentially in the code path of other users.
For example, imagine a database that fits completely in shared buffers.
If we updat
On Thu, 2009-02-05 at 11:46 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > So we might end up flushing more often *and* we will be doing it
> > potentially in the code path of other users.
>
> For example, imagine a database that fits completely in shared buffers.
> If we update at e
Simon Riggs wrote:
On Thu, 2009-02-05 at 10:31 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
I got rid of minSafeStartPoint, advancing minRecoveryPoint instead. And
it's advanced in XLogFlush instead of XLogFileRead. I'll post
On Thu, 2009-02-05 at 09:26 +, Simon Riggs wrote:
> This change seems speculative and also against what has previously been
> agreed with Tom. If he chooses not to comment on your changes, that's up
> to him, but I don't think you should remove things quietly that have
> been put there throug
On Thu, 2009-02-05 at 10:31 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
> >
> I've changed the way minRecoveryPoint is updated now anyway, so it no
> longer happens every XLogFileRead().
> >>> Care to elucidate?
On Thu, 2009-02-05 at 10:07 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> I would suggest that at end of recovery we write the last LSN to the
> >>> control file, so if we crash recover then we w
Simon Riggs wrote:
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
I've changed the way minRecoveryPoint is updated now anyway, so it no
longer happens every XLogFileRead().
Care to elucidate?
I got rid of minSafeStartPoint, advancing minRecoveryPoint instead. And
it's advanced i
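
The idea of advancing minRecoveryPoint when pages are flushed, rather than on every WAL file read, can be sketched roughly as below. This is an illustration under stated assumptions, not the patch itself: the `Lsn` type, `ControlFileSketch`, and `advance_min_recovery_point()` are hypothetical names, and a plain 64-bit integer stands in for the real WAL pointer type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a WAL position (assumption for this sketch). */
typedef uint64_t Lsn;

/* Hypothetical stand-in for the on-disk control file. */
typedef struct ControlFileSketch
{
    Lsn min_recovery_point;   /* recovery must replay at least this far */
    int writes;               /* number of simulated control file writes */
} ControlFileSketch;

/*
 * Called when a data page with the given LSN is about to be flushed
 * during recovery. The minimum recovery point only ever moves forward,
 * and the control file is touched only when it actually changes.
 * Returns true if the control file was (notionally) rewritten.
 */
bool
advance_min_recovery_point(ControlFileSketch *cf, Lsn page_lsn)
{
    if (page_lsn <= cf->min_recovery_point)
        return false;           /* nothing to do: no write, no log message */

    cf->min_recovery_point = page_lsn;
    cf->writes++;               /* a real implementation would write to disk here */
    return true;
}
```

The guard on the first line of the function is what keeps an idle system from rewriting the control file, or logging about it, over and over.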
Simon Riggs wrote:
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
I would suggest that at end of recovery we write the last LSN to the
control file, so if we crash recover then we will always end archive
recovery at the same place again should we re-enter it. So
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> >> We could avoid that by performing a good old startup checkpoint, but I
> >> quite like the fast failover time we get without it.
> >
> > ISTM it's either slow failover or (fast failover, but restart archive
>
On Thu, 2009-02-05 at 09:28 +0200, Heikki Linnakangas wrote:
> >> I've changed the way minRecoveryPoint is updated now anyway, so it no
> >> longer happens every XLogFileRead().
> >
> > Care to elucidate?
>
> I got rid of minSafeStartPoint, advancing minRecoveryPoint instead. And
> it's advan
Simon Riggs wrote:
We could avoid that by performing a good old startup checkpoint, but I
quite like the fast failover time we get without it.
ISTM it's either slow failover or (fast failover, but restart archive
recovery if crashes).
I would suggest that at end of recovery we write the last L
Tom Lane wrote:
Fujii Masao writes:
On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas
wrote:
... I'm not sure why in CVS HEAD we don't reset
FatalError until after the startup process is finished.
Which may repeat the recovery crash and reinitializing forever. To prevent
this problem,
Fujii Masao writes:
> On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas
> wrote:
>> ... I'm not sure why in CVS HEAD we don't reset
>> FatalError until after the startup process is finished.
> Which may repeat the recovery crash and reinitializing forever. To prevent
> this problem, unexpect
Hi,
On Wed, Feb 4, 2009 at 8:35 PM, Heikki Linnakangas
wrote:
> Yes, and in fact I ran into it myself yesterday while testing. It seems that
> we should reset FatalError earlier, ie. when the recovery starts and
> bgwriter is launched. I'm not sure why in CVS HEAD we don't reset
> FatalError u
On Wed, 2009-02-04 at 19:03 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > * I think we are now renaming the recovery.conf file too early. The
> > comment says "We have already restored all the WAL segments we need from
> > the archive, and we trust that they are not going to go away ev
Simon Riggs wrote:
* I think we are now renaming the recovery.conf file too early. The
comment says "We have already restored all the WAL segments we need from
the archive, and we trust that they are not going to go away even if we
crash." We have, but the files overwrite each other as they arriv
Fujii Masao wrote:
On Fri, Jan 30, 2009 at 11:55 PM, Heikki Linnakangas
wrote:
The startup process now catches SIGTERM, and calls proc_exit() at the next
WAL record. That's what will happen in a fast shutdown. Unexpected death of
the startup process is treated the same as a backend/auxiliary pr
Hi,
On Fri, Jan 30, 2009 at 11:55 PM, Heikki Linnakangas
wrote:
> The startup process now catches SIGTERM, and calls proc_exit() at the next
> WAL record. That's what will happen in a fast shutdown. Unexpected death of
> the startup process is treated the same as a backend/auxiliary process
> cra
On Sat, 2009-01-31 at 22:41 +0200, Heikki Linnakangas wrote:
> > I like this way because it means we might in the future get Startup
> > process to perform post-recovery actions also.
>
> Yeah, it does. Do you have something in mind already?
Yes, but nothing that needs to be discussed yet.
--
On Sat, 2009-01-31 at 22:32 +0200, Heikki Linnakangas wrote:
> If you poison your WAL archive with an XLOG_CRASH_RECOVERY record,
> recovery will never be able to proceed over that point. There would have
> to be a switch to ignore those records, at the very least.
Definitely in assert mode onl
Simon Riggs wrote:
On Fri, 2009-01-30 at 16:55 +0200, Heikki Linnakangas wrote:
Ok, here's an attempt to make shutdown work gracefully.
Startup process now signals postmaster three times during startup: first
when it has done all the initialization, and starts redo. At that point,
postmaster
Simon Riggs wrote:
On Fri, 2009-01-30 at 13:15 +0200, Heikki Linnakangas wrote:
Simon Riggs wrote:
I'm thinking of adding a new function that will make crash testing easier.
pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY,
which when replayed will just throw a FATAL error and
On Fri, 2009-01-30 at 16:55 +0200, Heikki Linnakangas wrote:
> Ok, here's an attempt to make shutdown work gracefully.
>
> Startup process now signals postmaster three times during startup: first
> when it has done all the initialization, and starts redo. At that point,
> postmaster launches bg
On Fri, 2009-01-30 at 13:25 +0200, Heikki Linnakangas wrote:
> > That whole area was something I was leaving until last, since immediate
> > shutdown doesn't work either, even in HEAD. (Fujii-san and I discussed
> > this before Christmas, briefly).
>
> We must handle shutdown gracefully, can'
On Fri, 2009-01-30 at 13:15 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > I'm thinking of adding a new function that will make crash testing easier.
> >
> > pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY,
> > which when replayed will just throw a FATAL error and cra
Simon Riggs wrote:
On Thu, 2009-01-29 at 19:20 +0200, Heikki Linnakangas wrote:
Hmm, seems like we haven't thought through how shutdown during
consistent recovery is supposed to behave in general. Right now, smart
shutdown doesn't do anything during consistent recovery, because the
startup pro
Simon Riggs wrote:
I'm thinking of adding a new function that will make crash testing easier.
pg_crash_standby() will issue a new xlog record, XLOG_CRASH_STANDBY,
which when replayed will just throw a FATAL error and crash Startup
process. We won't be adding that to the user docs...
This will all
On Thu, 2009-01-29 at 14:21 +0200, Heikki Linnakangas wrote:
> It looks like if you issue a fast shutdown during recovery, postmaster
> doesn't kill bgwriter.
Thanks for the report.
I'm thinking of adding a new function that will make crash testing easier.
pg_crash_standby() will issue a new xlo
On Fri, 2009-01-30 at 11:33 +0200, Heikki Linnakangas wrote:
> I just realized that the new minSafeStartPoint is actually exactly the
> same concept as the existing minRecoveryPoint. As the recovery
> progresses, we could advance minRecoveryPoint just as well as the new
> minSafeStartPoint.
>
On Thu, 2009-01-29 at 19:20 +0200, Heikki Linnakangas wrote:
> Heikki Linnakangas wrote:
> > It looks like if you issue a fast shutdown during recovery, postmaster
> > doesn't kill bgwriter.
>
> Hmm, seems like we haven't thought through how shutdown during
> consistent recovery is supposed to
On Thu, 2009-01-29 at 20:35 +0200, Heikki Linnakangas wrote:
> Hmm, another point of consideration is how this interacts with the
> pause/continue. In particular, it was suggested earlier that you could
> put an option into recovery.conf to start in paused mode. If you pause
> recovery, and
I just realized that the new minSafeStartPoint is actually exactly the
same concept as the existing minRecoveryPoint. As the recovery
progresses, we could advance minRecoveryPoint just as well as the new
minSafeStartPoint.
Perhaps it's a good idea to keep them separate anyway though, the
orig
Heikki Linnakangas wrote:
Simon Riggs wrote:
On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote:
Now when we restart the recovery, we will never reach
minSafeStartPoint, which is now 0/400, and we'll fail with the
error that Fujii-san pointed out. We're already way past the min
re
Heikki Linnakangas wrote:
It looks like if you issue a fast shutdown during recovery, postmaster
doesn't kill bgwriter.
Hmm, seems like we haven't thought through how shutdown during
consistent recovery is supposed to behave in general. Right now, smart
shutdown doesn't do anything during con
Simon Riggs wrote:
On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote:
Now when we restart the recovery, we will never reach
minSafeStartPoint, which is now 0/400, and we'll fail with the
error that Fujii-san pointed out. We're already way past the min
recovery point of base backup
On Thu, 2009-01-29 at 15:31 +0200, Heikki Linnakangas wrote:
> Now when we restart the recovery, we will never reach
> minSafeStartPoint, which is now 0/400, and we'll fail with the
> error that Fujii-san pointed out. We're already way past the min
> recovery point of base backup by then.
Th
Simon Riggs wrote:
On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote:
It comes from the fact that we set minSafeStartPoint beyond the actual end
of WAL, if the last WAL segment is only partially filled (= fails CRC
check at some point). If we crash after setting minSafeStartPoint lik
On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > My proposed fix for Fujii-san's minSafeStartPoint bug is to introduce
> > another control file state DB_IN_ARCHIVE_RECOVERY_BASE. This would show
> > that we are still recovering up to the point of the end of the
It looks like if you issue a fast shutdown during recovery, postmaster
doesn't kill bgwriter.
...
LOG: restored log file "00010028" from archive
LOG: restored log file "00010029" from archive
LOG: consistent recovery state reached at 0/295C
...
LOG: restor
Simon Riggs wrote:
My proposed fix for Fujii-san's minSafeStartPoint bug is to introduce
another control file state DB_IN_ARCHIVE_RECOVERY_BASE. This would show
that we are still recovering up to the point of the end of the base
backup. Once we reach minSafeStartPoint we then switch state to
DB_I
On Thu, 2009-01-29 at 11:20 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote:
> >> Hi,
> >>
> >> On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao
> >> wrote:
> I feel quite good about this patch now. Given the amount of code churn,
Simon Riggs wrote:
On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote:
Hi,
On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao wrote:
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it through one more time after sleeping over
it. Simon,
On Thu, 2009-01-29 at 09:34 +0200, Heikki Linnakangas wrote:
> It does *during recovery*, before InitXLogAccess is called. Yeah, it's
> harmless currently. It would be pretty hard to keep it up-to-date in
> bgwriter and other processes. I think it's better to keep it at 0,
> which is clearly an
Simon Riggs wrote:
On Thu, 2009-01-29 at 12:18 +0900, Fujii Masao wrote:
Though this is a matter of taste, I think that it's weird that bgwriter
runs with ThisTimeLineID = 0 during recovery. This is because
XLogCtl->ThisTimeLineID is set at the end of recovery. ISTM this will
be a cause of bug i
On Thu, 2009-01-29 at 10:36 +0900, Fujii Masao wrote:
> Hi,
>
> On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao wrote:
> >> I feel quite good about this patch now. Given the amount of code churn, it
> >> requires testing, and I'll read it through one more time after sleeping over
> >> it. Si
On Thu, 2009-01-29 at 12:18 +0900, Fujii Masao wrote:
> Hi,
>
> On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao wrote:
> >> I feel quite good about this patch now. Given the amount of code churn, it
> >> requires testing, and I'll read it through one more time after sleeping over
> >> it. Si
Hi,
On Wed, Jan 28, 2009 at 11:19 PM, Fujii Masao wrote:
>> I feel quite good about this patch now. Given the amount of code churn, it
>> requires testing, and I'll read it through one more time after sleeping over
>> it. Simon, do you see anything wrong with this?
>
> I also read this patch and
Fujii Masao wrote:
On Wed, Jan 28, 2009 at 7:04 PM, Heikki Linnakangas
wrote:
I feel quite good about this patch now. Given the amount of code churn, it
requires testing, and I'll read it through one more time after sleeping over
it. Simon, do you see anything wrong with this?
I also read thi
On Wed, 2009-01-28 at 23:54 +0900, Fujii Masao wrote:
> >> Why is InitXLOGAccess() called also here when bgwriter is started after
> >> recovery? That is already called by AuxiliaryProcessMain().
> >
> > InitXLOGAccess() sets the timeline and also gets the latest record
> > pointer. If the bgwrite
Hi,
On Wed, Jan 28, 2009 at 11:47 PM, Simon Riggs wrote:
>
> On Wed, 2009-01-28 at 23:19 +0900, Fujii Masao wrote:
>
>> > @@ -355,6 +359,27 @@ BackgroundWriterMain(void)
>> > */
>> > PG_SETMASK(&UnBlockSig);
>> >
>> > + BgWriterRecoveryMode = IsRecoveryProcessingMode();
>> > +
>> > +
On Wed, 2009-01-28 at 23:19 +0900, Fujii Masao wrote:
> > @@ -355,6 +359,27 @@ BackgroundWriterMain(void)
> > */
> > PG_SETMASK(&UnBlockSig);
> >
> > + BgWriterRecoveryMode = IsRecoveryProcessingMode();
> > +
> > + if (BgWriterRecoveryMode)
> > + elog(DEBUG1, "bgwriter star
Hi,
On Wed, Jan 28, 2009 at 7:04 PM, Heikki Linnakangas
wrote:
> I've been reviewing and massaging the so-called recovery infra patch.
Great!
> I feel quite good about this patch now. Given the amount of code churn, it
> requires testing, and I'll read it through one more time after sleeping ov
On Wed, 2009-01-28 at 12:04 +0200, Heikki Linnakangas wrote:
> I've been reviewing and massaging the so-called recovery infra patch.
Thanks.
> I feel quite good about this patch now. Given the amount of code
> churn, it requires testing, and I'll read it through one more time
> after sleeping ov
I've been reviewing and massaging the so-called recovery infra patch.
To recap, the goal is to:
- start background writer during (archive) recovery
- skip the shutdown checkpoint at the end of recovery. Instead, the
database is brought up immediately, and the bgwriter performs a normal
online c