Andres Freund writes:
> What I dislike with what you committed is that the state you're
> investigating during the pause isn't the one you're going to end up
> recoveryApply == true. That seems dangerous to me, even if it's going to
> be reworked in HEAD.
Agreed, but it's been like that since the p
On Wed, Dec 5, 2012 at 11:17 AM, Tom Lane wrote:
> Jeff Janes writes:
>> Right now if I'm doing a PITR and want to look around before blessing
>> the restore, I have to:
>> [ do painful stuff ]
>
> Yeah. The worst thing about this is the cost of stepping too far
> forward, but I doubt we can do
On 2012-12-05 18:35:47 -0500, Tom Lane wrote:
> Andres Freund writes:
> > On 2012-12-05 16:15:38 -0500, Tom Lane wrote:
> >> That's fine, but the immediate question is what are we doing to fix
> >> the back branches. I think everyone is clear that we should be testing
> >> LocalHotStandbyActive r
On 5 December 2012 22:23, Tom Lane wrote:
> Robert Haas writes:
>> On Wed, Dec 5, 2012 at 4:15 PM, Tom Lane wrote:
>>> The argument for this is that although we might fetch a slightly stale
>>> value of the shared variable, it can't be very stale --- certainly no
>>> older than the spinlock acqu
Andres Freund writes:
> On 2012-12-05 16:15:38 -0500, Tom Lane wrote:
>> That's fine, but the immediate question is what are we doing to fix
>> the back branches. I think everyone is clear that we should be testing
>> LocalHotStandbyActive rather than precursor conditions to see if a pause
>> is
Robert Haas writes:
> On Wed, Dec 5, 2012 at 4:15 PM, Tom Lane wrote:
>> The argument for this is that although we might fetch a slightly stale
>> value of the shared variable, it can't be very stale --- certainly no
>> older than the spinlock acquisition near the bottom of the previous
>> iterat
On Wed, Dec 5, 2012 at 4:15 PM, Tom Lane wrote:
> The argument for this is that although we might fetch a slightly stale
> value of the shared variable, it can't be very stale --- certainly no
> older than the spinlock acquisition near the bottom of the previous
> iteration of the loop. And this
On 2012-12-05 16:15:38 -0500, Tom Lane wrote:
> Simon Riggs writes:
> > On 5 December 2012 18:48, Tom Lane wrote:
> >> On further thought, it seems like recovery_pause_at_target is rather
> >> misdesigned anyway, and taking recovery target parameters from
> >> recovery.conf is an obsolete API tha
Simon Riggs writes:
> Yep, that's fine.
> Are you doing this or do you want me to? Don't mind either way.
I've got a patch for most of it already, so happy to do it.
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to yo
On 5 December 2012 21:15, Tom Lane wrote:
> Simon Riggs writes:
>> On 5 December 2012 18:48, Tom Lane wrote:
>>> On further thought, it seems like recovery_pause_at_target is rather
>>> misdesigned anyway, and taking recovery target parameters from
>>> recovery.conf is an obsolete API that was d
Simon Riggs writes:
> On 5 December 2012 18:48, Tom Lane wrote:
>> On further thought, it seems like recovery_pause_at_target is rather
>> misdesigned anyway, and taking recovery target parameters from
>> recovery.conf is an obsolete API that was designed in a world before hot
>> standby. What I
On 5 December 2012 18:48, Tom Lane wrote:
> I wrote:
>> Andres Freund writes:
>>> On 2012-12-05 17:24:42 +, Simon Riggs wrote:
>>>> So ISTM that we should make recoveryStopsHere() return false while we
>>>> are inconsistent. Problems solved.
>
>>> I prefer the previous (fixed) behaviour where
Jeff Janes writes:
> Right now if I'm doing a PITR and want to look around before blessing
> the restore, I have to:
> [ do painful stuff ]
Yeah. The worst thing about this is the cost of stepping too far
forward, but I doubt we can do much about that --- WAL isn't reversible
and I can't see us
On 2012-12-05 13:48:53 -0500, Tom Lane wrote:
> I wrote:
> > Andres Freund writes:
> >> On 2012-12-05 17:24:42 +, Simon Riggs wrote:
> >>> So ISTM that we should make recoveryStopsHere() return false while we
> >>> are inconsistent. Problems solved.
>
> >> I prefer the previous (fixed) behavio
On Wed, Dec 5, 2012 at 8:40 AM, Tom Lane wrote:
> The real question here probably needs to be "what is the point of
> recoveryPauseAtTarget in the first place?". I find it hard to envision
> what's the point of pausing unless the user has an opportunity to
> make a decision about whether to cont
I wrote:
> Andres Freund writes:
>> On 2012-12-05 17:24:42 +, Simon Riggs wrote:
>>> So ISTM that we should make recoveryStopsHere() return false while we
>>> are inconsistent. Problems solved.
>> I prefer the previous (fixed) behaviour where we error out if we reach a
>> recovery target befo
Andres Freund writes:
> On 2012-12-05 17:24:42 +, Simon Riggs wrote:
>> So ISTM that we should make recoveryStopsHere() return false while we
>> are inconsistent. Problems solved.
> I prefer the previous (fixed) behaviour where we error out if we reach a
> recovery target before we are consis
On 2012-12-05 17:24:42 +, Simon Riggs wrote:
> On 5 December 2012 17:17, Simon Riggs wrote:
>
> > The recovery target and the consistency point are in some ways in
> > conflict. If the recovery target is before the consistency point there
> > is no point in stopping there, whether or not we pa
On 5 December 2012 17:17, Simon Riggs wrote:
> The recovery target and the consistency point are in some ways in
> conflict. If the recovery target is before the consistency point there
> is no point in stopping there, whether or not we pause. What we should
> do is say "recovery target reached,
On 5 December 2012 16:40, Tom Lane wrote:
> The real question here probably needs to be "what is the point of
> recoveryPauseAtTarget in the first place?". I find it hard to envision
> what's the point of pausing unless the user has an opportunity to
> make a decision about whether to continue a
On 2012-12-05 18:08:01 +0100, Andres Freund wrote:
> On 2012-12-05 11:40:16 -0500, Tom Lane wrote:
> > Andres Freund writes:
> > > Basically the whole logic around recoveryApply seems to be broken
> > > currently. Because if recoveryApply=false we currently don't pause at
> > > all because we j
On 2012-12-05 11:40:16 -0500, Tom Lane wrote:
> Andres Freund writes:
> > Basically the whole logic around recoveryApply seems to be broken
> > currently. Because if recoveryApply=false we currently don't pause at
> > all because we jump out of the apply loop with the break.
>
> Huh? That brea
Andres Freund writes:
> Basically the whole logic around recoveryApply seems to be broken
> currently. Because if recoveryApply=false we currently don't pause at
> all because we jump out of the apply loop with the break.
Huh? That break is after the pause:
/*
On 2012-12-05 11:11:23 -0500, Tom Lane wrote:
> Andres Freund writes:
> > On 2012-12-05 13:34:05 +, Simon Riggs wrote:
> >> @@ -5883,6 +5889,17 @@ StartupXLOG(void)
> >> } while (record != NULL && recoveryContinue);
> >>
> >> /*
> >> + * We've reached stop point, but not yet
Andres Freund writes:
> On 2012-12-05 13:34:05 +, Simon Riggs wrote:
>> @@ -5883,6 +5889,17 @@ StartupXLOG(void)
>> } while (record != NULL && recoveryContinue);
>>
>> /*
>> + * We've reached stop point, but not yet applied last
>> + * record. Pause AFT
On 5 December 2012 14:33, Andres Freund wrote:
> Independent of this patch, I am slightly confused about the whole stop
> logic. Isn't the idea that you can stop/start/stop/start/... recovery?
> Because if !recoveryApply we break out of the whole recovery loop and
> are done with things.
You can
On 2012-12-05 14:33:36 +, Simon Riggs wrote:
> On 5 December 2012 13:34, Simon Riggs wrote:
>
> > Aboriginal bug extends back to 9.0.
>
> I don't see any bug in 9.0 and 9.1, just 9.2+
Well the pausing logic is clearly broken in 9.1 as well, isn't it?
I.e. you will get:
LOG: recovery has paus
On 5 December 2012 13:34, Simon Riggs wrote:
> Aboriginal bug extends back to 9.0.
I don't see any bug in 9.0 and 9.1, just 9.2+
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2012-12-05 13:34:05 +, Simon Riggs wrote:
> On 5 December 2012 02:27, Tom Lane wrote:
> > Andres Freund writes:
> >>> But the key is, the database was not actually consistent at that
> >>> point, and so opening hot standby was a dangerous thing to do.
> >>>
> >>> The bug that allowed the d
Andres Freund writes:
> On 2012-12-05 19:06:55 +0900, Tatsuo Ishii wrote:
>> So what status are we on? Are we going to release 9.2.2 as it is?
>> Or withdraw current 9.2.2?
> Releasing as-is sounds good. As Tom wrote upthread:
> On 2012-12-04 21:27:34 -0500, Tom Lane wrote:
>> This is not a regr
On 5 December 2012 02:27, Tom Lane wrote:
> Andres Freund writes:
>>> But the key is, the database was not actually consistent at that
>>> point, and so opening hot standby was a dangerous thing to do.
>>>
>>> The bug that allowed the database to open early (the original topic of
>>> this email c
On 2012-12-05 19:06:55 +0900, Tatsuo Ishii wrote:
> So what status are we on? Are we going to release 9.2.2 as it is?
> Or withdraw current 9.2.2?
Releasing as-is sounds good. As Tom wrote upthread:
On 2012-12-04 21:27:34 -0500, Tom Lane wrote:
> This is not a regression because the pause logic i
So what status are we on? Are we going to release 9.2.2 as it is?
Or withdraw current 9.2.2?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
> Andres Freund writes:
>> On 2012-12-04 21:27:34 -0500, Tom Lane wrote:
>>> So the upsh
On 5 December 2012 00:35, Tom Lane wrote:
> I wrote:
>> So apparently this is something we broke since Nov 18. Don't know what
>> yet --- any thoughts?
>
> Further experimentation shows that reverting commit
> ffc3172e4e3caee0327a7e4126b5e7a3c8a1c8cf makes it work. So there's
> something wrong/i
Andres Freund writes:
> On 2012-12-04 21:27:34 -0500, Tom Lane wrote:
>> So the upshot is that I propose a patch more like the attached.
> Without having run anything so far it looks good to me.
BTW, while on the theme of the pause feature being several bricks shy of
a load, it looks to me like
On 2012-12-04 21:27:34 -0500, Tom Lane wrote:
> Andres Freund writes:
> >> But the key is, the database was not actually consistent at that
> >> point, and so opening hot standby was a dangerous thing to do.
> >>
> >> The bug that allowed the database to open early (the original topic of
> >> this
Andres Freund writes:
>> But the key is, the database was not actually consistent at that
>> point, and so opening hot standby was a dangerous thing to do.
>>
>> The bug that allowed the database to open early (the original topic of
>> this email chain) was masking this secondary issue.
> Could
On 2012-12-04 18:05:15 -0800, Jeff Janes wrote:
> On Tue, Dec 4, 2012 at 4:20 PM, Tom Lane wrote:
> > Jeff Janes writes:
> >> I've reproduced it again using the just-tagged 9.2.2, and uploaded a
> >> 135MB tarball of the /tmp/data_slave2 and /tmp/archivedir to google
> >> drive. The data directo
On Tue, Dec 4, 2012 at 4:35 PM, Tom Lane wrote:
> I wrote:
>> So apparently this is something we broke since Nov 18. Don't know what
>> yet --- any thoughts?
>
> Further experimentation shows that reverting commit
> ffc3172e4e3caee0327a7e4126b5e7a3c8a1c8cf makes it work. So there's
> something w
On 2012-12-04 19:20:44 -0500, Tom Lane wrote:
> Jeff Janes writes:
> > I've reproduced it again using the just-tagged 9.2.2, and uploaded a
> > 135MB tarball of the /tmp/data_slave2 and /tmp/archivedir to google
> > drive. The data directory contains the recovery.conf which is set to
> > end reco
On Tue, Dec 4, 2012 at 4:20 PM, Tom Lane wrote:
> Jeff Janes writes:
>> I've reproduced it again using the just-tagged 9.2.2, and uploaded a
>> 135MB tarball of the /tmp/data_slave2 and /tmp/archivedir to google
>> drive. The data directory contains the recovery.conf which is set to
>> end recov
On 2012-12-04 19:35:48 -0500, Tom Lane wrote:
> I wrote:
> > So apparently this is something we broke since Nov 18. Don't know what
> > yet --- any thoughts?
>
> Further experimentation shows that reverting commit
> ffc3172e4e3caee0327a7e4126b5e7a3c8a1c8cf makes it work. So there's
> something wr
I wrote:
> So apparently this is something we broke since Nov 18. Don't know what
> yet --- any thoughts?
Further experimentation shows that reverting commit
ffc3172e4e3caee0327a7e4126b5e7a3c8a1c8cf makes it work. So there's
something wrong/incomplete about that fix.
This is a bit urgent since
Jeff Janes writes:
> I've reproduced it again using the just-tagged 9.2.2, and uploaded a
> 135MB tarball of the /tmp/data_slave2 and /tmp/archivedir to google
> drive. The data directory contains the recovery.conf which is set to
> end recovery between the two critical time points.
Hmmm ... I c
On Sun, Dec 2, 2012 at 1:02 PM, Tom Lane wrote:
> Jeff Janes writes:
>> On Sat, Dec 1, 2012 at 1:56 PM, Tom Lane wrote:
>>> I'm confused. Are you now saying that this problem only exists in
>>> 9.1.x? I tested current HEAD because you indicated the problem was
>>> still there.
>
>> No, I'm say
Jeff Janes writes:
> On Sat, Dec 1, 2012 at 1:56 PM, Tom Lane wrote:
>> I'm confused. Are you now saying that this problem only exists in
>> 9.1.x? I tested current HEAD because you indicated the problem was
>> still there.
> No, I'm saying the problem exists both in 9.1.x and in hypothetical
On Sat, Dec 1, 2012 at 1:56 PM, Tom Lane wrote:
> Jeff Janes writes:
>> On Sat, Dec 1, 2012 at 12:47 PM, Tom Lane wrote:
>>> Jeff Janes writes:
>>>> In the newly fixed 9_2_STABLE, that problem still shows up the same as
>>>> it does in 9.1.6.
>
>>> I tried to reproduce this as per your directio
Jeff Janes writes:
> On Sat, Dec 1, 2012 at 12:47 PM, Tom Lane wrote:
>> Jeff Janes writes:
>>> In the newly fixed 9_2_STABLE, that problem still shows up the same as
>>> it does in 9.1.6.
>> I tried to reproduce this as per your directions, and see no problem in
>> HEAD. Recovery advances to
On Sat, Dec 1, 2012 at 12:47 PM, Tom Lane wrote:
> Jeff Janes writes:
>> On Wed, Nov 28, 2012 at 7:51 AM, Tom Lane wrote:
>>> Is this related at all to the problem discussed over at
>>> http://archives.postgresql.org/pgsql-general/2012-11/msg00709.php
>>> ? The conclusion-so-far in that thread
Jeff Janes writes:
> On Wed, Nov 28, 2012 at 7:51 AM, Tom Lane wrote:
>> Is this related at all to the problem discussed over at
>> http://archives.postgresql.org/pgsql-general/2012-11/msg00709.php
>> ? The conclusion-so-far in that thread seems to be that an error
>> ought to be thrown for reco
On Wed, Nov 28, 2012 at 7:51 AM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 28.11.2012 06:27, Noah Misch wrote:
>>> I observed a similar problem with 9.2. Despite a restore_command that
>>> failed
>>> every time, startup from a hot backup completed. At the time, I suspected a
>>> mista
On Wed, Nov 28, 2012 at 5:37 AM, Heikki Linnakangas
wrote:
> On 28.11.2012 15:26, Andres Freund wrote:
>>
>
>
>> Can you reproduce the issue? If so, can you give an exact guide? If not,
>> do you still have the datadir et al. from above?
Yes, it is reliable enough to be used for "git bisect"
rm
Heikki Linnakangas writes:
> On 28.11.2012 06:27, Noah Misch wrote:
>> I observed a similar problem with 9.2. Despite a restore_command that failed
>> every time, startup from a hot backup completed. At the time, I suspected a
>> mistake on my part.
> I believe this was caused by this little ty
On 2012-11-28 16:34:55 +0200, Heikki Linnakangas wrote:
> On 28.11.2012 15:47, Andres Freund wrote:
> >I mean the label read by read_backup_label(). Jeff's mail indicated it
> >had CHECKPOINT_LOCATION at 1/188D8120 but redo started at 1/CD89E48.
>
> That's correct. The checkpoint was at 1/188D8120
On 28.11.2012 15:47, Andres Freund wrote:
>I mean the label read by read_backup_label(). Jeff's mail indicated it
>had CHECKPOINT_LOCATION at 1/188D8120 but redo started at 1/CD89E48.
That's correct. The checkpoint was at 1/188D8120, but its redo pointer
was earlier, at 1/CD89E48, so that's whe
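
The ordering of the two positions quoted above is easy to misread because the low halves have different digit counts (CD89E48 has seven hex digits, 188D8120 has eight). A minimal sketch of the comparison, assuming each LSN is written as two hexadecimal halves separated by a slash; `parse_lsn` and `lsn_precedes` are hypothetical helper names for illustration, not PostgreSQL functions:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an LSN written as "hi/lo" (two hexadecimal halves) into a single
 * 64-bit position. Returns 0 on success, -1 if the text is malformed.
 * Hypothetical helper for illustration; not part of PostgreSQL.
 */
static int
parse_lsn(const char *text, uint64_t *out)
{
    unsigned int hi;
    unsigned int lo;

    if (sscanf(text, "%X/%X", &hi, &lo) != 2)
        return -1;
    *out = ((uint64_t) hi << 32) | lo;
    return 0;
}

/*
 * Return 1 if LSN a lies strictly before LSN b in the WAL stream,
 * 0 otherwise (including when either string fails to parse).
 */
static int
lsn_precedes(const char *a, const char *b)
{
    uint64_t pa;
    uint64_t pb;

    if (parse_lsn(a, &pa) != 0 || parse_lsn(b, &pb) != 0)
        return 0;
    return pa < pb;
}
```

Under these assumptions, lsn_precedes("1/CD89E48", "1/188D8120") returns 1: the redo pointer sits before the checkpoint record that refers to it, which matches the explanation above.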
On 2012-11-28 15:37:38 +0200, Heikki Linnakangas wrote:
> On 28.11.2012 15:26, Andres Freund wrote:
> >Hm. Are you sure it's actually reading your backup file? It's hard to say
> >without DEBUG1 output but I would tentatively say it's not reading it at
> >all because the "redo starts at ..." messa
On 28.11.2012 15:26, Andres Freund wrote:
>Hm. Are you sure it's actually reading your backup file? It's hard to say
>without DEBUG1 output but I would tentatively say it's not reading it at
>all because the "redo starts at ..." message indicates it's not using
>the checkpoint location from the backu
On 28.11.2012 06:27, Noah Misch wrote:
>On Tue, Nov 27, 2012 at 10:08:12AM -0800, Jeff Janes wrote:
>>Doing PITR in 9.2.1, the system claims that it reached a consistent
>>recovery state immediately after redo starts.
>>This leads to various mysterious failures, when it should instead
>>throw a "reque
On 2012-11-27 10:08:12 -0800, Jeff Janes wrote:
> Doing PITR in 9.2.1, the system claims that it reached a consistent
> recovery state immediately after redo starts.
> This leads to various mysterious failures, when it should instead
> throw a "requested recovery stop point is before consistent
On Tue, Nov 27, 2012 at 10:08:12AM -0800, Jeff Janes wrote:
> Doing PITR in 9.2.1, the system claims that it reached a consistent
> recovery state immediately after redo starts.
> This leads to various mysterious failures, when it should instead
> throw a "requested recovery stop point is before
Doing PITR in 9.2.1, the system claims that it reached a consistent
recovery state immediately after redo starts.
This leads to various mysterious failures, when it should instead
throw a "requested recovery stop point is before consistent recovery
point" error.
(If you are unlucky, I think it m
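
The behaviour this report asks for (refuse to stop at a target that lies before the consistent point, rather than open the database early) can be sketched as a small decision function. This is only an illustration of the rule as described in the thread; the enum and the function name are hypothetical and do not reflect the actual logic in StartupXLOG() or recoveryStopsHere():

```c
#include <stdbool.h>

/* Possible outcomes after examining one WAL record (hypothetical names). */
typedef enum
{
    RECOVERY_CONTINUE,                      /* keep replaying WAL */
    RECOVERY_STOP_AT_TARGET,                /* target reached, state is consistent */
    RECOVERY_ERROR_TARGET_BEFORE_CONSISTENT /* target reached too early */
} RecoveryAction;

/*
 * Decide what to do given whether the recovery target has been reached and
 * whether replay has already passed the consistent recovery point.
 */
static RecoveryAction
decide_recovery_action(bool target_reached, bool reached_consistency)
{
    if (!target_reached)
        return RECOVERY_CONTINUE;

    /*
     * Stopping before consistency would expose a possibly corrupt database,
     * so report "requested recovery stop point is before consistent
     * recovery point" instead of stopping.
     */
    if (!reached_consistency)
        return RECOVERY_ERROR_TARGET_BEFORE_CONSISTENT;

    return RECOVERY_STOP_AT_TARGET;
}
```

The alternative raised elsewhere in the thread, having recoveryStopsHere() return false while inconsistent, would correspond to returning RECOVERY_CONTINUE in the middle branch instead of the error.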