Everything seems to have been settled above my head while I was sleeping.
Sorry for the clumsy test script, and thank you for refining it, Mitsumasa.
And thank you for fixing the bug and for the detailed explanation, Heikki.
I confirmed that the problem is also fixed for me at origin/REL9_2_STABLE.
On 07.03.2013 10:05, KONDO Mitsumasa wrote:
(2013/03/06 16:50), Heikki Linnakangas wrote:
Yeah. That fix isn't right, though; XLogPageRead() is supposed to
return true on success, and false on error, and the patch makes it
return 'true' on error, if archive recovery was requested but we're
still
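To illustrate the contract Heikki describes, here is a minimal sketch (a toy Python model with invented names; the real XLogPageRead() is C code in xlog.c): the reader must report any failure to produce a page as false, and the flawed patch's early-return path claimed success even though no page was read.

```python
# Toy model of the XLogPageRead() success/failure contract described above.
# All names and signatures here are illustrative, not PostgreSQL's actual code.

def xlog_page_read_buggy(page_available, archive_recovery_requested, in_archive_recovery):
    """Broken variant: returns True (success) on an error path."""
    if not page_available:
        if archive_recovery_requested and not in_archive_recovery:
            # The flawed fix: claims success even though no page was read,
            # so the caller would go on to parse an unread page.
            return True
        return False
    return True

def xlog_page_read_fixed(page_available, archive_recovery_requested, in_archive_recovery):
    """Correct variant: any failure to read a page is reported as False;
    switching over to archive recovery must happen elsewhere."""
    if not page_available:
        return False
    return True
```

The point of the contract is that the caller uses the return value alone to decide whether the page buffer holds valid data.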
(2013/03/06 16:50), Heikki Linnakangas wrote:
Hi,
Horiguchi's patch does not seem to record minRecoveryPoint in ReadRecord();
The attempted patch records minRecoveryPoint.
[crash recovery -> record minRecoveryPoint in control file -> archive
recovery]
I think that this is the original intention of Heikki's patch.
On 05.03.2013 14:09, KONDO Mitsumasa wrote:
Hi,
Horiguchi's patch does not seem to record minRecoveryPoint in ReadRecord();
The attempted patch records minRecoveryPoint.
[crash recovery -> record minRecoveryPoint in control file -> archive
recovery]
I think that this is the original intention of Heikki's patch.
Hi, I suppose the attached patch is close to the solution.
> I think that this is the original intention of Heikki's patch.
I noticed that archive recovery will be turned on in
next_record_is_invalid thanks to your patch.
> On the other hand, your patch fixes that point but ReadRecord
> runs on t
Hmm..
> Horiguchi's patch does not seem to record minRecoveryPoint in
> ReadRecord();
> The attempted patch records minRecoveryPoint.
> [crash recovery -> record minRecoveryPoint in control file -> archive
> recovery]
> I think that this is the original intention of Heikki's patch.
It could be. Before th
Hi,
Horiguchi's patch does not seem to record minRecoveryPoint in ReadRecord();
The attempted patch records minRecoveryPoint.
[crash recovery -> record minRecoveryPoint in control file -> archive recovery]
I think that this is the original intention of Heikki's patch.
I also found a bug in latest 9.2_st
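The bracketed sequence can be sketched as follows (a toy model with invented names; the real logic lives in StartupXLOG in xlog.c):

```python
# Toy model of the sequence [crash recovery -> record minRecoveryPoint
# in control file -> archive recovery]. Names are illustrative only;
# LSNs are modeled as plain integers.

control_file = {"minRecoveryPoint": 0}

def run_recovery(end_of_crash_wal, archive_wal_end):
    # 1. Crash recovery: replay local WAL to its end.
    replayed_to = end_of_crash_wal
    # 2. Record how far replay must go before the cluster may be
    #    considered consistent again.
    control_file["minRecoveryPoint"] = replayed_to
    # 3. Archive recovery continues from there; consistency is reached
    #    only once replay passes minRecoveryPoint.
    consistent = archive_wal_end >= control_file["minRecoveryPoint"]
    return replayed_to, consistent
```

Without step 2, archive recovery would start with no recorded consistency point and could declare the system consistent too early.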
Sorry, I sent wrong script.
> The head of origin/REL9_2_STABLE shows the behavior I mentioned in
> the last message when using the shell script attached. 9.3dev
> runs as expected.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
#! /bin/sh
pgpath="$HOME/bin/pgsql_924b"
echo $PATH |
Hello, I could cause the behavior and might understand the cause.
The head of origin/REL9_2_STABLE shows the behavior I mentioned in
the last message when using the shell script attached. 9.3dev
runs as expected.
In XLogPageRead, when RecPtr goes beyond the last page, the
current xlog file is released
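For context, the "page" and "file" here come from WAL's fixed layout; a byte position (RecPtr) maps to a segment file and a page within it. A minimal sketch of the arithmetic, with common default sizes and invented helper names (not PostgreSQL's actual macros):

```python
# Sketch of WAL addressing. Sizes below match common defaults
# (8 kB pages, 16 MB segments); they are assumptions, not guarantees.
XLOG_BLCKSZ = 8192                 # one WAL page
XLOG_SEG_SIZE = 16 * 1024 * 1024   # one WAL segment file

def wal_location(rec_ptr):
    """Map a byte position to (segment number, page in segment, offset in page)."""
    segno = rec_ptr // XLOG_SEG_SIZE
    page_in_seg = (rec_ptr % XLOG_SEG_SIZE) // XLOG_BLCKSZ
    offset_in_page = rec_ptr % XLOG_BLCKSZ
    return segno, page_in_seg, offset_in_page

def crosses_segment(rec_ptr, last_read_ptr):
    """True when RecPtr has moved past the segment currently held open,
    i.e. the current xlog file must be released and the next one opened."""
    return rec_ptr // XLOG_SEG_SIZE != last_read_ptr // XLOG_SEG_SIZE
```

This is why "going beyond the last page" can force the reader to close the file it holds open.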
This is an interim report for this patch.
We found that PostgreSQL with this patch unexpectedly becomes
the primary when starting up as a standby. We'll investigate
this behavior further.
> > Anyway, I've committed this to master and 9.2 now.
>
> This seems to fix the issue. We'll examine this
Folks,
Is there any way this particular issue could cause data corruption
without causing a crash? I don't see a way for it to do so, but I
wanted to verify.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
Although this has become unnecessary, I want to explain how this
works.
> > I tried to postpone smgrtruncate TO the next checktpoint.
>
> Umm, why? I don't understand this patch at all.
This inhibits truncating files after (quite vague in the patch :-)
the previous checkpoint by hindering the dele
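The idea of postponing truncation until the next checkpoint could be sketched as a deferred queue (a toy model with invented names; not how the patch is actually written):

```python
# Toy model of deferring smgrtruncate: instead of shrinking a relation
# file immediately, remember the request and apply it at checkpoint time.
# Illustrative only; real truncation operates on files, not a dict.

pending_truncates = {}              # relation -> target size in blocks
relation_sizes = {"rel_a": 100}     # current on-disk sizes (toy data)

def smgrtruncate_deferred(rel, nblocks):
    # Nothing touches the "file" yet; the request is merely queued.
    pending_truncates[rel] = nblocks

def checkpoint():
    # At checkpoint, apply all queued truncations at once.
    for rel, nblocks in pending_truncates.items():
        relation_sizes[rel] = nblocks
    pending_truncates.clear()
```

The trade-off mentioned later in the thread is that deferring the shrink can bloat the table in the meantime.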
At Fri, 22 Feb 2013 11:42:39 +0200, Heikki Linnakangas
wrote in <51273d8f.7060...@vmware.com>
> On 15.02.2013 10:33, Kyotaro HORIGUCHI wrote:
> > In an HA DB cluster consisting of Pacemaker and PostgreSQL, PostgreSQL
> > is stopped by 'pg_ctl stop -m i' regardless of the situation.
>
> That seems like a b
Hello,
> Anyway, I've committed this to master and 9.2 now.
This seems to fix the issue. We'll examine this further.
Thank you.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 15.02.2013 10:33, Kyotaro HORIGUCHI wrote:
Sorry, I forgot to show how we found this issue.
In an HA DB cluster consisting of Pacemaker and PostgreSQL, PostgreSQL
is stopped by 'pg_ctl stop -m i' regardless of the situation.
That seems like a bad idea. If nothing else, crash recovery can take a
lon
On 14.02.2013 19:18, Fujii Masao wrote:
Yes. And the resource agent for streaming replication in Pacemaker (it's the
OSS clusterware) is the user of that archive recovery scenario, too. When it
starts up the server, it always creates the recovery.conf and starts the server
as the standby. It cann
On 22.02.2013 02:13, Michael Paquier wrote:
On Thu, Feb 21, 2013 at 11:09 PM, Heikki Linnakangas<
hlinnakan...@vmware.com> wrote:
On 15.02.2013 15:49, Heikki Linnakangas wrote:
Attached is a patch for git master. The basic idea is to split
InArchiveRecovery into two variables, InArchiveRecov
On Thu, Feb 21, 2013 at 11:09 PM, Heikki Linnakangas <
hlinnakan...@vmware.com> wrote:
> On 15.02.2013 15:49, Heikki Linnakangas wrote:
>
>> Attached is a patch for git master. The basic idea is to split
>> InArchiveRecovery into two variables, InArchiveRecovery and
>> ArchiveRecoveryRequested. Ar
On 15.02.2013 15:49, Heikki Linnakangas wrote:
Attached is a patch for git master. The basic idea is to split
InArchiveRecovery into two variables, InArchiveRecovery and
ArchiveRecoveryRequested. ArchiveRecoveryRequested is set when
recovery.conf exists. But if we don't know how far we need to re
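The split Heikki describes can be modeled as two independent flags (a sketch with invented helper names): ArchiveRecoveryRequested merely records that recovery.conf exists, while InArchiveRecovery becomes true only once archive recovery is actually in progress, so crash recovery can first run to the end of local WAL.

```python
# Toy model of splitting one flag into ArchiveRecoveryRequested
# (recovery.conf exists) and InArchiveRecovery (archive recovery is
# actually running now). Illustrative only.

class RecoveryState:
    def __init__(self, recovery_conf_exists, crashed):
        self.archive_recovery_requested = recovery_conf_exists
        self.in_archive_recovery = False
        self.crashed = crashed

    def startup(self):
        if self.crashed:
            # First run plain crash recovery to the end of local WAL,
            # even though archive recovery was requested.
            self.replay_local_wal()
        if self.archive_recovery_requested:
            self.in_archive_recovery = True  # only now switch over
        return self.in_archive_recovery

    def replay_local_wal(self):
        self.crashed = False
```

With a single flag, the mere existence of recovery.conf would put a crashed master straight into archive recovery, which is the failure mode discussed upthread.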
On 20.02.2013 10:01, Kyotaro HORIGUCHI wrote:
Sorry, let me correct a bit.
I tried to postpone smgrtruncate after the next checkpoint. This
I tried to postpone smgrtruncate TO the next checkpoint.
Umm, why? I don't understand this patch at all.
- Heikki
Sorry, let me correct a bit.
> I tried to postpone smgrtruncate after the next checkpoint. This
I tried to postpone smgrtruncate TO the next checkpoint.
> is similar to what hot standby feedback does to vacuum. It seems
> to be working fine but I worry that it might also bloat the
> table. I h
Hello, I looked at this from another point of view.
I consider the current discussion to be based on how to predict
the last consistency point. But there is another aspect of this
issue.
I tried to postpone smgrtruncate after the next checkpoint. This
is similar to what hot standby feedback does to v
On Mon, Feb 18, 2013 at 8:27 PM, Heikki Linnakangas
wrote:
> backupStartPoint is set, which signals recovery to wait for an end-of-backup
> record, until the system is considered consistent. If the backup is taken
> from a hot standby, backupEndPoint is set, instead of inserting an
> end-of-backup
On 16.02.2013 10:40, Ants Aasma wrote:
On Fri, Feb 15, 2013 at 3:49 PM, Heikki Linnakangas
wrote:
While this solution would help solve my issue, it assumes that the
correct amount of WAL files are actually there. Currently the docs for
setting up a standby refer to "24.3.4. Recovering Using a
On Fri, Feb 15, 2013 at 3:49 PM, Heikki Linnakangas
wrote:
>> While this solution would help solve my issue, it assumes that the
>> correct amount of WAL files are actually there. Currently the docs for
>> setting up a standby refer to "24.3.4. Recovering Using a Continuous
>> Archive Backup", and
On 15.02.2013 13:05, Ants Aasma wrote:
On Wed, Feb 13, 2013 at 10:52 PM, Simon Riggs wrote:
The problem is that we start up Hot Standby before we hit the min
recovery point because that isn't recorded. For me, the thing to do is
to make the min recovery point == end of WAL when state is
DB_IN_PR
On Wed, Feb 13, 2013 at 10:52 PM, Simon Riggs wrote:
> The problem is that we start up Hot Standby before we hit the min
> recovery point because that isn't recorded. For me, the thing to do is
> to make the min recovery point == end of WAL when state is
> DB_IN_PRODUCTION. That way we don't need t
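Simon's suggestion can be sketched like this (a toy model with invented names): if the control file says the server was in production when it went down, no consistency point was recorded, so the end of local WAL is treated as the minimum recovery point.

```python
# Sketch of "min recovery point == end of WAL when state is
# DB_IN_PRODUCTION". All names are illustrative; LSNs are integers.
DB_IN_PRODUCTION = "in_production"
DB_SHUTDOWNED = "shutdowned"

def effective_min_recovery_point(control_state, recorded_min, end_of_wal):
    if control_state == DB_IN_PRODUCTION:
        # Crashed master: nothing before end-of-WAL is known consistent,
        # so Hot Standby must not open up earlier than that.
        return end_of_wal
    return recorded_min
```

This is one way to avoid starting Hot Standby before the unrecorded consistency point is reached.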
Sorry, I forgot to show how we found this issue.
In an HA DB cluster consisting of Pacemaker and PostgreSQL, PostgreSQL
is stopped by 'pg_ctl stop -m i' regardless of the situation.
On the other hand, the PostgreSQL RA (Resource Agent) is obliged to
start the master node via hot standby state because of the
res
On Thu, Feb 14, 2013 at 5:52 AM, Simon Riggs wrote:
> On 13 February 2013 09:04, Heikki Linnakangas wrote:
>
>> Without step 3, the server would perform crash recovery, and it would work.
>> But because of the recovery.conf file, the server goes into archive
>> recovery, and because minRecoveryPo
On Thu, Feb 14, 2013 at 5:15 AM, Heikki Linnakangas
wrote:
> On 13.02.2013 17:02, Tom Lane wrote:
>>
>> Heikki Linnakangas writes:
>>>
>>> At least in back-branches, I'd call this a pilot error. You can't turn a
>>> master into a standby just by creating a recovery.conf file. At least
>>> not if
On Feb 13, 2013 10:29 PM, "Heikki Linnakangas"
wrote:
> Hmm, I just realized a little problem with that approach. If you take a
base backup using an atomic filesystem backup from a running server, and
start archive recovery from that, that's essentially the same thing as
Kyotaro's test case.
Coin
On 13 February 2013 09:04, Heikki Linnakangas wrote:
> Without step 3, the server would perform crash recovery, and it would work.
> But because of the recovery.conf file, the server goes into archive
> recovery, and because minRecoveryPoint is not set, it assumes that the
> system is consistent
On 13.02.2013 17:02, Tom Lane wrote:
Heikki Linnakangas writes:
At least in back-branches, I'd call this a pilot error. You can't turn a
master into a standby just by creating a recovery.conf file. At least
not if the master was not shut down cleanly first.
...
I'm not sure that's worth the tro
Heikki Linnakangas writes:
> On 13.02.2013 21:30, Tom Lane wrote:
>> Well, archive recovery is a different scenario --- Simon was questioning
>> whether we need a minRecoveryPoint mechanism in crash recovery, or at
>> least that's what I thought he asked.
> Ah, ok. The short answer to that is "no
Heikki Linnakangas writes:
> The problem we're trying to solve is determining how much WAL needs to
> be replayed until the database is consistent again. In crash recovery,
> the answer is "all of it". That's why the CRC in the WAL is essential;
> it's required to determine where the WAL ends.
On 13.02.2013 21:30, Tom Lane wrote:
Heikki Linnakangas writes:
On 13.02.2013 21:21, Tom Lane wrote:
It would only be broken if someone interrupted a crash recovery
mid-flight and tried to establish a recovery stop point before the end
of WAL, no? Why don't we just forbid that case? This woul
Heikki Linnakangas writes:
> On 13.02.2013 21:21, Tom Lane wrote:
>> It would only be broken if someone interrupted a crash recovery
>> mid-flight and tried to establish a recovery stop point before the end
>> of WAL, no? Why don't we just forbid that case? This would either be
>> the same as, or
On 13.02.2013 21:03, Tom Lane wrote:
Simon Riggs writes:
On 13 February 2013 09:04, Heikki Linnakangas wrote:
To be precise, we'd need to update the control file on every XLogFlush(),
like we do during archive recovery. That would indeed be unacceptable from a
performance point of view. Updat
On 13.02.2013 21:21, Tom Lane wrote:
Heikki Linnakangas writes:
Well, no-one's complained about the performance. From a robustness point
of view, it might be good to keep the minRecoveryPoint value in a
separate file, for example, to avoid rewriting the control file that
often. Then again, why
Heikki Linnakangas writes:
> Well, no-one's complained about the performance. From a robustness point
> of view, it might be good to keep the minRecoveryPoint value in a
> separate file, for example, to avoid rewriting the control file that
> often. Then again, why fix it when it's not broken.
On 13.02.2013 20:25, Simon Riggs wrote:
On 13 February 2013 09:04, Heikki Linnakangas wrote:
To be precise, we'd need to update the control file on every XLogFlush(),
like we do during archive recovery. That would indeed be unacceptable from a
performance point of view. Updating the control fi
Simon Riggs writes:
> On 13 February 2013 09:04, Heikki Linnakangas wrote:
>> To be precise, we'd need to update the control file on every XLogFlush(),
>> like we do during archive recovery. That would indeed be unacceptable from a
>> performance point of view. Updating the control file that ofte
On 13 February 2013 09:04, Heikki Linnakangas wrote:
> To be precise, we'd need to update the control file on every XLogFlush(),
> like we do during archive recovery. That would indeed be unacceptable from a
> performance point of view. Updating the control file that often would also
> be bad for
Heikki Linnakangas writes:
> At least in back-branches, I'd call this a pilot error. You can't turn a
> master into a standby just by creating a recovery.conf file. At least
> not if the master was not shut down cleanly first.
> ...
> I'm not sure that's worth the trouble, though. Perhaps it wou
On 13.02.2013 09:46, Kyotaro HORIGUCHI wrote:
In this case, the FINAL consistency point is at the
XLOG_SMGR_TRUNCATE record, but the current implementation does not record
the consistency point (checkpoint, or commit or smgr_truncate)
itself, so we cannot predict the final consistency point on
starting of
Hello, 9.2.3 crashes during archive recovery.
This was also corrected at some point on origin/master with
another problem fixed by the commit below if my memory is
correct. But current HEAD and 9.2.3 crash during archive
recovery (not on standby) due to the 'marking deleted page visible'
problem.
h