On Thu, Dec 6, 2012 at 8:39 PM, Amit Kapila <amit.kap...@huawei.com> wrote:
> On Thursday, December 06, 2012 9:35 AM Kyotaro HORIGUCHI wrote:
>> Hello, I have a problem with PostgreSQL 9.2 with Pacemaker.
>>
>> HA standby sometimes fails to start under normal operation.
>>
>> Testing with a bare replication pair showed that the standby fails
>> startup recovery under the operation sequence shown below. 9.3dev too,
>> but 9.1 does not have this problem. This problem became apparent by the
>> invalid-page check of xlog, but 9.1 also has the same glitch potentially.
>>
>> After the investigation, the lag of minRecoveryPoint behind EndRecPtr in
>> redo loop seems to be the cause. The lag brings about repetitive redoing
>> of unrepeatable xlog sequences such as XLOG_HEAP2_VISIBLE ->
>> SMGR_TRUNCATE on the same page. So I did the same aid work as
>> xact_redo_commit_internal for smgr_redo. While doing this, I noticed
>> that CheckRecoveryConsistency() in redo apply loop should be after
>> redoing the record, so moved it.
>
> I think moving CheckRecoveryConsistency() after redo apply loop might
> cause a problem.
> Currently it is done before the recoveryStopsHere() function, which can
> allow connections on HOTSTANDBY. But now, if for some reason recovery
> pauses or stops due to the above function, connections might not be
> allowed, as CheckRecoveryConsistency() is not called.

Yes, so we should just add the CheckRecoveryConsistency() call after
rm_redo rather than moving it?

This issue is related to the old discussion:
http://archives.postgresql.org/pgsql-bugs/2012-09/msg00101.php

Regards,

--
Fujii Masao
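
As a rough illustration of the ordering question above, here is a toy model of a redo loop. It is not PostgreSQL source: the type, the function names (toy_check_consistency, toy_redo_loop) and the record-counting notion of "consistency" are all made up for this sketch, and the real StartupXLOG() code has many details that are not modelled here. It only shows why a check placed solely after the apply step can be skipped when the loop stops before applying a record, while keeping the earlier call and adding a second one after the apply step covers both cases.

```c
#include <stdbool.h>

/* Toy model only. Names echo the thread, but none of this is PostgreSQL code. */
typedef enum {
    CHECK_AFTER_ONLY,        /* the check was moved to after the apply step */
    CHECK_BEFORE_AND_AFTER   /* the original check kept, plus one after apply */
} CheckPlacement;

typedef struct {
    int  replayed_upto;       /* number of records applied so far */
    int  min_recovery_point;  /* records that must be applied before consistency */
    bool connections_allowed; /* stand-in for "hot standby connections enabled" */
} ToyRecovery;

/* Hypothetical stand-in for the consistency check discussed in the thread. */
void toy_check_consistency(ToyRecovery *r)
{
    if (!r->connections_allowed && r->replayed_upto >= r->min_recovery_point)
        r->connections_allowed = true;
}

/*
 * Replays up to nrecords records. The loop stops *before* applying the
 * record at index stop_at (pass -1 to never stop early), which plays the
 * role of a recovery stop or pause point.
 */
ToyRecovery toy_redo_loop(int nrecords, int min_recovery_point,
                          int stop_at, CheckPlacement placement)
{
    ToyRecovery r = { 0, min_recovery_point, false };

    for (int i = 0; i < nrecords; i++) {
        if (placement == CHECK_BEFORE_AND_AFTER)
            toy_check_consistency(&r);   /* check before the stop test */

        if (i == stop_at)
            break;                       /* stop point reached, nothing applied */

        r.replayed_upto = i + 1;         /* stand-in for applying the record */
        toy_check_consistency(&r);       /* check right after applying */
    }
    return r;
}
```

In this toy, toy_redo_loop(3, 0, 0, CHECK_AFTER_ONLY) returns with connections_allowed still false, because the loop stops at the first record before any check runs, whereas toy_redo_loop(3, 0, 0, CHECK_BEFORE_AND_AFTER) returns with it set to true. Whether the real code path can hit the same situation depends on details outside this sketch.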