Hello, At Fri, 28 Feb 2014 14:45:58 +0200, Heikki Linnakangas <hlinnakan...@vmware.com> wrote in <53108506.2010...@vmware.com> > > Yes, but the same stuation could be made by restarting crashed > > secondary. > > Yeah. > > > I have no idea about the scenario on whitch this behavior was regarded > > as > > undesirable but anyway I think that the secondry should start > > accepting > > client just after crash recovery is completed. > > Agreed, this is a bug. > > I don't think your patch is the right fix for this though. Setting > minRecoveryPoint to EndRecPtr is the right thing to do; EndRecPtr > points to the end of the last read and replayed record. What's wrong > in this case is lastReplayedEndRecptr. At the beginning of recovery, > it's initialized to the REDO point, but with a shutdown checkpoint, > that's not quite right. When starting from a shutdown checkpoint, REDO > points to the beginning of the shutdown record, but we've already > effectively replayed it. The next record we replay is the one after > the checkpoint.
It's more reasonable. I felt uncelar about that but I forgot to doubt the correctness of lastReplayedEndRecptr then, but surely the shutdown record itself was effectively alredy replayed when the recored is inseretd. > To see that, I added some elog(LOG) calls: > > ~/pgsql.93stable$ bin/postmaster -D data > LOG: database system was shut down at 2014-02-28 14:06:18 EET > LOG: ReadCheckpointRecord: 0/16479C98 > LOG: database system is ready to accept connections > LOG: autovacuum launcher started > ^CLOG: received fast shutdown request > LOG: aborting any active transactions > LOG: autovacuum launcher shutting down > LOG: shutting down > LOG: INSERT @ 0/16479D00: prev 0/16479C98; xid 0; len 72: XLOG - > checkpoint: redo 0/16479D00; tli 1; prev tli 1; fpw true; xid > 0/793393; oid 24988; multi 655288; offset 1356722; oldest xid 687 in > DB 1; oldest multi 1 in DB 1; oldest running xid 0; shutdown > LOG: xlog flush request 0/16479D68; write 0/0; flush 0/0 > LOG: database system is shut down > ~/pgsql.93stable$ bin/postmaster -D data > LOG: database system was shut down at 2014-02-28 14:06:23 EET > LOG: ReadCheckpointRecord: 0/16479D00 > LOG: database system is ready to accept connections > LOG: autovacuum launcher started > Killed > > At this point, the last record is the shutdown checkpoint, beginning > at 16479D00, and the server has been killed (immediate shutdown). > > ~/pgsql.93stable$ cp recovery.conf data/recovery.conf > ~/pgsql.93stable$ bin/postmaster -D data > LOG: database system was interrupted; last known up at 2014-02-28 > 14:06:29 EET > LOG: entering standby mode > LOG: ReadCheckpointRecord: 0/16479D00 > LOG: database system was not properly shut down; automatic recovery in > progress > LOG: record with zero length at 0/16479D68 > LOG: reached end of WAL in pg_xlog, entering archive recovery > LOG: EndRecPtr: 0/16479D68 lastReplayedEndRecPtr: 0/16479D00 > FATAL: could not connect to the primary server: could not connect to > server: Connection refused > ... > > Recovery starts from the checkpoint record, but lastReplayedEndRecPtr > is set to the *beginning* of the checkpoint record, even though the > checkpoint record has already been effectively replayed, by the feat > of starting recovery from it. EndRecPtr correctly points to the end of > the checkpoint record. Because of the incorrect lastReplayedEndRecPtr > value, the CheckRecoveryConsistency() call concludes that it's not > consistent. I completely understood the behavior thanks to your detailed explanation. (And how to use log messages effectively :-) I agree that the fix is appropriate. > I believe the attached fix is the right way to fix this. It also worked for me. Thank you. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers