The timeline for the events (all dates MM/DD/YYYY):

06/09/2009 1310 EDT - Hardware fault on primary database server db01pri
06/09/2009 1325 EDT - Failover to warm standby db01sec
06/12/2009 1615 EDT - db01pri server fixed and OS booted
06/15/2009 1115 EDT - Started recovery of hotbackup from 06/15/2009 0205 EDT from db01sec onto db01pri
06/15/2009 1320 EDT - Attempted to start postgres on db01pri in warm standby mode
06/15/2009 1325 EDT - Failure to apply WAL log; errors with "unexpected timeline ID"
06/15/2009 1340 EDT - Started a new hotbackup on db01sec
06/15/2009 1545 EDT - Started recovery of hotbackup from 06/15/2009 1340 to db01pri
06/15/2009 1430 EDT - db01pri recovered and running in warm standby

Here are the contents of the pg_xlog directory and the 00000004.history file:

[postg...@db01pri ~]$ cat 00000004.history
1 0000000100000736000000A1 before transaction 0 at 1999-12-31 19:00:00-05
[postg...@db01pri ~]$ ls -l
total 98468
-rw------- 1 postgres postgres       74 Jul 10  2008 00000002.history
-rw------- 1 postgres postgres       74 Jun  9 13:29 00000003.history
-rw------- 1 postgres postgres 16777216 Jun 16 08:45 0000000400000749000000C9
-rw------- 1 postgres postgres 16777216 Jun 16 08:46 0000000400000749000000CA
-rw------- 1 postgres postgres 16777216 Jun 16 08:47 0000000400000749000000CB
-rw------- 1 postgres postgres       74 Jun  9 13:33 00000004.history
drwxr-xr-x 2 postgres postgres    32768 Jun 16 08:46 archive_status
-rw------- 1 postgres postgres 16777216 Jun  9 13:45 xlogtemp.17243
-rw------- 1 postgres postgres 16777216 Jun  9 13:45 xlogtemp.17244
-rw------- 1 postgres postgres 16777216 Jun  9 13:52 xlogtemp.17397
[postg...@db01pri ~]$

Thanks again,
Keith

Tom Lane wrote:
> Keith Pierno <kpie...@lulu.com> writes:
>> The backup used was from well after the failover time, which is why I
>> was concerned. Interestingly enough, the logs are still all prefixed
>> with 00000004... That just makes this problem extremely bizarre.
>
> Hmm, that *is* weird. It seems like the new primary must have reverted
> its decision to go from timeline 4 to timeline 6. (Which in itself is
> a bit odd; why not timeline 5?)
>
> Can you give us an exact sequence of events on the slave server/new
> primary around the time of the failover? Also, what was in the
> .history file when you found it, and are there any other history files?
>
> regards, tom lane
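For anyone reading the listing above: a WAL segment file name is 24 hex digits, and the first 8 are the timeline ID, followed by 8 digits of log file number and 8 of segment number. Below is a minimal sketch of pulling those fields out of a directory listing to confirm which timelines are present. The helper names `parse_wal_name` and `timelines_in` are made up for this illustration and are not part of PostgreSQL.

```python
import re

# A WAL segment name is exactly 24 hex digits:
# 8 for the timeline ID, 8 for the log file number, 8 for the segment number.
WAL_NAME = re.compile(r"^([0-9A-Fa-f]{8})([0-9A-Fa-f]{8})([0-9A-Fa-f]{8})$")


def parse_wal_name(name):
    """Return (timeline, log, segment) as ints, or None if name is not a WAL segment.

    Hypothetical helper for illustration only.
    """
    match = WAL_NAME.match(name)
    if match is None:
        return None
    timeline, log, segment = (int(group, 16) for group in match.groups())
    return timeline, log, segment


def timelines_in(names):
    """Return the sorted set of timeline IDs seen among WAL segment names."""
    parsed = (parse_wal_name(name) for name in names)
    return sorted({fields[0] for fields in parsed if fields is not None})


if __name__ == "__main__":
    # File names copied from the pg_xlog listing in this message.
    listing = [
        "00000002.history",
        "00000003.history",
        "0000000400000749000000C9",
        "0000000400000749000000CA",
        "0000000400000749000000CB",
        "00000004.history",
        "archive_status",
        "xlogtemp.17243",
    ]
    print(timelines_in(listing))
```

Run against the listing from db01pri, this reports only timeline 4, which matches the observation that every segment is still prefixed with 00000004.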