The timeline of events (all dates MM/DD/YYYY):

    06/09/2009 1310 EDT - Hardware fault on primary database server db01pri
    06/09/2009 1325 EDT - Failover to warm standby db01sec
    06/12/2009 1615 EDT - db01pri server fixed and OS booted
    06/15/2009 1115 EDT - Started recovery of the 06/15/2009 0205 EDT hotbackup from db01sec onto db01pri
    06/15/2009 1320 EDT - Attempted to start postgres on db01pri in warm standby mode
    06/15/2009 1325 EDT - Failed to apply WAL logs; errors with "unexpected timeline ID"
    06/15/2009 1340 EDT - Started a new hotbackup on db01sec
    06/15/2009 1545 EDT - Started recovery of the 06/15/2009 1340 EDT hotbackup onto db01pri
    06/15/2009 1430 EDT - db01pri recovered and running in warm standby

Here are the contents of the 00000004.history file and the pg_xlog directory:

[postg...@db01pri ~]$  cat 00000004.history
1    0000000100000736000000A1    before transaction 0 at 1999-12-31 19:00:00-05
[postg...@db01pri ~]$  ls -l
total 98468
-rw-------  1 postgres postgres       74 Jul 10  2008 00000002.history
-rw-------  1 postgres postgres       74 Jun  9 13:29 00000003.history
-rw-------  1 postgres postgres 16777216 Jun 16 08:45 0000000400000749000000C9
-rw-------  1 postgres postgres 16777216 Jun 16 08:46 0000000400000749000000CA
-rw-------  1 postgres postgres 16777216 Jun 16 08:47 0000000400000749000000CB
-rw-------  1 postgres postgres       74 Jun  9 13:33 00000004.history
drwxr-xr-x  2 postgres postgres    32768 Jun 16 08:46 archive_status
-rw-------  1 postgres postgres 16777216 Jun  9 13:45 xlogtemp.17243
-rw-------  1 postgres postgres 16777216 Jun  9 13:45 xlogtemp.17244
-rw-------  1 postgres postgres 16777216 Jun  9 13:52 xlogtemp.17397
[postg...@db01pri ~]$
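
As a quick cross-check of the listing above, the timeline ID can be read from the first eight hex digits of each WAL segment or history file name. This is only a minimal Python sketch, assuming the usual 24-hex-digit segment names and NNNNNNNN.history names; the helper name timeline_of is made up for illustration and is not part of PostgreSQL.

```python
import re

# Assumed naming: 8 hex digits of timeline ID, then 16 more hex digits
# identifying the segment; history files are "<8 hex digits>.history".
WAL_SEGMENT = re.compile(r"^([0-9A-F]{8})[0-9A-F]{16}$")
HISTORY = re.compile(r"^([0-9A-F]{8})\.history$")


def timeline_of(name):
    """Return the timeline ID encoded in a file name, or None if it has none."""
    m = WAL_SEGMENT.match(name) or HISTORY.match(name)
    return int(m.group(1), 16) if m else None


# File names copied from the pg_xlog listing above.
names = [
    "00000002.history",
    "00000003.history",
    "0000000400000749000000C9",
    "0000000400000749000000CA",
    "0000000400000749000000CB",
    "00000004.history",
    "archive_status",
    "xlogtemp.17243",
]

for name in names:
    print(name, timeline_of(name))
```

Run against the names in this directory, every WAL segment reports timeline 4, and the history files report timelines 2, 3 and 4.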

Thanks again,

Keith

Tom Lane wrote:
Keith Pierno <kpie...@lulu.com> writes:
  
The backup used was from well after the failover time which is why I
was concerned. Interestingly enough the logs are still all prefixed
with 00000004... That just makes this problem extremely bizarre.
    

Hmm, that *is* weird.  It seems like the new primary must have reverted
its decision to go from timeline 4 to timeline 6.  (Which in itself is
a bit odd; why not timeline 5?)

Can you give us an exact sequence of events on the slave server/new
primary around the time of the failover?  Also, what was in the .history
file when you found it, and are there any other history files?

			regards, tom lane
  
