On Friday, June 14, 2013 2:42 PM Samrat Revagade wrote:
> Hello,
>
> We have already started a discussion on pgsql-hackers for the problem of taking fresh backup during the failback operation here is the link for that:
>
> http://www.postgresql.org/message-id/CAF8Q-Gxg3PQTf71NVECe-6OzRaew5pWhk7yQtb jgwrfu513...@mail.gmail.com
>
> Let me again summarize the problem we are trying to address.
>
> When the master fails, last few WAL files may not reach the standby. But the master may have gone ahead and made changes to its local file system after flushing WAL to the local storage. So master contains some file system level changes that standby does not have. At this point, the data directory of master is ahead of standby's data directory.
>
> Subsequently, the standby will be promoted as new master. Later when the old master wants to be a standby of the new master, it can't just join the setup since there is inconsistency in between these two servers. We need to take the fresh backup from the new master. This can happen in both the synchronous as well as asynchronous replication.
>
> Fresh backup is also needed in case of clean switch-over because in the current HEAD, the master does not wait for the standby to receive all the WAL up to the shutdown checkpoint record before shutting down the connection. Fujii Masao has already submitted a patch to handle clean switch-over case, but the problem is still remaining for failback case.
>
> The process of taking fresh backup is very time consuming when databases are of very big sizes, say several TB's, and when the servers are connected over a relatively slower link. This would break the service level agreement of disaster recovery system. So there is need to improve the process of disaster recovery in PostgreSQL. One way to achieve this is to maintain consistency between master and standby which helps to avoid need of fresh backup.
>
> So our proposal on this problem is that we must ensure that master should not make any file system level changes without confirming that the corresponding WAL record is replicated to the standby.

How will you take care of the extra WAL on the old master during recovery? If it replays WAL which has not reached the new master, it can be a problem.
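
To make the concern concrete, here is a toy sketch of the situation. It is not PostgreSQL code: LSNs are modeled as plain integers, and the names `Server` and `extra_wal_on_old_master` are made up purely for illustration. It only shows which locally flushed WAL records the old master would replay on restart even though the promoted standby never received them.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    flushed_lsn: int  # last WAL position flushed locally (toy integer LSN, an assumption)


def extra_wal_on_old_master(old_master: Server, promoted_standby_lsn: int) -> range:
    """Return the toy LSNs that exist only in the old master's local WAL.

    promoted_standby_lsn is the last WAL position the standby had received
    when it was promoted to be the new master.
    """
    if old_master.flushed_lsn <= promoted_standby_lsn:
        # Nothing extra: the standby had everything the old master flushed.
        return range(0)
    return range(promoted_standby_lsn + 1, old_master.flushed_lsn + 1)


# Old master flushed WAL up to 120, but the standby had only received up to 100.
old_master = Server("old-master", flushed_lsn=120)
extra = extra_wal_on_old_master(old_master, promoted_standby_lsn=100)
print(f"{len(extra)} records ({extra.start}..{extra.stop - 1}) exist only on {old_master.name}")
```

In this toy model, records 101..120 are the ones in question: if the old master replays them during recovery, it applies changes the new master has never seen.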

With Regards,
Amit Kapila.