On Wed, May 25, 2011 at 3:11 PM, Jaime Casanova <ja...@2ndquadrant.com> wrote:
> On Wed, May 25, 2011 at 12:28 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Wed, May 25, 2011 at 2:16 PM, Heikki Linnakangas
>>> By the time the standby has received that message, it might not be caught-up
>>> anymore because new WAL might've been generated in the master already.
>>
>> Right. But, thanks to sync rep, until such a new WAL has been replicated to
>> the standby, the commit of transaction is not visible to the client. So,
>> even if
>> there are some WAL not replicated to the standby, the clusterware can promote
>> the standby safely without any data loss (to the client point of view), I
>> think.
>
> then, you also need to transmit to the standby if it is the current
> sync standby.

Yes. After further thought, we can promote the standby safely only when
the corresponding walsender meets the following conditions:

1. sync_state is "sync"
2. the standby's flush_location is greater than or equal to the smallest
   wait location in the sync rep queue

Together, these conditions guarantee that all the committed transactions
(i.e., those whose "success" indications have been returned to the client)
have been replicated to the standby. Once the above conditions are
satisfied, failover remains safe until sync_state is flipped to "async".

With this logic, walsender needs to check whether failover is safe, and
send a message to the standby according to the result.

One problem is that, when sync_state is flipped to "async", walsender
might perform replication asynchronously before the standby receives the
message indicating that failover is unsafe. In this case, if the master
crashes, the clusterware would wrongly think that failover is safe and
promote the standby, which causes data loss.

To solve this problem, walsender would need to send that message
*synchronously*, i.e., wait for the ACK of the message to arrive from the
standby before actually changing sync_state to "async".

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
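
To make the proposed logic concrete, here is a rough sketch of the
walsender-side bookkeeping in Python. It is only an illustration of the
two conditions and the synchronous "unsafe" notification described above,
not actual walsender code. Everything named here is hypothetical:
FailoverSafetyTracker, send_and_wait_ack, and the use of plain integers
for WAL locations are assumptions made for the example. The sketch also
assumes that an empty sync rep queue is treated conservatively as "not
yet known to be safe", which the proposal itself does not specify.

```python
from typing import Callable, Optional


class FailoverSafetyTracker:
    """Hypothetical model of the per-walsender failover-safety state."""

    def __init__(self, send_and_wait_ack: Callable[[bool], bool]) -> None:
        # send_and_wait_ack(safe) sends the "failover safe/unsafe" message
        # to the standby and returns True once the standby has ACKed it.
        # (Assumed callback; no such function exists in PostgreSQL.)
        self._send_and_wait_ack = send_and_wait_ack
        self.sync_state = "async"
        self.safe = False

    def become_sync(self) -> None:
        # Condition 1 can hold from now on; condition 2 is checked later.
        self.sync_state = "sync"

    def on_standby_flush(self, flush_location: int,
                         smallest_wait_location: Optional[int]) -> None:
        # Once safe, failover stays safe until sync_state flips to "async".
        if self.safe or self.sync_state != "sync":
            return
        # Assumption: with no waiter in the queue we cannot evaluate
        # condition 2, so we stay on the conservative side.
        if smallest_wait_location is None:
            return
        if flush_location >= smallest_wait_location:
            self._send_and_wait_ack(True)
            self.safe = True

    def become_async(self) -> None:
        # Tell the standby that failover is unsafe and wait for its ACK
        # *before* changing sync_state, so the standby can never believe
        # failover is safe while replication is already asynchronous.
        if self.safe:
            if not self._send_and_wait_ack(False):
                raise RuntimeError("standby did not ACK the unsafe message")
            self.safe = False
        self.sync_state = "async"
```

The important part is the ordering in become_async(): the state change
happens only after the ACK has arrived, which closes the window in which
the clusterware could promote a standby that has silently fallen back to
asynchronous replication.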