On Wed, Mar 2, 2011 at 8:22 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> The WALSender deliberately does *not* wake waiting users if the standby
> disconnects. Doing so would break the whole reason for having sync rep
> in the first place. What we do is allow a potential standby to takeover
> the role of sync standby, if one is available. Or the failing standby
> can reconnect and then release waiters.

If there is a potential standby when the synchronous standby has gone,
I agree that it's not a good idea to release the waiting backends
immediately. In this case, those backends should wait for the next
synchronous standby.

On the other hand, if there is no potential standby, I think that the
waiting backends should not wait for the timeout and should wake up as
soon as the synchronous standby has gone. Otherwise, those backends are
suspended for a long time (i.e., until the timeout expires), which would
reduce availability, I'm afraid.

Keeping those backends waiting for the failed standby to reconnect is
one option. But this looks like the behavior for
"allow_standalone_primary = off". If allow_standalone_primary = on, it
seems more natural to let the primary work alone without waiting for
the timeout.

> If we shutdown, then we wait for the shutdown commit record to be
> transferred to our standby, so a normal or fast shutdown of the master
> always leaves all connected standbys up to date. We already do that, so
> sync rep doesn't touch that behaviour. If a standby is disconnected,
> then it doesn't receive the shutdown checkpoint record.
>
> The wait state for a commit does not persist when we shutdown and
> restart.
>
> Can you restate which bits of the above you think need to be changed?

What I'm thinking is: when the waiting backends are released because of
the timeout while a fast shutdown is in progress on the master, those
backends should not return the success indication to the client. Of
course, in that case, the WAL has already been flushed on the master,
but I think that those backends should exit with a FATAL error before
returning success. This is to avoid breaking the synchronous
replication rule, i.e., every transaction that the client knows as
committed must be committed on the synchronous standby after failover.

If we allow those backends to return success in that situation, the
following scenario, which can cause data loss, can happen.

1. The primary is running with allow_standalone_primary = on. There is
   only one (synchronous) standby connected.
2. The replication connection is closed because of a network outage.
3. While some backends are waiting for replication, the user requests a
   fast shutdown of the master.
4. Since the timeout expires, those backends stop waiting and return
   the success indication to the client (but the commits have not been
   replicated to the standby).
5. Since there is no backend waiting for replication, the fast shutdown
   completes.
6. Clusterware such as Pacemaker detects the death of the primary and
   triggers the failover.
7. The new primary doesn't have some transactions that were reported to
   the client as committed, i.e., transactions are lost!!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
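
For illustration only, the release rule argued for in this message can be written down as a small toy model in Python. This is not PostgreSQL code: the names `WaitState`, `Outcome` and `decide` are hypothetical, and treating an immediate standalone release during fast shutdown the same way as a timeout release is my reading of the proposal, not something settled in the thread.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    KEEP_WAITING = "keep waiting for a synchronous standby"
    COMMIT_OK = "return success to the client"
    FATAL = "exit with FATAL before returning success"


@dataclass
class WaitState:
    """Hypothetical snapshot of what a waiting backend can observe."""
    acked_by_sync_standby: bool = False
    sync_standby_connected: bool = True
    potential_standby_available: bool = False
    allow_standalone_primary: bool = True
    timeout_expired: bool = False
    fast_shutdown_in_progress: bool = False


def decide(s: WaitState) -> Outcome:
    # Normal case: the commit reached the synchronous standby.
    if s.acked_by_sync_standby:
        return Outcome.COMMIT_OK

    # Proposed: with no standby at all and standalone operation allowed,
    # release at once instead of sleeping until the timeout.
    standalone_release = (
        not s.sync_standby_connected
        and not s.potential_standby_available
        and s.allow_standalone_primary
    )
    released_without_ack = s.timeout_expired or standalone_release

    if not released_without_ack:
        # Either a potential standby may take over, or standalone
        # operation is off and we wait for the standby to reconnect.
        return Outcome.KEEP_WAITING

    # Proposed: WAL is flushed locally but not replicated. During a fast
    # shutdown, reporting success could lose the commit after failover.
    if s.fast_shutdown_in_progress:
        return Outcome.FATAL
    return Outcome.COMMIT_OK


# Steps 1-4 of the scenario: standby gone, timeout expired, fast shutdown.
scenario = WaitState(sync_standby_connected=False, timeout_expired=True,
                     fast_shutdown_in_progress=True)
print(decide(scenario).value)
```

Under these assumptions, the backends in step 4 would end with a FATAL error rather than a success report, so the client never believes in a commit that the new primary lacks.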