On Tue, Dec 7, 2010 at 12:20 AM, Robert Haas <robertmh...@gmail.com> wrote:
> Yeah.  If we rely on the TCP send buffer filling up, then the amount
> of time the master takes to notice a dead standby is going to be hard
> for the user to predict.  I think the standby ought to send some sort
> of heartbeat and the master should declare the standby dead if it
> doesn't see a heartbeat soon enough.  Maybe the heartbeat could even
> include the receive/fsync/replay LSNs, so that sync rep can use the
> same machinery but with more aggressive policies about when they must
> be sent.
OK. How about keepalive-like parameters and behaviors?

    replication_keepalives_idle
    replication_keepalives_interval
    replication_keepalives_count

The master sends a keepalive packet once replication_keepalives_idle
has elapsed since it received the last ACK packet (which includes the
receive/fsync/replay LSNs) from the standby. On the other hand, the
standby sends an ACK packet back to the master as soon as it receives
the keepalive packet. If the master does not receive the ACK packet
within replication_keepalives_interval, it resends the keepalive
packet and waits for the ACK again, up to
replication_keepalives_count - 1 more times. If no ACK packet has
arrived by then, the master considers the standby dead.

One obvious advantage over my original proposal is that the master
can notice the death of the standby even when there are no WAL
records to send. One drawback is that the standby needs to send some
packets even in asynchronous replication.

Thoughts?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
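
For illustration only, the keepalive timing proposed above can be
sketched as a small state machine. This is a rough model, not
PostgreSQL code: the class KeepaliveMonitor and its methods are
hypothetical names, times are plain seconds, and I am assuming that
the master waits one more replication_keepalives_interval after the
last keepalive before giving up (as TCP keepalive does).

```python
from dataclasses import dataclass


@dataclass
class KeepaliveMonitor:
    """Hypothetical master-side model of the proposed parameters."""
    idle: float       # replication_keepalives_idle (seconds)
    interval: float   # replication_keepalives_interval (seconds)
    count: int        # replication_keepalives_count
    last_ack: float = 0.0     # time the last ACK arrived
    last_probe: float = 0.0   # time the last keepalive was sent
    probes_sent: int = 0      # keepalives sent since the last ACK
    dead: bool = False        # True once the standby is declared dead

    def on_ack(self, now: float) -> None:
        """An ACK (with the LSNs) arrived: reset the probe cycle."""
        self.last_ack = now
        self.probes_sent = 0

    def tick(self, now: float) -> bool:
        """Return True if the master should send a keepalive at `now`."""
        if self.dead:
            return False
        if self.probes_sent == 0:
            # No probe outstanding: wait until the link has been idle.
            if now - self.last_ack >= self.idle:
                self.probes_sent = 1
                self.last_probe = now
                return True
            return False
        if now - self.last_probe < self.interval:
            return False  # still waiting for the ACK to this probe
        if self.probes_sent >= self.count:
            # Assumption: the last probe also gets one full interval.
            self.dead = True
            return False
        self.probes_sent += 1
        self.last_probe = now
        return True

    def worst_case_detection(self) -> float:
        """Seconds from the last ACK until a silent standby is dead."""
        return self.idle + self.count * self.interval
```

Under these assumptions, with idle = 60, interval = 10 and count = 3,
a standby that goes silent right after an ACK is declared dead 90
seconds later, which is predictable for the user regardless of how
much WAL is being sent.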