On Mon, Dec 20, 2010 at 3:17 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Tue, Dec 7, 2010 at 12:20 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> Yeah. If we rely on the TCP send buffer filling up, then the amount
>> of time the master takes to notice a dead standby is going to be hard
>> for the user to predict. I think the standby ought to send some sort
>> of heartbeat and the master should declare the standby dead if it
>> doesn't see a heartbeat soon enough. Maybe the heartbeat could even
>> include the receive/fsync/replay LSNs, so that sync rep can use the
>> same machinery but with more aggressive policies about when they must
>> be sent.
>
> OK. How about keepalive-like parameters and behaviors?
>
>   replication_keepalives_idle
>   replication_keepalives_interval
>   replication_keepalives_count
>
> The master sends the keepalive packet if replication_keepalives_idle
> elapsed after receiving the last ACK packet including the receive/
> fsync/replay LSNs from the standby. OTOH, the standby sends the
> ACK packet back to the master as soon as receiving the keepalive
> packet.
>
> If the master could not receive the ACK packet for
> replication_keepalives_interval, it repeats sending the keepalive
> packet and receiving the ACK replication_keepalives_count -1
> times. If no ACK packet has finally arrived, the master thinks the
> standby has been dead.

This doesn't really make sense, because you're connecting over a TCP
connection. Once you send the first keepalive, TCP will keep retrying
in some way that we have no control over. If those packets aren't
getting through, adding more data to what has to be transmitted seems
unlikely to do anything useful. I think the parameters we can usefully
set are:

- how long does the master wait before sending a keepalive request?
- how long does the master wait after sending a keepalive before
  declaring the slave dead and closing the connection?

But this can be further simplified. The slave doesn't really need the
master to prompt it to send acknowledgments. It only needs to send
them sufficiently often. As part of the start-replication sequence,
let's have the master tell the slave "send me an acknowledgment at
least every N seconds". And then the slave must do that. The master
then has some value K > N, such that if no acknowledgment is received
for K seconds, the master closes the connection.

The only reason to have the master send explicit keepalive requests
(vs. just telling the slave the interval) is if the master might
request them for some reason other than timer expiration. Since the
main point of this is to detect the situation where the slave has
e.g. power cycled so that the connection is gone but the master
doesn't know it, you could imagine a system where, when a new
replication connection is received, we request keepalives on all of
the existing connections to see if any of them are defunct. But I
don't really think it needs to be quite that complicated.

Another consideration is that you could configure the
keepalive-frequency on the slave and the declare-dead-time on the
master. Then the master wouldn't need to tell the slave the
keepalive-frequency at replication start-up time. But that might also
increase the chances of incompatible settings (e.g.
slave's keepalive frequency is >= master's declare-dead-time), which
would result in a lot of unnecessary reconnects. If both parameters
are configured on the master, then we can enforce that
declare-dead-time > keepalive-frequency.

So I suggest:

replication_keepalive_time - how often the slave is instructed to
send acknowledgments when idle
replication_idle_timeout - the period of inactivity after which the
master closes the connection to the slave

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
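
To make the two-parameter proposal above a bit more concrete, here is a
rough sketch of the bookkeeping it implies. This is illustrative Python,
not PostgreSQL code: the class name StandbyLiveness, the helper
next_ack_deadline, and the use of plain float timestamps are all
assumptions made up for this example. Only the two settings
(keepalive time N, idle timeout K) and the K > N rule come from the
proposal itself.

```python
from dataclasses import dataclass


@dataclass
class StandbyLiveness:
    """Master-side state for one replication connection (hypothetical)."""

    keepalive_time: float      # N: slave must acknowledge at least this often
    idle_timeout: float        # K: master gives up after this much silence
    last_ack_at: float = 0.0   # timestamp of the most recent acknowledgment

    def __post_init__(self) -> None:
        # Both values live on the master, so the K > N rule can be enforced
        # up front instead of being discovered through repeated reconnects.
        if self.keepalive_time <= 0:
            raise ValueError("keepalive_time must be positive")
        if self.idle_timeout <= self.keepalive_time:
            raise ValueError("idle_timeout must exceed keepalive_time")

    def record_ack(self, now: float) -> None:
        """Note that an acknowledgment arrived at time `now`."""
        self.last_ack_at = now

    def is_dead(self, now: float) -> bool:
        """True once more than idle_timeout seconds passed without an ack."""
        return now - self.last_ack_at > self.idle_timeout


def next_ack_deadline(last_sent_at: float, keepalive_time: float) -> float:
    """Slave side: latest time by which the next acknowledgment is due."""
    return last_sent_at + keepalive_time
```

With N = 10 and K = 30, a slave that stays silent for more than 30
seconds after its last acknowledgment would be declared dead, while
trying to set K <= N is rejected when the connection state is created.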