On Fri, 26 Sep 2003, Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Could we allow slaves to check if the backend is still alive, perhaps by
> > asking the postmaster, similar to what we do with the cancel signal ---
> > that way, the slave would never time out and always wait if the master
> > was alive.
>
> You're not considering the possibility of a transient communication
> failure.  The fact that you cannot currently contact the other guy
> is not proof that he's not still alive.
>
> Example:
>
>       Master                  Slave
>       ------                  -----
>       commit ready-->
>                               <--OK
>       commit done->XX
>
> where "->XX" means the message gets lost due to network failure.  Now

'k, but isn't a lot of that a "retry" issue?  we're talking TCP here, not
UDP, which I *thought* was designed for transient network problems ... ?

I would think that any implementation would have a timeout/retry GUC
variable associated with it ... 'if no answer in x seconds, retry up to y
times' ... if we are talking two computers sitting next to each other on a
switch, you'd expect those to be low ... but if you were talking about two
separate geographical locations (and yes, I realize you are adding lag to
the mix with waiting for responses), you'd expect those #s to rise ...
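
To make the 'if no answer in x seconds, retry up to y times' idea concrete, here is a rough sketch in Python. Everything in it is hypothetical: retry_request, make_flaky, timeout_s and max_retries are names made up for illustration, not existing GUCs or PostgreSQL code, and the "network" is simulated with a callable that raises TimeoutError. Note that when the retries run out the function returns None, meaning "no answer", which is not the same as proof that the other side is dead.

```python
def retry_request(attempt, timeout_s, max_retries):
    """Call attempt(timeout_s) once, then retry up to max_retries more times.

    Returns the peer's answer, or None if every try failed. None means
    "outcome unknown", not "peer is dead".
    """
    for _ in range(1 + max_retries):
        try:
            return attempt(timeout_s)
        except OSError:
            # TimeoutError and ConnectionError are both OSError subclasses,
            # so lost or timed-out messages land here and we try again.
            continue
    return None


def make_flaky(failures):
    """Build a fake request that loses its first `failures` messages."""
    state = {"left": failures}

    def attempt(timeout_s):
        if state["left"] > 0:
            state["left"] -= 1
            raise TimeoutError("no answer within %s s" % timeout_s)
        return "OK"

    return attempt


# Two lost messages, two retries allowed: the third try gets through.
same_switch = retry_request(make_flaky(2), timeout_s=1.0, max_retries=2)

# Same losses, only one retry allowed: we give up without an answer.
gave_up = retry_request(make_flaky(2), timeout_s=1.0, max_retries=1)
```

For two machines on the same switch you would presumably set both values low; for two separate geographical locations you would raise them.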