I have observed that currently, in case there is a network break between
master and standby, the walsender process gets terminated immediately, whereas
the walreceiver detects the breakage only after a long time.



I could see that there is a replication_timeout configuration parameter;
the walsender checks for replication_timeout and exits once that timeout expires.
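
To illustrate the kind of check I mean, here is a minimal Python sketch of an
idle timeout. It is not the actual walsender code; the class IdleTimeout and
its methods are hypothetical names, and treating a timeout of zero as
"disabled" is an assumption of the sketch.

```python
import time


class IdleTimeout:
    """Hypothetical helper: tracks when the peer was last heard from."""

    def __init__(self, timeout_s, now=None):
        self.timeout_s = timeout_s
        # Start the clock at creation time (or at an injected time for tests).
        self.last_seen = time.monotonic() if now is None else now

    def note_activity(self, now=None):
        """Call whenever any message arrives from the peer."""
        self.last_seen = time.monotonic() if now is None else now

    def expired(self, now=None):
        """True once the peer has been silent for at least timeout_s seconds."""
        now = time.monotonic() if now is None else now
        # Assumption of this sketch: timeout_s <= 0 disables the check.
        return self.timeout_s > 0 and (now - self.last_seen) >= self.timeout_s
```

A process using such a helper would call note_activity() on every message
received and close the connection as soon as expired() returns true.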



Shouldn't there be a similar mechanism for the walreceiver, so that it can
detect a network failure sooner?
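
As one possible mechanism, the receiving side of a TCP connection can ask the
operating system to probe an idle peer (TCP keepalive). The Python sketch
below only illustrates that idea and is not how the walreceiver is
implemented; the function name enable_keepalive and the default values are
hypothetical, and the three tuning options are applied only where the platform
exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, count=3):
    """Hypothetical helper: turn on TCP keepalive probes for a socket."""
    # Ask the OS to send probes when the connection has been idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional tuning; these constants are not available on every platform,
    # so each one is set only if the socket module defines it.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", count),         # failed probes before giving up
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

With the illustrative values above, and where all three options are supported,
a dead peer would be noticed after roughly idle_s + interval_s * count seconds
of silence, instead of whatever the system default is.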


Basic steps to observe the above behavior:
1. Both the master and standby machines are connected normally.
2. Use the command: ifconfig ip down to bring down the network card of the
master and the standby.
Observation:
The master detects that the connection is broken, but the standby does not
detect it and keeps showing the channel as connected for a long time.



Note - Earlier I had sent this to the Hackers list as well. I just wanted to
know whether this is the behavior as defined by PostgreSQL, a bug, or a new
feature in itself.

In case it is not clear, I will raise a bug.

          


With Regards, 
Amit Kapila

 
