On Fri, Mar 11, 2011 at 8:14 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Mon, Mar 7, 2011 at 8:47 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Sun, Mar 6, 2011 at 11:10 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>> On Sun, Mar 6, 2011 at 5:03 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
>>>>> Why does internal_flush_if_writable compute bufptr differently from
>>>>> internal_flush? And shouldn't it be static?
>>>>>
>>>>> It seems to me that this ought to be refactored so that you don't
>>>>> duplicate so much code.  Maybe static int internal_flush(bool
>>>>> nonblocking).
>>>>>
>>>>> I don't think that the while (bufptr < bufend) loop needs to contain
>>>>> the code to set and clear the nonblocking state.  You could do the
>>>>> whole loop with nonblocking mode turned on and then reenable it just
>>>>> once at the end.  Besides possibly being clearer, that would be more
>>>>> efficient and leave less room for unexpected failures.
>>>>
>>>> All these comments seem to make sense. Will fix. Thanks!
>>>
>>> Done. I attached the updated patch.
>>
>> I rebased the patch against current git master.
>
> I added this replication timeout patch into next CF.
>
> I explain why this feature is required for the future review;
>
> Without this feature, walsender might unexpectedly remain for a while when
> the standby crashes or the network outage happens. TCP keepalive can
> improve this situation to a certain extent, but it's not perfect. Remaining
> walsender can cause some problems.
>
> For example, when hot_standby_feedback is enabled, such a remaining
> walsender would prevent oldest xmin from advancing and interfere with
> vacuuming on the master. For example, when you use synchronous
> replication and walsender in SYNC mode gets stuck, any synchronous
> standby candidate cannot switch to SYNC mode until that walsender exits,
> and all the transactions would pause.
>
> This feature causes walsender to exit when there is no reply from the
> standby before the replication timeout expires. Then we can avoid the
> above problems.
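
For anyone reviewing later, the timeout rule described in the quoted text
boils down to a single comparison. What follows is only a rough sketch of
that idea, not code from the patch: the function name reply_timed_out, the
millisecond units, and the convention that a non-positive timeout disables
the check are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch.  All times are in
 * milliseconds.  Returns true when the standby has been silent for at
 * least timeout_ms, which is the point where the sender would give up
 * and exit.
 */
static bool
reply_timed_out(int64_t last_reply_ms, int64_t now_ms, int64_t timeout_ms)
{
    /* Assumption: the caller reads a clock that never runs backwards. */
    assert(now_ms >= last_reply_ms);

    /* Assumption: a non-positive timeout means the check is disabled. */
    if (timeout_ms <= 0)
        return false;

    return (now_ms - last_reply_ms) >= timeout_ms;
}
```

A sender loop built this way would record the time each time a reply
arrives from the standby, and test the condition each time it wakes up.
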
I think we should consider making this change for 9.1.  This is a real
wart, and it's going to become even more of a problem with sync rep, I
think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company