Re: [HACKERS] Replication server timeout patch

2011-03-31 Thread Heikki Linnakangas
On 31.03.2011 05:46, Fujii Masao wrote: On Wed, Mar 30, 2011 at 10:54 PM, Robert Haas wrote: On Wed, Mar 30, 2011 at 4:08 AM, Fujii Masao wrote: On Wed, Mar 30, 2011 at 5:03 PM, Heikki Linnakangas wrote: On 30.03.2011 10:58, Fujii Masao wrote: On Wed, Mar 30, 2011 at 4:24 PM, Heikki Linn

Re: [HACKERS] Replication server timeout patch

2011-03-30 Thread Fujii Masao
On Wed, Mar 30, 2011 at 10:54 PM, Robert Haas wrote: > On Wed, Mar 30, 2011 at 4:08 AM, Fujii Masao wrote: >> On Wed, Mar 30, 2011 at 5:03 PM, Heikki Linnakangas >> wrote: >>> On 30.03.2011 10:58, Fujii Masao wrote: On Wed, Mar 30, 2011 at 4:24 PM, Heikki Linnakangas  wrote:

Re: [HACKERS] Replication server timeout patch

2011-03-30 Thread Robert Haas
On Wed, Mar 30, 2011 at 4:08 AM, Fujii Masao wrote: > On Wed, Mar 30, 2011 at 5:03 PM, Heikki Linnakangas > wrote: >> On 30.03.2011 10:58, Fujii Masao wrote: >>> >>> On Wed, Mar 30, 2011 at 4:24 PM, Heikki Linnakangas >>>  wrote: >>> +        A value of zero means wait forever.  This parameter c

Re: [HACKERS] Replication server timeout patch

2011-03-30 Thread Fujii Masao
On Wed, Mar 30, 2011 at 5:03 PM, Heikki Linnakangas wrote: > On 30.03.2011 10:58, Fujii Masao wrote: >> >> On Wed, Mar 30, 2011 at 4:24 PM, Heikki Linnakangas >>  wrote: >> +        A value of zero means wait forever.  This parameter can only be >> set in >> >> The first sentence sounds misleadin

Re: [HACKERS] Replication server timeout patch

2011-03-30 Thread Heikki Linnakangas
On 30.03.2011 10:58, Fujii Masao wrote: On Wed, Mar 30, 2011 at 4:24 PM, Heikki Linnakangas wrote: +A value of zero means wait forever. This parameter can only be set in The first sentence sounds misleading. Even if you set the parameter to zero, replication connections can be termina

Re: [HACKERS] Replication server timeout patch

2011-03-30 Thread Fujii Masao
On Wed, Mar 30, 2011 at 4:24 PM, Heikki Linnakangas wrote: >> +       pq_putmessage_noblock('d', msgbuf, 1 + >> sizeof(WalDataMessageHeader) + nbytes); >> >> Don't we need to check the return value of pq_putmessage_noblock? That >> can return EOF when trouble happens (for example the send system c

Re: [HACKERS] Replication server timeout patch

2011-03-30 Thread Heikki Linnakangas
On 29.03.2011 07:55, Fujii Masao wrote: On Mon, Mar 28, 2011 at 7:49 PM, Heikki Linnakangas wrote: pq_flush_if_writable() calls internal_flush() without using PG_TRY block. This seems unsafe because for example pgwin32_waitforsinglesocket() called by secure_write() can throw ERROR. Perhaps i

Re: [HACKERS] Replication server timeout patch

2011-03-29 Thread Fujii Masao
On Wed, Mar 30, 2011 at 1:04 AM, Robert Haas wrote: >> COMMERROR exists to keep us from trying to send an error report down a >> failed socket.  I would assume (perhaps wrongly) that >> walsender/walreceiver don't try to push error reports across the socket >> anyway, only to the postmaster log.  

Re: [HACKERS] Replication server timeout patch

2011-03-29 Thread Robert Haas
On Tue, Mar 29, 2011 at 9:24 AM, Tom Lane wrote: > Fujii Masao writes: >> On Mon, Mar 28, 2011 at 7:49 PM, Heikki Linnakangas >>> Should we use COMMERROR instead of ERROR if we fail to put the socket in the >>> right mode? > >> Maybe. > > COMMERROR exists to keep us from trying to send an error r

Re: [HACKERS] Replication server timeout patch

2011-03-29 Thread Tom Lane
Fujii Masao writes: > On Mon, Mar 28, 2011 at 7:49 PM, Heikki Linnakangas >> Should we use COMMERROR instead of ERROR if we fail to put the socket in the >> right mode? > Maybe. COMMERROR exists to keep us from trying to send an error report down a failed socket. I would assume (perhaps wrongly

Re: [HACKERS] Replication server timeout patch

2011-03-28 Thread Fujii Masao
On Mon, Mar 28, 2011 at 7:49 PM, Heikki Linnakangas wrote: >> pq_flush_if_writable() calls internal_flush() without using PG_TRY block. >> This seems unsafe because for example pgwin32_waitforsinglesocket() >> called by secure_write() can throw ERROR. > > Perhaps it's time to give up on the assump

Re: [HACKERS] Replication server timeout patch

2011-03-28 Thread Heikki Linnakangas
On 24.03.2011 15:24, Fujii Masao wrote: On Wed, Mar 23, 2011 at 7:33 PM, Heikki Linnakangas wrote: I don't much like the API for this. Walsender shouldn't need to know about the details of the FE/BE protocol, pq_putbytes_if_available() seems too low level to be useful. I think a better API wo

Re: [HACKERS] Replication server timeout patch

2011-03-25 Thread Robert Haas
On Wed, Mar 23, 2011 at 6:33 AM, Heikki Linnakangas wrote: > On 16.03.2011 11:11, Fujii Masao wrote: >> >> On Wed, Mar 16, 2011 at 4:49 PM, Fujii Masao >>  wrote: >>> >>> Agreed. I'll change the patch. >> >> Done. I attached the updated patch. > > I don't much like the API for this. Walsender shou

Re: [HACKERS] Replication server timeout patch

2011-03-24 Thread Fujii Masao
On Wed, Mar 23, 2011 at 7:33 PM, Heikki Linnakangas wrote: > I don't much like the API for this. Walsender shouldn't need to know about > the details of the FE/BE protocol, pq_putbytes_if_available() seems too low > level to be useful. > > I think a better API would be to have a non-blocking versi

Re: [HACKERS] Replication server timeout patch

2011-03-23 Thread Heikki Linnakangas
On 16.03.2011 11:11, Fujii Masao wrote: On Wed, Mar 16, 2011 at 4:49 PM, Fujii Masao wrote: Agreed. I'll change the patch. Done. I attached the updated patch. I don't much like the API for this. Walsender shouldn't need to know about the details of the FE/BE protocol, pq_putbytes_if_availa

Re: [HACKERS] Replication server timeout patch

2011-03-16 Thread Fujii Masao
On Wed, Mar 16, 2011 at 4:49 PM, Fujii Masao wrote: > Agreed. I'll change the patch. Done. I attached the updated patch. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center [Attachment: replication_timeout_v6.patch]

Re: [HACKERS] Replication server timeout patch

2011-03-16 Thread Fujii Masao
On Sat, Mar 12, 2011 at 4:34 AM, Robert Haas wrote: > On Fri, Mar 11, 2011 at 8:29 AM, Fujii Masao wrote: >>> I think we should consider making this change for 9.1.  This is a real >>> wart, and it's going to become even more of a problem with sync rep, I >>> think. >> >> Yeah, that's a welcome!

Re: [HACKERS] Replication server timeout patch

2011-03-11 Thread Robert Haas
On Fri, Mar 11, 2011 at 8:29 AM, Fujii Masao wrote: >> I think we should consider making this change for 9.1.  This is a real >> wart, and it's going to become even more of a problem with sync rep, I >> think. > > Yeah, that's a welcome! Please feel free to review the patch. I discussed this with

Re: [HACKERS] Replication server timeout patch

2011-03-11 Thread Bruce Momjian
Fujii Masao wrote: > On Fri, Mar 11, 2011 at 10:18 PM, Robert Haas wrote: > >> I added this replication timeout patch into next CF. > >> > >> I explain why this feature is required for the future review; > >> > >> Without this feature, walsender might unexpectedly remain for a while when > >> the

Re: [HACKERS] Replication server timeout patch

2011-03-11 Thread Fujii Masao
On Fri, Mar 11, 2011 at 10:18 PM, Robert Haas wrote: >> I added this replication timeout patch into next CF. >> >> I explain why this feature is required for the future review; >> >> Without this feature, walsender might unexpectedly remain for a while when >> the standby crashes or the network ou

Re: [HACKERS] Replication server timeout patch

2011-03-11 Thread Robert Haas
On Fri, Mar 11, 2011 at 8:14 AM, Fujii Masao wrote: > On Mon, Mar 7, 2011 at 8:47 PM, Fujii Masao wrote: >> On Sun, Mar 6, 2011 at 11:10 PM, Fujii Masao wrote: >>> On Sun, Mar 6, 2011 at 5:03 PM, Fujii Masao wrote: > Why does internal_flush_if_writable compute bufptr differently from >

Re: [HACKERS] Replication server timeout patch

2011-03-11 Thread Fujii Masao
On Mon, Mar 7, 2011 at 8:47 PM, Fujii Masao wrote: > On Sun, Mar 6, 2011 at 11:10 PM, Fujii Masao wrote: >> On Sun, Mar 6, 2011 at 5:03 PM, Fujii Masao wrote: Why does internal_flush_if_writable compute bufptr differently from internal_flush?  And shouldn't it be static? It s

Re: [HACKERS] Replication server timeout patch

2011-03-06 Thread Fujii Masao
On Sun, Mar 6, 2011 at 3:23 AM, Robert Haas wrote: > On Mon, Feb 28, 2011 at 8:08 AM, Fujii Masao wrote: >> On Sun, Feb 27, 2011 at 11:52 AM, Fujii Masao wrote: There are two things that I think are pretty clear.  If the receiver has wal_receiver_status_interval=0, then we should ignor

Re: [HACKERS] Replication server timeout patch

2011-03-05 Thread Robert Haas
On Mon, Feb 28, 2011 at 8:08 AM, Fujii Masao wrote: > On Sun, Feb 27, 2011 at 11:52 AM, Fujii Masao wrote: >>> There are two things that I think are pretty clear.  If the receiver >>> has wal_receiver_status_interval=0, then we should ignore >>> replication_timeout for that connection. >> >> The

Re: [HACKERS] Replication server timeout patch

2011-02-28 Thread Fujii Masao
On Sun, Feb 27, 2011 at 11:52 AM, Fujii Masao wrote: >> There are two things that I think are pretty clear.  If the receiver >> has wal_receiver_status_interval=0, then we should ignore >> replication_timeout for that connection. > > The patch still doesn't check that wal_receiver_status_interval

Re: [HACKERS] Replication server timeout patch

2011-02-26 Thread Fujii Masao
On Fri, Feb 18, 2011 at 12:10 PM, Robert Haas wrote: > IMHO, that's so broken as to be useless. > > I would really like to have a solution to this problem, though. > Relying on TCP keepalives is weak. Agreed. I updated the replication timeout patch which I submitted before. http://archives.postg

Re: [HACKERS] Replication server timeout patch

2011-02-17 Thread Robert Haas
On Thu, Feb 17, 2011 at 9:10 PM, Fujii Masao wrote: > On Fri, Feb 18, 2011 at 7:55 AM, Josh Berkus wrote: >>> So, in summary, the position is that we have a timeout, but that timeout >>> doesn't work in all cases. But it does work in some, so that seems >>> enough for me to say "let's commit". No

Re: [HACKERS] Replication server timeout patch

2011-02-17 Thread Fujii Masao
On Fri, Feb 18, 2011 at 7:55 AM, Josh Berkus wrote: >> So, in summary, the position is that we have a timeout, but that timeout >> doesn't work in all cases. But it does work in some, so that seems >> enough for me to say "let's commit". Not committing gives us nothing at >> all, which is as much

Re: [HACKERS] Replication server timeout patch

2011-02-17 Thread Simon Riggs
On Thu, 2011-02-17 at 16:42 -0500, Robert Haas wrote: > > > > So, in summary, the position is that we have a timeout, but that timeout > > doesn't work in all cases. But it does work in some, so that seems > > enough for me to say "let's commit". Not committing gives us nothing at > > all, which is

Re: [HACKERS] Replication server timeout patch

2011-02-17 Thread Josh Berkus
> So, in summary, the position is that we have a timeout, but that timeout > doesn't work in all cases. But it does work in some, so that seems > enough for me to say "let's commit". Not committing gives us nothing at > all, which is as much use as a chocolate teapot. Can someone summarize the ca

Re: [HACKERS] Replication server timeout patch

2011-02-17 Thread Robert Haas
On Thu, Feb 17, 2011 at 4:21 PM, Simon Riggs wrote: > On Wed, 2011-02-16 at 11:34 +0900, Fujii Masao wrote: >> On Tue, Feb 15, 2011 at 7:13 AM, Daniel Farina wrote: >> > On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao >> > wrote: >> >> On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina wrote: >> >>>

Re: [HACKERS] Replication server timeout patch

2011-02-17 Thread Simon Riggs
On Wed, 2011-02-16 at 11:34 +0900, Fujii Masao wrote: > On Tue, Feb 15, 2011 at 7:13 AM, Daniel Farina wrote: > > On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao wrote: > >> On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina wrote: > >>> Context diff equivalent attached. > >> > >> Thanks for the patch

Re: [HACKERS] Replication server timeout patch

2011-02-15 Thread Fujii Masao
On Tue, Feb 15, 2011 at 7:13 AM, Daniel Farina wrote: > On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao wrote: >> On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina wrote: >>> Context diff equivalent attached. >> >> Thanks for the patch! >> >> As I said before, the timeout which this patch provides do

Re: [HACKERS] Replication server timeout patch

2011-02-15 Thread Robert Haas
On Mon, Feb 14, 2011 at 5:13 PM, Daniel Farina wrote: > On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao wrote: >> On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina wrote: >>> Context diff equivalent attached. >> >> Thanks for the patch! >> >> As I said before, the timeout which this patch provides do

Re: [HACKERS] Replication server timeout patch

2011-02-14 Thread Simon Riggs
On Mon, 2011-02-14 at 14:13 -0800, Daniel Farina wrote: > On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao wrote: > > On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina wrote: > >> Context diff equivalent attached. > > > > Thanks for the patch! > > > > As I said before, the timeout which this patch prov

Re: [HACKERS] Replication server timeout patch

2011-02-14 Thread Daniel Farina
On Mon, Feb 14, 2011 at 12:48 AM, Fujii Masao wrote: > On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina wrote: >> Context diff equivalent attached. > > Thanks for the patch! > > As I said before, the timeout which this patch provides doesn't work well > when the walsender gets blocked in sending WA

Re: [HACKERS] Replication server timeout patch

2011-02-14 Thread Fujii Masao
On Sat, Feb 12, 2011 at 8:58 AM, Daniel Farina wrote: > Context diff equivalent attached. Thanks for the patch! As I said before, the timeout which this patch provides doesn't work well when the walsender gets blocked in sending WAL. At first, we would need to implement a non-blocking write func

Re: [HACKERS] Replication server timeout patch

2011-02-11 Thread Daniel Farina
On Feb 11, 2011 8:20 PM, "Robert Haas" wrote: > > On Fri, Feb 11, 2011 at 4:38 PM, Robert Haas wrote: > > On Fri, Feb 11, 2011 at 4:30 PM, Heikki Linnakangas > > wrote: > >> On 11.02.2011 22:11, Robert Haas wrote: > >>> > >>> On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina wrote: > > I

Re: [HACKERS] Replication server timeout patch

2011-02-11 Thread Robert Haas
On Fri, Feb 11, 2011 at 4:38 PM, Robert Haas wrote: > On Fri, Feb 11, 2011 at 4:30 PM, Heikki Linnakangas > wrote: >> On 11.02.2011 22:11, Robert Haas wrote: >>> >>> On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina  wrote: I split this out of the synchronous replication patch for independ

Re: [HACKERS] Replication server timeout patch

2011-02-11 Thread Daniel Farina
On Fri, Feb 11, 2011 at 12:11 PM, Robert Haas wrote: > On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina wrote: >> I split this out of the synchronous replication patch for independent >> review. I'm dashing out the door, so I haven't put it on the CF yet or >> anything, but I just wanted to get it

Re: [HACKERS] Replication server timeout patch

2011-02-11 Thread Robert Haas
On Fri, Feb 11, 2011 at 4:30 PM, Heikki Linnakangas wrote: > On 11.02.2011 22:11, Robert Haas wrote: >> >> On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina  wrote: >>> >>> I split this out of the synchronous replication patch for independent >>> review. I'm dashing out the door, so I haven't put it

Re: [HACKERS] Replication server timeout patch

2011-02-11 Thread Heikki Linnakangas
On 11.02.2011 22:11, Robert Haas wrote: On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina wrote: I split this out of the synchronous replication patch for independent review. I'm dashing out the door, so I haven't put it on the CF yet or anything, but I just wanted to get it out there...I'll be ar

[HACKERS] Replication server timeout patch

2011-02-11 Thread Daniel Farina
Hello list, I split this out of the synchronous replication patch for independent review. I'm dashing out the door, so I haven't put it on the CF yet or anything, but I just wanted to get it out there...I'll be around in Not Too Long to finish any other details. -- fdr *** a/doc/src/sgml/config.s

Re: [HACKERS] Replication server timeout patch

2011-02-11 Thread Robert Haas
On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina wrote: > I split this out of the synchronous replication patch for independent > review. I'm dashing out the door, so I haven't put it on the CF yet or > anything, but I just wanted to get it out there...I'll be around in > Not Too Long to finish any
