Re: [HACKERS] pg_basebackup may fail to send feedbacks.

Kyotaro HORIGUCHI Wed, 18 Feb 2015 00:37:13 -0800

Hello, this is the last patch for pg_basebackup/pg_receivexlog on
master (9.5). Preor versions don't have this issue.


4. basebackup_reply_fix_mst_v2.patch
  receivelog.c patch applyable on master.

This is based on the same design with
walrcv_reply_fix_91_v2.patch in the aspect of gettimeofday().

regards,

At Tue, 17 Feb 2015 19:45:19 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
<[email protected]> wrote in 
<[email protected]>
> Thank you for the comment. Three new patches are attached. I
> forgot to give a revision number on the previous patch, but I
> think this is the 2nd version.
> 
> 1. walrcv_reply_fix_94_v2.patch
>    Walreceiver patch applyable on master/REL9_4_STBLE/REL9_3_STABLE
> 
> 2. walrcv_reply_fix_92_v2.patch
>    Walreceiver patch applyable on REL9_2_STABLE
> 
> 3. walrcv_reply_fix_91_v2.patch
>    Walreceiver patch applyable on REL9_1_STABLE
> 
> 
> At Sat, 14 Feb 2015 03:01:22 +0900, Fujii Masao <[email protected]> wrote 
> in <cahgqgwhz_4ywyoka+5ks9s_698adjh8+0hamcsc9wyfh37g...@mail.gmail.com>
> > On Tue, Feb 10, 2015 at 7:48 PM, Kyotaro HORIGUCHI
> > <[email protected]> wrote:
> > >> Introduce the maximum number of records that we can receive and
> > >> process between feedbacks? For example, we can change
> > >> XLogWalRcvSendHSFeedback() so that it's called per at least
> > >> 8 records. I'm not sure what number is good, though...
> > >
> > > At the beginning of the "while(len > 0){if (len > 0){" block,
> > > last_recv_timestamp is always kept up to date (by using
> > > gettimeotda():). So we can use the variable instead of
> > > gettimeofday() in my previous proposal.
> > 
> > Yes, but something like last_recv_timestamp doesn't exist in
> > REL9_1_STABLE. So you don't think that your proposed fix
> > should be back-patched to all supported versions. Right?
> 
> Back to 9.3 has the loop with the same structure. 9.2 is in a bit
> differenct shape but looks to have the same issue. 9.1 and 9.0
> has the same staps with 9.2 but 9.0 doesn't has wal sender
> timeout and 9.1 does not have a variable having current time.
> 
> 9.3 and later: the previous patch works, but revised as your
>   comment.
> 
> 9.2: The time of the last reply is stored in the file-scope
>   variable reply_message, and the current time is stored in
>   walrcv->lastMsgReceiptTime. The timeout is determined using
>   theses variables.
> 
> 9.1: walrcv doesn't have lastMsgReceiptTime. The current time
>   should be acquired explicitly in the innermost loop. I tried
>   calling gettimeofday() once per several loops. The skip count
>   is determined by anticipated worst duration to process a wal
>   record and wal_receiver_status_interval. However, I doubt the
>   necessity to do so.. The value of the worst duration is simply
>   a random guess.
> 
> 9.0: No point to do this kind of fix.
> 
> 
> > > The start time of the timeout could be last_recv_timestamp at the
> > > beginning of the while loop, since it is a bit earlier than the
> > > actual time at the point.
> > 
> > Sounds strange to me. I think that last_recv_timestamp should be
> > compared with the time when the last feedback message was sent,
> > e.g., maybe you can expose sendTime in XLogWalRcvSendReplay()
> > as global variable, and use it to compare with last_recv_timestamp.
> 
> You're right to do exactly right thing, but, as you mentioned,
> exposing sendTime is requied to do so. The solution I proposed is
> the second-best assuming that wal_sender_timeout is recommended
> to be longer enough (several times longer) than
> wal_receiver_status_interval. If no one minds to expose sendTime,
> it is the best solution. The attached patch does it.
> 
> # The added comment in the patch was wrong, though.
> 
> > When we get out of the WAL receive loop due to the timeout check
> > which your patch added, XLogWalRcvFlush() is always executed.
> > I don't think that current WAL file needs to be flushed at that time.
> 
> Thank you for pointing it out. Run XLogWalRcvSendReply(force =
> true) immediately instead of breaking from the loop.
> 
> > > The another solution would be using
> > > RegisterTimeout/enable_timeout_after and it seemed to be work for
> > > me. It can throw out the time counting stuff in the loop we are
> > > talking about and that of XLogWalRcvSendHSFeedback and
> > > XLogWalRcvSendReply, but it might be a bit too large for the
> > > gain.
> > 
> > Yes, sounds overkill.
> 
> However, gettimeofday() for each recv is also a overkill for most
> cases. I'll post the patches for receivelog.c based on the patch
> for 9.1 wal receiver.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

>From cfe01eadd4d8567f4410bccb8334c52fc897c002 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <[email protected]>
Date: Wed, 18 Feb 2015 12:30:58 +0900
Subject: [PATCH] Make pg_bascbackup and pg_receivexlog to keep sending WAL
 receive feedback regularly on heavy load v2.

pg_basebackup and pg_receivexlog fail to send receiver reply message
while they are receiving continuous WAL stream caused by heavy load or
something else. This patch makes them to send reply message even on
such a situation.
---
 src/bin/pg_basebackup/receivelog.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 8caedff..453a047 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -24,6 +24,10 @@
 #include "libpq-fe.h"
 #include "access/xlog_internal.h"
 
+#define WAL_PROCESS_WORST_DURATION 1 /* Anticipated worst duration to process
+									  * wal record in seconds. This is used to
+									  * calculate how often to check the time
+									  * to send reply message */
 
 /* fd and filename for currently open WAL file */
 static int	walfile = -1;
@@ -826,6 +830,9 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline,
 		int			r;
 		int64		now;
 		long		sleeptime;
+		int			reply_timeout_check_interval = 
+			standby_message_timeout / 1000 / WAL_PROCESS_WORST_DURATION;
+		int			loop_count = 0;
 
 		/*
 		 * Check if we should continue streaming, or abort at this point.
@@ -879,6 +886,7 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline,
 		sleeptime = CalculateCopyStreamSleeptime(now, standby_message_timeout,
 												 last_status);
 
+		loop_count = 0;
 		r = CopyStreamReceive(conn, sleeptime, &copybuf);
 		while (r != 0)
 		{
@@ -925,6 +933,23 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline,
 			}
 
 			/*
+			 * Keep sending feedbacks regularly. We check it once per several
+			 * iterations for performance reason.
+			 */
+			if (loop_count++ >= reply_timeout_check_interval)
+			{
+				now = feGetCurrentTimestamp();
+				if (feTimestampDifferenceExceeds(last_status, now,
+												 standby_message_timeout))
+				{
+					if (!sendFeedback(conn, blockpos, now, false))
+						goto error;
+					last_status = now;
+				}
+				loop_count = 0;
+			}
+
+			/*
 			 * Process the received data, and any subsequent data we
 			 * can read without blocking.
 			 */
-- 
2.1.0.GIT

-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup may fail to send feedbacks.

Reply via email to