On Mon, Nov 14, 2022 at 12:11 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Mon, Nov 14, 2022 at 11:26 AM Nathan Bossart
> <nathandboss...@gmail.com> wrote:
> > On Sun, Nov 13, 2022 at 05:08:04PM -0500, Tom Lane wrote:
> > > There is something very seriously wrong with this patch.
> > >
> > > On my machine, running "make -j10 check-world" (with compilation
> > > already done) has been taking right about 2 minutes for some time.
> > > Since this patch, it's taking around 2:45 --- I did a bisect run
> > > to confirm that this patch is where it changed.
> >
> > I've been looking into this.  I wrote a similar patch for logical/worker.c
> > before noticing that check-world was taking much longer.  The problem in
> > that case seems to be that process_syncing_tables() isn't called as often.
> > It wouldn't surprise me if there's also something in walreceiver.c that
> > depends upon the frequent wakeups.  I suspect this will require a revert.
>
> In the case of "meson test pg_basebackup/020_pg_receivewal" it looks
> like wait_for_catchup hangs around for 10 seconds waiting for HS
> feedback.  I'm wondering if we need to go back to high frequency
> wakeups until it's caught up, or (probably better) arrange for a
> proper event for progress.  Digging...

Maybe there is a better way to code this (I mean, who likes global
variables?) and I need to test some more, but I suspect the attached
is approximately what we missed.
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 8bd2ba37dd..fed2cc6e6f 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1080,6 +1080,9 @@ XLogWalRcvClose(XLogRecPtr recptr, TimeLineID tli)
 	recvFile = -1;
 }
 
+static XLogRecPtr writePtr = 0;
+static XLogRecPtr flushPtr = 0;
+
 /*
  * Send reply message to primary, indicating our current WAL locations, oldest
  * xmin and the current time.
@@ -1096,8 +1099,6 @@ XLogWalRcvClose(XLogRecPtr recptr, TimeLineID tli)
 static void
 XLogWalRcvSendReply(bool force, bool requestReply)
 {
-	static XLogRecPtr writePtr = 0;
-	static XLogRecPtr flushPtr = 0;
 	XLogRecPtr	applyPtr;
 	TimestampTz now;
 
@@ -1334,6 +1335,9 @@ WalRcvComputeNextWakeup(WalRcvWakeupReason reason, TimestampTz now)
 		case WALRCV_WAKEUP_REPLY:
 			if (wal_receiver_status_interval <= 0)
 				wakeup[reason] = PG_INT64_MAX;
+			else if (writePtr != LogstreamResult.Write ||
+					 flushPtr != LogstreamResult.Flush)
+				wakeup[reason] = now + 100000;	/* frequent replies, not yet caught up */
 			else
 				wakeup[reason] = now + wal_receiver_status_interval * INT64CONST(1000000);
 			break;
