Ashutosh Bapat noticed that WalSndWaitForWal() is setting
waiting_for_ping_response after sending a keepalive that does *not*
request a reply.  The bad consequence is that other callers that do
require a reply end up in not sending a keepalive, because they think it
was already sent previously.  So the whole thing gets stuck.

He found that commit 41d5f8ad734 failed to remove the setting of
waiting_for_ping_response after changing the "request" parameter
WalSndKeepalive from true to false; that seems to have been an omission
and it breaks the algorithm.  Thread at [1].

The simplest fix is just to remove the line that sets
waiting_for_ping_response, but I think it is less error-prone to have
WalSndKeepalive set the flag itself, instead of expecting its callers to
do it (and know when to).  Patch attached.  Also rewords some related
commentary.

[1] https://postgr.es/m/flat/blu436-smtp25712b7ef9fc2adeb87c522dc...@phx.gbl

-- 
Álvaro Herrera       Valdivia, Chile
>From bf5ca106817744c1ba300ecb8c86d230789988e0 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Thu, 6 Aug 2020 14:57:09 -0400
Subject: [PATCH] Fix waiting_for_ping in walsender

---
 src/backend/replication/walsender.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5e2210dd7b..c239902e0e 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -151,7 +151,7 @@ static XLogRecPtr sendTimeLineValidUpto = InvalidXLogRecPtr;
  * How far have we sent WAL already? This is also advertised in
  * MyWalSnd->sentPtr.  (Actually, this is the next WAL location to send.)
  */
-static XLogRecPtr sentPtr = 0;
+static XLogRecPtr sentPtr = InvalidXLogRecPtr;
 
 /* Buffers for constructing outgoing messages and processing reply messages. */
 static StringInfoData output_message;
@@ -1444,17 +1444,13 @@ WalSndWaitForWal(XLogRecPtr loc)
 		 * We only send regular messages to the client for full decoded
 		 * transactions, but a synchronous replication and walsender shutdown
 		 * possibly are waiting for a later location. So, before sleeping, we
-		 * send a ping containing the flush location. If the receiver is
-		 * otherwise idle, this keepalive will trigger a reply. Processing the
-		 * reply will update these MyWalSnd locations.
+		 * send a ping containing the flush location. A reply from standby is
+		 * not needed and would be wasteful.
 		 */
 		if (MyWalSnd->flush < sentPtr &&
 			MyWalSnd->write < sentPtr &&
 			!waiting_for_ping_response)
-		{
 			WalSndKeepalive(false);
-			waiting_for_ping_response = true;
-		}
 
 		/* check whether we're done */
 		if (loc <= RecentFlushPtr)
@@ -2932,10 +2928,7 @@ WalSndDone(WalSndSendDataCallback send_data)
 		proc_exit(0);
 	}
 	if (!waiting_for_ping_response)
-	{
 		WalSndKeepalive(true);
-		waiting_for_ping_response = true;
-	}
 }
 
 /*
@@ -3432,10 +3425,13 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 }
 
 /*
-  * This function is used to send a keepalive message to standby.
-  * If requestReply is set, sets a flag in the message requesting the standby
-  * to send a message back to us, for heartbeat purposes.
-  */
+ * Send a keepalive message to standby.
+ *
+ * If requestReply is set, the message requests the other party to send
+ * a message back to us, for heartbeat purposes.  We also set a flag to
+ * let nearby code that we're waiting for that response, to avoid
+ * repeated requests.
+ */
 static void
 WalSndKeepalive(bool requestReply)
 {
@@ -3450,6 +3446,10 @@ WalSndKeepalive(bool requestReply)
 
 	/* ... and send it wrapped in CopyData */
 	pq_putmessage_noblock('d', output_message.data, output_message.len);
+
+	/* Set local flag */
+	if (requestReply)
+		waiting_for_ping_response = true;
 }
 
 /*
@@ -3480,7 +3480,6 @@ WalSndKeepaliveIfNecessary(void)
 	if (last_processing >= ping_time)
 	{
 		WalSndKeepalive(true);
-		waiting_for_ping_response = true;
 
 		/* Try to flush pending output to the client */
 		if (pq_flush_if_writable() != 0)
-- 
2.20.1

Reply via email to