On Wed, Feb 10, 2021 at 10:02 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> I don't find any problem with this approach as well, but I personally
> feel that the other approach where we don't wait in any API and just
> return the recovery pause state is much simpler and more flexible.  So
> I will make the pending changes in that patch and let's see what are
> the other opinion and based on that we can conclude.  Thanks for the
> patch.

Here is an updated version of the patch, which fixes the last two open problems:
1. In RecoveryRequiresIntParameter, set the recovery pause state inside the
loop, so that if recovery is resumed and a pause is requested again, the
state is set to 'paused' again.
2. If the recovery pause state is already 'paused', don't set it back to
'pause requested'.

One more point: in 'pg_wal_replay_pause' we call WakeupRecovery even
when the state is not changed because it was already 'paused'.  I don't
think there is any problem with that.  If we think this should be
changed, we can make SetRecoveryPause return a bool, so that it returns
false when it doesn't change the state, and in that case we can avoid
calling WakeupRecovery, but I felt that is unnecessary.  Any other
thoughts on this?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From 4f7f3e11c3c6354225ddb80bb06b503e940cd8ff Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Wed, 27 Jan 2021 16:46:04 +0530
Subject: [PATCH v13] Provide a new interface to get the recovery pause status

Currently, pg_is_wal_replay_paused only checks whether a recovery pause
has been requested; it doesn't tell whether recovery is actually
paused.  So there is no way for the user to know the actual status of
the pause request.  This patch provides a new interface,
pg_get_wal_replay_pause_state, that returns the actual status of the
recovery pause, i.e. 'not paused' if pause is not requested, 'pause
requested' if pause is requested but recovery is not yet paused, and
'paused' if recovery is actually paused.
---
 doc/src/sgml/func.sgml                 | 32 +++++++++++--
 src/backend/access/transam/xlog.c      | 86 +++++++++++++++++++++++++++-------
 src/backend/access/transam/xlogfuncs.c | 50 ++++++++++++++++++--
 src/include/access/xlog.h              | 12 ++++-
 src/include/catalog/pg_proc.dat        |  4 ++
 5 files changed, 156 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1ab31a9..9b2c429 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25285,7 +25285,24 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
         <returnvalue>boolean</returnvalue>
        </para>
        <para>
-        Returns true if recovery is paused.
+        Returns true if recovery pause is requested.
+       </para></entry>
+      </row>
+
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>pg_get_wal_replay_pause_state</primary>
+        </indexterm>
+        <function>pg_get_wal_replay_pause_state</function> ()
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Returns recovery pause state.  The return values are
+        <literal>not paused</literal> if pause is not requested,
+        <literal>pause requested</literal> if pause is requested but
+        recovery is not yet paused, and <literal>paused</literal> if the
+        recovery is actually paused.
        </para></entry>
       </row>
 
@@ -25324,10 +25341,15 @@ postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
         <returnvalue>void</returnvalue>
        </para>
        <para>
-        Pauses recovery.  While recovery is paused, no further database
-        changes are applied.  If hot standby is active, all new queries will
-        see the same consistent snapshot of the database, and no further query
-        conflicts will be generated until recovery is resumed.
+        Requests to pause recovery.  A request doesn't mean that recovery stops
+        right away.  If you want a guarantee that recovery is actually paused,
+        you need to check for the recovery pause state returned by
+        <function>pg_get_wal_replay_pause_state()</function>.  Note that
+        <function>pg_is_wal_replay_paused()</function> returns whether a request
+        is made.  While recovery is paused, no further database changes are applied.
+        If hot standby is active, all new queries will see the same consistent
+        snapshot of the database, and no further query conflicts will be generated
+        until recovery is resumed.
        </para>
        <para>
         This function is restricted to superusers by default, but other users
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8e3b5df..459a19b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -721,8 +721,8 @@ typedef struct XLogCtlData
 	 * only relevant for replication or archive recovery
 	 */
 	TimestampTz currentChunkStartTime;
-	/* Are we requested to pause recovery? */
-	bool		recoveryPause;
+	/* Recovery pause state */
+	RecoveryPauseState	recoveryPauseState;
 
 	/*
 	 * lastFpwDisableRecPtr points to the start of the last replayed
@@ -894,6 +894,7 @@ static void validateRecoveryParameters(void);
 static void exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog);
 static bool recoveryStopsBefore(XLogReaderState *record);
 static bool recoveryStopsAfter(XLogReaderState *record);
+static void CheckAndSetRecoveryPause(void);
 static void recoveryPausesHere(bool endOfRecovery);
 static bool recoveryApplyDelay(XLogReaderState *record);
 static void SetLatestXTime(TimestampTz xtime);
@@ -6019,7 +6020,20 @@ recoveryStopsAfter(XLogReaderState *record)
 }
 
 /*
- * Wait until shared recoveryPause flag is cleared.
+ * If recovery pause is requested then set its state as paused.
+ */
+static void
+CheckAndSetRecoveryPause(void)
+{
+	/* If recovery pause is requested then set it paused */
+	SpinLockAcquire(&XLogCtl->info_lck);
+	if (XLogCtl->recoveryPauseState == RECOVERY_PAUSE_REQUESTED)
+		XLogCtl->recoveryPauseState = RECOVERY_PAUSED;
+	SpinLockRelease(&XLogCtl->info_lck);
+}
+
+/*
+ * Wait until shared recoveryPauseState is set to RECOVERY_NOT_PAUSED.
  *
  * endOfRecovery is true if the recovery target is reached and
  * the paused state starts at the end of recovery because of
@@ -6049,34 +6063,58 @@ recoveryPausesHere(bool endOfRecovery)
 				(errmsg("recovery has paused"),
 				 errhint("Execute pg_wal_replay_resume() to continue.")));
 
-	while (RecoveryIsPaused())
+	/* loop until recoveryPauseState is set to RECOVERY_NOT_PAUSED */
+	while (GetRecoveryPauseState() != RECOVERY_NOT_PAUSED)
 	{
 		HandleStartupProcInterrupts();
+
 		if (CheckForStandbyTrigger())
 			return;
 		pgstat_report_wait_start(WAIT_EVENT_RECOVERY_PAUSE);
+
+		/*
+		 * If recovery pause is requested then set it paused.  While we are in
+		 * the loop, user might resume and pause again so set this every time.
+		 */
+		CheckAndSetRecoveryPause();
+
 		pg_usleep(1000000L);	/* 1000 ms */
 		pgstat_report_wait_end();
 	}
 }
 
-bool
-RecoveryIsPaused(void)
+/*
+ * Get the current state of the recovery pause request.
+ */
+RecoveryPauseState
+GetRecoveryPauseState(void)
 {
-	bool		recoveryPause;
+	RecoveryPauseState	state;
 
 	SpinLockAcquire(&XLogCtl->info_lck);
-	recoveryPause = XLogCtl->recoveryPause;
+	state = XLogCtl->recoveryPauseState;
 	SpinLockRelease(&XLogCtl->info_lck);
 
-	return recoveryPause;
+	return state;
 }
 
+/*
+ * Set the recovery pause state.
+ */
 void
-SetRecoveryPause(bool recoveryPause)
+SetRecoveryPause(RecoveryPauseState state)
 {
+	if (state < RECOVERY_NOT_PAUSED || state > RECOVERY_PAUSED)
+		elog(ERROR, "invalid recovery pause state %d", state);
+
+	/*
+	 * If the recovery state is RECOVERY_PAUSED then there is no need to set
+	 * it back to RECOVERY_PAUSE_REQUESTED.
+	 */
 	SpinLockAcquire(&XLogCtl->info_lck);
-	XLogCtl->recoveryPause = recoveryPause;
+	if (XLogCtl->recoveryPauseState != RECOVERY_PAUSED ||
+		state != RECOVERY_PAUSE_REQUESTED)
+		XLogCtl->recoveryPauseState = state;
 	SpinLockRelease(&XLogCtl->info_lck);
 }
 
@@ -6270,14 +6308,14 @@ RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue
 							   currValue,
 							   minValue)));
 
-			SetRecoveryPause(true);
+			SetRecoveryPause(RECOVERY_PAUSED);
 
 			ereport(LOG,
 					(errmsg("recovery has paused"),
 					 errdetail("If recovery is unpaused, the server will shut down."),
 					 errhint("You can then restart the server after making the necessary configuration changes.")));
 
-			while (RecoveryIsPaused())
+			while (GetRecoveryPauseState() != RECOVERY_NOT_PAUSED)
 			{
 				HandleStartupProcInterrupts();
 
@@ -6296,6 +6334,13 @@ RecoveryRequiresIntParameter(const char *param_name, int currValue, int minValue
 					warned_for_promote = true;
 				}
 
+				/*
+				 * If recovery pause is requested then set it paused.  While we
+				 * are in the loop, user might resume and pause again so set
+				 * this every time.
+				 */
+				CheckAndSetRecoveryPause();
+
 				pgstat_report_wait_start(WAIT_EVENT_RECOVERY_PAUSE);
 				pg_usleep(1000000L);	/* 1000 ms */
 				pgstat_report_wait_end();
@@ -7194,7 +7239,7 @@ StartupXLOG(void)
 		XLogCtl->lastReplayedTLI = XLogCtl->replayEndTLI;
 		XLogCtl->recoveryLastXTime = 0;
 		XLogCtl->currentChunkStartTime = 0;
-		XLogCtl->recoveryPause = false;
+		XLogCtl->recoveryPauseState = RECOVERY_NOT_PAUSED;
 		SpinLockRelease(&XLogCtl->info_lck);
 
 		/* Also ensure XLogReceiptTime has a sane value */
@@ -7298,7 +7343,8 @@ StartupXLOG(void)
 				 * otherwise would is a minor issue, so it doesn't seem worth
 				 * adding another spinlock cycle to prevent that.
 				 */
-				if (((volatile XLogCtlData *) XLogCtl)->recoveryPause)
+				if (((volatile XLogCtlData *) XLogCtl)->recoveryPauseState ==
+					RECOVERY_PAUSE_REQUESTED)
 					recoveryPausesHere(false);
 
 				/*
@@ -7323,7 +7369,8 @@ StartupXLOG(void)
 					 * here otherwise pausing during the delay-wait wouldn't
 					 * work.
 					 */
-					if (((volatile XLogCtlData *) XLogCtl)->recoveryPause)
+					if (((volatile XLogCtlData *) XLogCtl)->recoveryPauseState ==
+						RECOVERY_PAUSE_REQUESTED)
 						recoveryPausesHere(false);
 				}
 
@@ -7497,7 +7544,7 @@ StartupXLOG(void)
 						proc_exit(3);
 
 					case RECOVERY_TARGET_ACTION_PAUSE:
-						SetRecoveryPause(true);
+						SetRecoveryPause(RECOVERY_PAUSE_REQUESTED);
 						recoveryPausesHere(true);
 
 						/* drop into promote */
@@ -12624,6 +12671,11 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 				elog(ERROR, "unexpected WAL source %d", currentSource);
 		}
 
+		/* test for recovery pause, if user has requested the pause */
+		if (((volatile XLogCtlData *) XLogCtl)->recoveryPauseState ==
+			RECOVERY_PAUSE_REQUESTED)
+			recoveryPausesHere(false);
+
 		/*
 		 * This possibly-long loop needs to handle interrupts of startup
 		 * process.
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index d8c5bf6..814295c 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -517,7 +517,7 @@ pg_walfile_name(PG_FUNCTION_ARGS)
 }
 
 /*
- * pg_wal_replay_pause - pause recovery now
+ * pg_wal_replay_pause - Request to pause recovery
  *
  * Permission checking for this function is managed through the normal
  * GRANT system.
@@ -538,7 +538,10 @@ pg_wal_replay_pause(PG_FUNCTION_ARGS)
 				 errhint("%s cannot be executed after promotion is triggered.",
 						 "pg_wal_replay_pause()")));
 
-	SetRecoveryPause(true);
+	SetRecoveryPause(RECOVERY_PAUSE_REQUESTED);
+
+	/* wake up the recovery process so that it can process the pause request */
+	WakeupRecovery();
 
 	PG_RETURN_VOID();
 }
@@ -565,7 +568,7 @@ pg_wal_replay_resume(PG_FUNCTION_ARGS)
 				 errhint("%s cannot be executed after promotion is triggered.",
 						 "pg_wal_replay_resume()")));
 
-	SetRecoveryPause(false);
+	SetRecoveryPause(RECOVERY_NOT_PAUSED);
 
 	PG_RETURN_VOID();
 }
@@ -582,7 +585,46 @@ pg_is_wal_replay_paused(PG_FUNCTION_ARGS)
 				 errmsg("recovery is not in progress"),
 				 errhint("Recovery control functions can only be executed during recovery.")));
 
-	PG_RETURN_BOOL(RecoveryIsPaused());
+	PG_RETURN_BOOL(GetRecoveryPauseState() != RECOVERY_NOT_PAUSED);
+}
+
+/*
+ * pg_get_wal_replay_pause_state - Returns the recovery pause state.
+ *
+ * Returned values:
+ *
+ * 'not paused' - if pause is not requested
+ * 'pause requested' - if pause is requested but recovery is not yet paused
+ * 'paused' - if recovery is paused
+ */
+Datum
+pg_get_wal_replay_pause_state(PG_FUNCTION_ARGS)
+{
+	char	*state;
+
+	if (!RecoveryInProgress())
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("recovery is not in progress"),
+				 errhint("Recovery control functions can only be executed during recovery.")));
+
+	/* get the recovery pause state */
+	switch (GetRecoveryPauseState())
+	{
+		case RECOVERY_NOT_PAUSED:
+			state = "not paused";
+			break;
+		case RECOVERY_PAUSE_REQUESTED:
+			state = "pause requested";
+			break;
+		case RECOVERY_PAUSED:
+			state = "paused";
+			break;
+		default:
+			elog(ERROR, "invalid recovery pause state");
+	}
+
+	PG_RETURN_TEXT_P(cstring_to_text(state));
 }
 
 /*
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 75ec107..7533c48 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -174,6 +174,14 @@ typedef enum RecoveryState
 	RECOVERY_STATE_DONE			/* currently in production */
 } RecoveryState;
 
+/* Recovery pause states */
+typedef enum RecoveryPauseState
+{
+	RECOVERY_NOT_PAUSED = 0,		/* pause not requested */
+	RECOVERY_PAUSE_REQUESTED = 1,	/* pause requested, but not yet paused */
+	RECOVERY_PAUSED = 2				/* recovery is paused */
+} RecoveryPauseState;
+
 extern PGDLLIMPORT int wal_level;
 
 /* Is WAL archiving enabled (always or only while server is running normally)? */
@@ -310,8 +318,8 @@ extern void GetXLogReceiptTime(TimestampTz *rtime, bool *fromStream);
 extern XLogRecPtr GetXLogReplayRecPtr(TimeLineID *replayTLI);
 extern XLogRecPtr GetXLogInsertRecPtr(void);
 extern XLogRecPtr GetXLogWriteRecPtr(void);
-extern bool RecoveryIsPaused(void);
-extern void SetRecoveryPause(bool recoveryPause);
+extern RecoveryPauseState GetRecoveryPauseState(void);
+extern void SetRecoveryPause(RecoveryPauseState state);
 extern TimestampTz GetLatestXTime(void);
 extern TimestampTz GetCurrentChunkReplayStartTime(void);
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4e0c9be..a23bf11 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6230,6 +6230,10 @@
   proname => 'pg_is_wal_replay_paused', provolatile => 'v',
   prorettype => 'bool', proargtypes => '',
   prosrc => 'pg_is_wal_replay_paused' },
+{ oid => '1137', descr => 'get wal replay pause state',
+  proname => 'pg_get_wal_replay_pause_state', provolatile => 'v',
+  prorettype => 'text', proargtypes => '',
+  prosrc => 'pg_get_wal_replay_pause_state' },
 
 { oid => '2621', descr => 'reload configuration files',
   proname => 'pg_reload_conf', provolatile => 'v', prorettype => 'bool',
-- 
1.8.3.1
