On Sat, Sep 19, 2020 at 11:00:31AM -0500, Justin Pryzby wrote:
> Maybe it's a bad idea if the checkpointer is continuously changing its 
> display.
> I don't see the utility in it, since log_checkpoints does more than ps could
> ever do.  I'm concerned that would break things for someone using something
> like pgrep.

At the end of recovery, there is a code path where the startup process
triggers the checkpoint by itself if the bgwriter is not launched, but
there is also a second code path where, if the bgwriter is started and
if the cluster not promoted, the startup process would request for an
immediate checkpoint and then wait for it.  It is IMO equally
important to update the display of the checkpointer in this case to
show that the checkpointer is running an end-of-recovery checkpoint.

> Related: I have always thought that this message meant "recovery will complete
> Real Soon", but I now understand it to mean "beginning the recovery 
> checkpoint,
> which is flagged CHECKPOINT_IMMEDIATE" (and may take a long time).

Yep.  And at the end of crash recovery seconds feel like minutes.

I agree that "checkpointer checkpoint" is not the best fit.  Using
parenthesis would also be inconsistent with the other usages of this
API in the backend code.  What about adding "running" then?  This
would give "checkpointer running end-of-recovery checkpoint".

While looking at this patch, I got tempted to use a StringInfo to fill
in the string to display as that would make the addition of any extra
information easier, giving the attached.
--
Michael
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8f11b1b9de..e25c29e66c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8714,6 +8714,7 @@ CreateCheckPoint(int flags)
 	XLogRecPtr	last_important_lsn;
 	VirtualTransactionId *vxids;
 	int			nvxids;
+	StringInfoData ps_status;
 
 	/*
 	 * An end-of-recovery checkpoint is really a shutdown checkpoint, just
@@ -8756,6 +8757,21 @@ CreateCheckPoint(int flags)
 	MemSet(&CheckpointStats, 0, sizeof(CheckpointStats));
 	CheckpointStats.ckpt_start_t = GetCurrentTimestamp();
 
+
+	/*
+	 * Set string to use for the ps status display.  This cannot be
+	 * built within a critical section, and the display is set when
+	 * the checkpoint is really starting.
+	 */
+	initStringInfo(&ps_status);
+	appendStringInfoString(&ps_status, "running ");
+
+	if ((flags & CHECKPOINT_END_OF_RECOVERY) != 0)
+		appendStringInfoString(&ps_status, "end-of-recovery ");
+	if ((flags & CHECKPOINT_IS_SHUTDOWN) != 0)
+		appendStringInfoString(&ps_status, "shutdown ");
+	appendStringInfoString(&ps_status, "checkpoint");
+
 	/*
 	 * Use a critical section to force system panic if we have trouble.
 	 */
@@ -8889,6 +8905,13 @@ CreateCheckPoint(int flags)
 	if (log_checkpoints)
 		LogCheckpointStart(flags, false);
 
+	/*
+	 * Set ps display status, step postponed until now as the checkpoint
+	 * could have been skipped.
+	 */
+	set_ps_display(ps_status.data);
+	pfree(ps_status.data);
+
 	TRACE_POSTGRESQL_CHECKPOINT_START(flags);
 
 	/*
@@ -9104,6 +9127,13 @@ CreateCheckPoint(int flags)
 	/* Real work is done, but log and update stats before releasing lock. */
 	LogCheckpointEnd(false);
 
+	/*
+	 * Reset the ps status display.  "idle" or similar is not used here
+	 * as this can be called in the startup process, which could be
+	 * confusing.
+	 */
+	set_ps_display("");
+
 	TRACE_POSTGRESQL_CHECKPOINT_DONE(CheckpointStats.ckpt_bufs_written,
 									 NBuffers,
 									 CheckpointStats.ckpt_segs_added,
@@ -9266,6 +9296,7 @@ CreateRestartPoint(int flags)
 	XLogRecPtr	endptr;
 	XLogSegNo	_logSegNo;
 	TimestampTz xtime;
+	StringInfoData ps_status;
 
 	/*
 	 * Acquire CheckpointLock to ensure only one restartpoint or checkpoint
@@ -9358,6 +9389,16 @@ CreateRestartPoint(int flags)
 	if (log_checkpoints)
 		LogCheckpointStart(flags, true);
 
+	/* Set ps status display for the process running the checkpoint */
+	initStringInfo(&ps_status);
+	appendStringInfoString(&ps_status, "running ");
+	if ((flags & CHECKPOINT_IS_SHUTDOWN) != 0)
+		appendStringInfoString(&ps_status, "shutdown ");
+	appendStringInfoString(&ps_status, "restartpoint");
+
+	set_ps_display(ps_status.data);
+	pfree(ps_status.data);
+
 	CheckPointGuts(lastCheckPoint.redo, flags);
 
 	/*
@@ -9476,6 +9517,13 @@ CreateRestartPoint(int flags)
 	/* Real work is done, but log and update before releasing lock. */
 	LogCheckpointEnd(true);
 
+	/*
+	 * Reset the ps status display.  "idle" or similar is not used here
+	 * as this can be called in the startup process, which could be
+	 * confusing.
+	 */
+	set_ps_display("");
+
 	xtime = GetLatestXTime();
 	ereport((log_checkpoints ? LOG : DEBUG2),
 			(errmsg("recovery restart point at %X/%X",

Attachment: signature.asc
Description: PGP signature

Reply via email to