At Wed, 3 Mar 2021 11:14:20 -0500, David Steele <da...@pgmasters.net> wrote in 
> Hi Kyotaro,
> 
> On 3/27/20 10:25 PM, James Coleman wrote:
> > On Thu, Mar 26, 2020 at 12:41 PM Robert Haas <robertmh...@gmail.com>
> > wrote:
> >>
> >> I'm just spitballing here, but it would be really good if there's a
> >> way to know definitely whether or not you should be scared. Corrupted
> >> WAL segments are definitely a thing that happens, but retries are a
> >> lot more common.
> > First, I agree that getting enough context to say precisely is by far
> > the ideal.
> > That being said, as an end user who's found this surprising -- and
> > momentarily scary every time I initially scan it even though I *know
> > intellectually it's not* -- I would find Peter's suggestion a
> > significant improvement over what we have now. I'm fairly certainly my
> > co-workers on our database team would also. Knowing that something is
> > at least not always scary is good. Though I'll grant that this does
> > have the negative in reverse: if it actually is a scary
> > situation...this mutes your concern level. On the other hand,
> > monitoring would tell us if there's a real problem (namely replication
> > lag), so I think the trade-off is clearly worth it.
> > How about this minor tweak:
> > HINT:  This is expected if this is the end of currently available WAL.
> > Otherwise, it could indicate corruption.
> 
> Any thoughts on the suggestions for making the messaging clearer?
> 
> Also, the patch no longer applies:
> http://cfbot.cputube.org/patch_32_2490.log.

Sorry for missing the last discussions.  I agree to the point about
really-scary situation.

ValidXLogRecordHeader deliberately marks End-Of-WAL only in the case
of zero-length record so that the callers can identify that case,
instead of inferring the EOW state without it. All other invalid data
is treated as potentially danger situation.  I think this is a
reasonable classification. And the error level for the "danger" cases
is changed to WARNING (from LOG).


As the result, the following messages are emitted with the attached.

- found zero-length record during recovery (the DETAIL might not be needed.)

> LOG:  redo starts at 0/14000118
> LOG:  reached end of WAL at 0/14C5D070 on timeline 1 in pg_wal during crash 
> recovery
> DETAIL:  record length is 0 at 0/14C5D070
> LOG:  redo done at 0/14C5CF48 system usage: ...

- found another kind of invalid data

> LOG:  redo starts at 0/150000A0
> WARNING:  invalid record length at 0/1500CA60: wanted 24, got 54
> LOG:  redo done at 0/1500CA28 system usage: ...


On the way checking the patch, I found that it emits the following log
lines in the case the redo loop meets an invalid record at the
starting:

> LOG:  invalid record length at 0/10000118: wanted 24, got 42
> LOG:  redo is not required

which doesn't look proper.  That case is identifiable using the
End-of_WAL flag this patch adds. Thus we get the following error
messages.


- found end-of-wal at the beginning of recovery

> LOG:  reached end of WAL at 0/130000A0 on timeline 1 in pg_wal during crash 
> recovery
> DETAIL:  record length is 0 at 0/130000A0
> LOG:  redo is not required

- found invalid data

> WARNING:  invalid record length at 0/120000A0: wanted 24, got 42
> WARNING:  redo is skipped
> HINT:  This suggests WAL file corruption. You might need to check the 
> database.

The logic of ErrRecPtr in ReadRecord may wrong.  I remember having
such an discussion before...

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From f89b5c965d0a49de3c1297bb5edd2dc061951b71 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota....@gmail.com>
Date: Fri, 28 Feb 2020 15:52:58 +0900
Subject: [PATCH v4] Make End-Of-Recovery error less scary

When recovery in any type ends, we see a bit scary error message like
"invalid record length" that suggests something serious is happening.
Make this message less scary as "reached end of WAL".
---
 src/backend/access/transam/xlog.c       | 81 ++++++++++++++++++-------
 src/backend/access/transam/xlogreader.c | 11 ++++
 src/backend/replication/walreceiver.c   | 13 ++--
 src/include/access/xlogreader.h         |  1 +
 4 files changed, 79 insertions(+), 27 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 377afb8732..fbcb8d78b8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4361,12 +4361,15 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 	for (;;)
 	{
 		char	   *errormsg;
+		XLogRecPtr	ErrRecPtr = InvalidXLogRecPtr;
 
 		record = XLogReadRecord(xlogreader, &errormsg);
 		ReadRecPtr = xlogreader->ReadRecPtr;
 		EndRecPtr = xlogreader->EndRecPtr;
 		if (record == NULL)
 		{
+			ErrRecPtr = ReadRecPtr ? ReadRecPtr : EndRecPtr;
+
 			if (readFile >= 0)
 			{
 				close(readFile);
@@ -4374,13 +4377,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something. In
-			 * StandbyMode that only happens if we have been triggered, so we
-			 * shouldn't loop anymore in that case.
+			 * If we met other than end-of-wal, emit the error message right
+			 * now. Otherwise the message if any is shown as a part of the
+			 * end-of-WAL message below.
 			 */
-			if (errormsg)
-				ereport(emode_for_corrupt_record(emode, EndRecPtr),
+			if (!xlogreader->EndOfWAL && errormsg)
+				ereport(emode_for_corrupt_record(emode, ErrRecPtr),
 						(errmsg_internal("%s", errormsg) /* already translated */ ));
 		}
 
@@ -4411,11 +4413,12 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			/* Great, got a record */
 			return record;
 		}
-		else
+
+		/* No valid record available from this source */
+		lastSourceFailed = true;
+
+		if (!fetching_ckpt)
 		{
-			/* No valid record available from this source */
-			lastSourceFailed = true;
-
 			/*
 			 * If archive recovery was requested, but we were still doing
 			 * crash recovery, switch to archive recovery and retry using the
@@ -4428,11 +4431,17 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 			 * we'd have no idea how far we'd have to replay to reach
 			 * consistency.  So err on the safe side and give up.
 			 */
-			if (!InArchiveRecovery && ArchiveRecoveryRequested &&
-				!fetching_ckpt)
+			if (!InArchiveRecovery && ArchiveRecoveryRequested)
 			{
+				/*
+				 * We don't report this as LOG, since we don't stop recovery
+				 * here
+				 */
 				ereport(DEBUG1,
-						(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
+						(errmsg_internal("reached end of WAL at %X/%X on timeline %u in %s during crash recovery, entering archive recovery",
+										 LSN_FORMAT_ARGS(ErrRecPtr),
+										 ThisTimeLineID,
+										 xlogSourceNames[currentSource])));
 				InArchiveRecovery = true;
 				if (StandbyModeRequested)
 					StandbyMode = true;
@@ -4480,12 +4489,33 @@ ReadRecord(XLogReaderState *xlogreader, int emode,
 				continue;
 			}
 
-			/* In standby mode, loop back to retry. Otherwise, give up. */
-			if (StandbyMode && !CheckForStandbyTrigger())
-				continue;
-			else
-				return NULL;
+			/*
+			 *  We reached the end of WAL, show the messages just once at the
+			 *  same LSN.
+			 */
+			if (emode_for_corrupt_record(LOG, ErrRecPtr) == LOG)
+			{
+				char *fmt;
+
+				if (StandbyMode)
+					fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during standby mode");
+				else if (InArchiveRecovery)
+					fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during archive recovery");
+				else
+					fmt = gettext_noop("reached end of WAL at %X/%X on timeline %u in %s during crash recovery");
+
+				ereport(LOG,
+						(errmsg(fmt, LSN_FORMAT_ARGS(EndRecPtr), ThisTimeLineID,
+								xlogSourceNames[currentSource]),
+						 (errormsg ? errdetail_internal("%s", errormsg) : 0)));
+			}
 		}
+
+		/* In standby mode, loop back to retry. Otherwise, give up. */
+		if (StandbyMode && !CheckForStandbyTrigger())
+			continue;
+		else
+			return NULL;
 	}
 }
 
@@ -7227,7 +7257,7 @@ StartupXLOG(void)
 		else
 		{
 			/* just have to read next record after CheckPoint */
-			record = ReadRecord(xlogreader, LOG, false);
+			record = ReadRecord(xlogreader, WARNING, false);
 		}
 
 		if (record != NULL)
@@ -7454,7 +7484,7 @@ StartupXLOG(void)
 				}
 
 				/* Else, try to fetch the next WAL record */
-				record = ReadRecord(xlogreader, LOG, false);
+				record = ReadRecord(xlogreader, WARNING, false);
 			} while (record != NULL);
 
 			/*
@@ -7514,13 +7544,20 @@ StartupXLOG(void)
 
 			InRedo = false;
 		}
-		else
+		else if (xlogreader->EndOfWAL)
 		{
 			/* there are no WAL records following the checkpoint */
 			ereport(LOG,
 					(errmsg("redo is not required")));
 
 		}
+		else
+		{
+			/* broken record found */
+			ereport(WARNING,
+					(errmsg("redo is skipped"),
+					 errhint("This suggests WAL file corruption. You might need to check the database.")));
+		}
 
 		/*
 		 * This check is intentionally after the above log messages that
@@ -12653,7 +12690,7 @@ emode_for_corrupt_record(int emode, XLogRecPtr RecPtr)
 {
 	static XLogRecPtr lastComplaint = 0;
 
-	if (readSource == XLOG_FROM_PG_WAL && emode == LOG)
+	if (readSource == XLOG_FROM_PG_WAL && emode <= WARNING)
 	{
 		if (RecPtr == lastComplaint)
 			emode = DEBUG1;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 42738eb940..dacba32143 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -118,6 +118,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -288,6 +289,7 @@ XLogReadRecord(XLogReaderState *state, char **errormsg)
 	/* reset error state */
 	*errormsg = NULL;
 	state->errormsg_buf[0] = '\0';
+	state->EndOfWAL = false;
 
 	ResetDecoder(state);
 
@@ -689,6 +691,15 @@ ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
 					  XLogRecPtr PrevRecPtr, XLogRecord *record,
 					  bool randAccess)
 {
+	if (record->xl_tot_len == 0)
+	{
+		/* This is strictly not an invalid state, so phrase it as so. */
+		report_invalid_record(state,
+							  "record length is 0 at %X/%X",
+							  (uint32) (RecPtr >> 32), (uint32) RecPtr);
+		state->EndOfWAL = true;
+		return false;
+	}
 	if (record->xl_tot_len < SizeOfXLogRecord)
 	{
 		report_invalid_record(state,
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index e5f8a06fea..2377c58b4c 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -460,12 +460,15 @@ WalReceiverMain(void)
 						else if (len < 0)
 						{
 							ereport(LOG,
-									(errmsg("replication terminated by primary server"),
-									 errdetail("End of WAL reached on timeline %u at %X/%X.",
+									(errmsg("replication terminated by primary server on timeline %u at %X/%X.",
 											   startpointTLI,
-											   LSN_FORMAT_ARGS(LogstreamResult.Write))));
-							endofwal = true;
-							break;
+								LSN_FORMAT_ARGS(LogstreamResult.Write))));
+
+							/*
+							 * we have no longer anything to do on the broken
+							 * connection other than exiting.
+							 */
+							proc_exit(1);
 						}
 						len = walrcv_receive(wrconn, &buf, &wait_fd);
 					}
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index 21d200d3df..0491adfc5b 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -174,6 +174,7 @@ struct XLogReaderState
 	 */
 	XLogRecPtr	ReadRecPtr;		/* start of last record read */
 	XLogRecPtr	EndRecPtr;		/* end+1 of last record read */
+	bool		EndOfWAL;		/* the last attempt was EOW? */
 
 
 	/* ----------------------------------------
-- 
2.27.0

Reply via email to