Hi,

We had another case on IRC today where a user saw Postgres doing crash
recovery after an unclean shutdown, was very worried about an
"invalid record length at <LSN>: expected at least 24, got 0" message,
and went on to reindex all databases. This also comes up frequently on
Stack Overflow etc.

AFAICT this is a normal message in case the record length is 0 and we
have reached end of WAL. So I propose to treat a length of 0 as a
special case and emit a less-scary message. Up until 9.4 we did have a
message "record with zero length at <LSN>", but 2c03216d831 ("Revamp the
WAL record format") removed it. I propose to reinstate it; see the
attached patch.
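
To illustrate the intended ordering of the checks outside of the WAL
reader, here is a minimal standalone sketch. It is not the xlogreader.c
code: check_record_length() and MIN_RECORD_LEN are hypothetical names,
and 24 is only assumed to stand in for SizeOfXLogRecord because that is
the value shown in the message quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: stand-in for SizeOfXLogRecord, taken from "expected at least 24". */
#define MIN_RECORD_LEN 24

/*
 * Hypothetical helper: classify a record length read at LSN lsn_hi/lsn_lo.
 * Returns 1 if the length is plausible, 0 otherwise; on failure a message
 * is written to buf. The zero-length case is tested first so that it gets
 * its own, less alarming wording.
 */
static int
check_record_length(uint32_t total_len, uint32_t lsn_hi, uint32_t lsn_lo,
					char *buf, size_t buflen)
{
	if (total_len == 0)
	{
		/* Usually just means we ran into zeroed space at end of WAL. */
		snprintf(buf, buflen, "record with zero length at %X/%X",
				 (unsigned int) lsn_hi, (unsigned int) lsn_lo);
		return 0;
	}
	if (total_len < MIN_RECORD_LEN)
	{
		/* Non-zero but too short: keep the existing, more specific message. */
		snprintf(buf, buflen,
				 "invalid record length at %X/%X: expected at least %u, got %u",
				 (unsigned int) lsn_hi, (unsigned int) lsn_lo,
				 (unsigned int) MIN_RECORD_LEN, (unsigned int) total_len);
		return 0;
	}
	if (buflen > 0)
		buf[0] = '\0';
	return 1;
}
```

With this ordering, a length of 0 never reaches the "expected at least
24, got 0" wording, while any other too-short length is reported as
before.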

I guess it would be even nicer if we could hint here that we likely
reached end-of-WAL, but the helper function report_invalid_record() does
not take an errhint and I guess we are too deep into the WAL reader
machinery to check for an end-of-WAL condition at that spot.


Michael
From 015a9bfb2bd57f37f7c9601e9014d7560b76c21d Mon Sep 17 00:00:00 2001
From: Michael Banck <mba...@debian.org>
Date: Sat, 30 Nov 2024 11:51:34 +0100
Subject: [PATCH v1] Re-introduce less scary message for possible end-of-WAL
 invalid record.

A lot of users are worried about messages like "invalid record length at
<LSN>: expected at least 24, got 0" even though for the case of "got 0"
this usually just means that we have reached the end of WAL. So make
that case less scary by re-introducing a "record with zero length at
<LSN>" message for it. This message was there up until version 9.4, but
got removed during a large WAL record format revamp in 2c03216d831.
---
 src/backend/access/transam/xlogreader.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 0c5e040a94..43ec295ee7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -662,6 +662,14 @@ restart:
 	}
 	else
 	{
+		/* Record length is zero. */
+		if (total_len == 0)
+		{
+			report_invalid_record(state,
+								  "record with zero length at %X/%X",
+								  LSN_FORMAT_ARGS(RecPtr));
+			goto err;
+		}
 		/* There may be no next page if it's too small. */
 		if (total_len < SizeOfXLogRecord)
 		{
@@ -1128,6 +1136,13 @@ ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
 					  XLogRecPtr PrevRecPtr, XLogRecord *record,
 					  bool randAccess)
 {
+	if (record->xl_tot_len == 0)
+	{
+		report_invalid_record(state,
+							  "record with zero length at %X/%X",
+							  LSN_FORMAT_ARGS(RecPtr));
+		return false;
+	}
 	if (record->xl_tot_len < SizeOfXLogRecord)
 	{
 		report_invalid_record(state,
-- 
2.39.5
