Hi,

we had another case on IRC today where a user saw Postgres doing crash recovery after an unclean shutdown, was very worried about an "invalid record length at <LSN>: expected at least 24, got 0" message, and went on to reindex all databases. This is also a frequent hit on Stack Overflow etc.

AFAICT this is a normal message when the record length is 0 and we have reached the end of WAL. So I propose to treat a length of 0 as a special case and emit a less scary message. Up until 9.4 we did have a message "record with zero length at <LSN>", but 2c03216d831 ("Revamp the WAL record format") removed it. I propose to reinstate it, see the attached patch.

I guess it would be even nicer if we could hint here that we have likely reached end-of-WAL, but the helper function report_invalid_record() does not take an errhint, and I guess we are too deep in the WAL reader machinery to check for an end-of-WAL condition at that spot.

Michael

From 015a9bfb2bd57f37f7c9601e9014d7560b76c21d Mon Sep 17 00:00:00 2001
From: Michael Banck <mba...@debian.org>
Date: Sat, 30 Nov 2024 11:51:34 +0100
Subject: [PATCH v1] Re-introduce less scary message for possible end-of-WAL invalid record.

A lot of users are worried about messages like "invalid record length at
<LSN>: expected at least 24, got 0" even though for the case of "got 0"
this usually just means that we have reached the end of WAL. So make that
case less scary by re-introducing a "record with zero length at <LSN>"
message for it.

This message was there up until version 9.4, but got removed during a
large WAL record format revamp in 2c03216d831.
---
 src/backend/access/transam/xlogreader.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 0c5e040a94..43ec295ee7 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -662,6 +662,14 @@ restart:
 	}
 	else
 	{
+		/* Record length is zero. */
+		if (total_len == 0)
+		{
+			report_invalid_record(state,
+								  "record with zero length at %X/%X",
+								  LSN_FORMAT_ARGS(RecPtr));
+			goto err;
+		}
 		/* There may be no next page if it's too small. */
 		if (total_len < SizeOfXLogRecord)
 		{
@@ -1128,6 +1136,13 @@ ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
 					  XLogRecPtr PrevRecPtr, XLogRecord *record,
 					  bool randAccess)
 {
+	if (record->xl_tot_len == 0)
+	{
+		report_invalid_record(state,
+							  "record with zero length at %X/%X",
+							  LSN_FORMAT_ARGS(RecPtr));
+		return false;
+	}
 	if (record->xl_tot_len < SizeOfXLogRecord)
 	{
 		report_invalid_record(state,
-- 
2.39.5
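
PS: To make the intended ordering of the two checks easier to see outside of xlogreader.c, here is a minimal standalone sketch. It is not PostgreSQL code: every name prefixed with sketch_ or SKETCH_ is hypothetical, the LSN is modelled as a plain 64-bit integer, and the minimum length of 24 is simply taken from the log message quoted above rather than from SizeOfXLogRecord. The only point it illustrates is that a length of exactly 0 is tested first and gets its own message, while every other too-short length keeps the existing wording.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 24 is taken from the quoted log message, not from PostgreSQL headers. */
#define SKETCH_MIN_RECORD_LEN 24u

/* Hypothetical classification of a record's total length. */
typedef enum
{
	SKETCH_LEN_OK,
	SKETCH_LEN_ZERO,
	SKETCH_LEN_TOO_SHORT
} SketchLenStatus;

/* Test for zero first, so it is never reported as merely "too short". */
static SketchLenStatus
sketch_classify_len(uint32_t total_len)
{
	if (total_len == 0)
		return SKETCH_LEN_ZERO;
	if (total_len < SKETCH_MIN_RECORD_LEN)
		return SKETCH_LEN_TOO_SHORT;
	return SKETCH_LEN_OK;
}

/*
 * Write the message for a bad length into buf and return snprintf()'s
 * result; for an acceptable length, write an empty string and return 0.
 */
static int
sketch_describe_len(char *buf, size_t buflen, uint64_t lsn, uint32_t total_len)
{
	/* Split the LSN into high and low halves for the %X/%X notation. */
	unsigned int hi = (unsigned int) (lsn >> 32);
	unsigned int lo = (unsigned int) (lsn & 0xFFFFFFFFu);

	assert(buf != NULL && buflen > 0);

	switch (sketch_classify_len(total_len))
	{
		case SKETCH_LEN_ZERO:
			return snprintf(buf, buflen,
							"record with zero length at %X/%X", hi, lo);
		case SKETCH_LEN_TOO_SHORT:
			return snprintf(buf, buflen,
							"invalid record length at %X/%X: expected at least %u, got %u",
							hi, lo, SKETCH_MIN_RECORD_LEN,
							(unsigned int) total_len);
		default:
			buf[0] = '\0';
			return 0;
	}
}
```

Calling sketch_describe_len() with a length of 0 yields the "record with zero length" wording, and any length from 1 to 23 yields the existing "invalid record length" wording. The sketch deliberately does not try to decide whether the end of WAL has really been reached; as said above, that does not look feasible at that spot in the reader.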