Re: detailed error message of pg_waldump

Masahiko Sawada Mon, 05 Jul 2021 00:05:21 -0700

On Wed, Jun 16, 2021 at 5:36 PM Kyotaro Horiguchi
<[email protected]> wrote:
>
> Thanks!
>
> At Wed, 16 Jun 2021 16:52:11 +0900, Masahiko Sawada <[email protected]> wrote in
> > On Fri, Jun 4, 2021 at 5:35 PM Kyotaro Horiguchi
> > <[email protected]> wrote:
> > >
> > > In a very common operation of accidentally specifying a recycled
> > > segment, pg_waldump often returns the following obscure message.
> > >
> > > $ pg_waldump 00000001000000000000002D
> > > pg_waldump: fatal: could not find a valid record after 0/2D000000
> > >
> > > The more detailed message is generated internally and we can use it.
> > > That looks like the following.
> > >
> > > $ pg_waldump 00000001000000000000002D
> > > pg_waldump: fatal: unexpected pageaddr 0/24000000 in log segment 00000001000000000000002D, offset 0
> > >
> > > Is it worth doing?
> >
> > Perhaps we need both? The current message describes where the error
> > happened and the message internally generated describes the details.
> > It seems to me that both are useful. For example, if we find an error
> > during XLogReadRecord(), we show both as follows:
> >
> >    if (errormsg)
> >        fatal_error("error in WAL record at %X/%X: %s",
> >                    LSN_FORMAT_ARGS(xlogreader_state->ReadRecPtr),
> >                    errormsg);
>
> Yeah, I thought that it might be a bit verbose and lengthy but actually
> we have another place that does that. One more point is whether we
> have a case where first_record is invalid but errormsg is NULL
> there. WALDumpReadPage immediately exits so we should always have a
> message in that case according to the comment in ReadRecord.
>
> > * We only end up here without a message when XLogPageRead()
> > * failed - in that case we already logged something. In
> > * StandbyMode that only happens if we have been triggered, so we
> > * shouldn't loop anymore in that case.
>
> So that can be an assertion.
>
> Now the message looks like this.
>
> $ pg_waldump /home/horiguti/data/data_work/pg_wal/000000020000000000000010
> pg_waldump: fatal: could not find a valid record after 0/0: unexpected pageaddr 0/9000000 in log segment 000000020000000000000010, offset 0
>


Thank you for updating the patch!

+ *
+ * The returned pointer (or *errormsg) points to an internal buffer that's
+ * valid until the next call to XLogFindNextRecord or XLogReadRecord.
  */

The comment of XLogReadRecord() also has a similar description. Should
we update it as well?
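
To illustrate the lifetime rule that comment describes, here is a minimal
standalone sketch. It does not use the real xlogreader API: MockReader,
mock_read_record() and save_errormsg() are hypothetical names made up for
this example, and the message text only imitates the pg_waldump output above.
It assumes the reader keeps a single internal message buffer, so a caller
that needs the message after the next call has to copy it first.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reader with one internal message buffer. */
typedef struct MockReader
{
    char        errormsg_buf[128];
} MockReader;

/*
 * Pretend to read a record. This mock always fails and reports the reason
 * through *errormsg, which points into the reader's own buffer. The next
 * call overwrites that buffer, so the pointer is only valid until then.
 */
static int
mock_read_record(MockReader *reader, unsigned int pageaddr, char **errormsg)
{
    snprintf(reader->errormsg_buf, sizeof(reader->errormsg_buf),
             "unexpected pageaddr 0/%X", pageaddr);
    *errormsg = reader->errormsg_buf;
    return -1;
}

/*
 * Copy a message that must outlive the next read call. The caller frees the
 * result. A failed read is expected to always supply a message here, hence
 * the assertion.
 */
static char *
save_errormsg(const char *errormsg)
{
    char       *copy;

    assert(errormsg != NULL);
    copy = malloc(strlen(errormsg) + 1);
    if (copy != NULL)
        strcpy(copy, errormsg);
    return copy;
}
```

A caller that reports the error immediately, as pg_waldump does before
exiting, can use the pointer directly and does not need the copy.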

BTW, is this patch registered in the current commitfest? I could not find it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/
