On Wed, Feb 16, 2022 at 9:04 AM Ashutosh Sharma <ashu.coe...@gmail.com> wrote: > I don't think that's the use case of this patch. Unless there is some > other valid reason, I would suggest you remove it.
Removed the function pg_verify_raw_wal_record. Robert and Greg also voted for removal upthread. > > > Should we add a function that returns the pointer to the first and > > > probably the last WAL record in the WAL segment? This would help users > > > to inspect the wal records in the entire wal segment if they wish to > > > do so. > > > > Good point. One can do this already with pg_get_wal_records_info and > > pg_walfile_name_offset. Usually, the LSN format itself can give an > > idea about the WAL file it is in. > > > > postgres=# select lsn, pg_walfile_name_offset(lsn) from > > pg_get_wal_records_info('0/5000000', '0/5FFFFFF') order by lsn asc > > limit 1; > > lsn | pg_walfile_name_offset > > -----------+------------------------------- > > 0/5000038 | (000000010000000000000005,56) > > (1 row) > > > > postgres=# select lsn, pg_walfile_name_offset(lsn) from > > pg_get_wal_records_info('0/5000000', '0/5FFFFFF') order by lsn desc > > limit 1; > > lsn | pg_walfile_name_offset > > -----------+------------------------------------- > > 0/5FFFFC0 | (000000010000000000000005,16777152) > > (1 row) > > > > The workaround you are suggesting is not very user friendly and FYKI > pg_wal_records_info simply hangs at times when we specify the higher > and lower limit of lsn in a wal file. > > To make things easier for the end users I would suggest we add a > function that can return a valid first and last lsn in a walfile. The > output of this function can be used to inspect the wal records in the > entire wal file if they wish to do so and I am sure they will. So, it > should be something like this: > > select first_valid_lsn, last_valid_lsn from > pg_get_first_last_valid_wal_record('wal-segment-name'); > > And above function can directly be used with pg_get_wal_records_info() like > > select > pg_get_wal_records_info(pg_get_first_last_valid_wal_record('wal-segment')); > > I think this is a pretty basic ASK that we expect to be present in the > module like this. Added a new function that returns the first and last valid WAL record LSN of a given WAL file. > > > +PG_FUNCTION_INFO_V1(pg_get_raw_wal_record); > > > +PG_FUNCTION_INFO_V1(pg_get_first_valid_wal_record_lsn); > > > +PG_FUNCTION_INFO_V1(pg_verify_raw_wal_record); > > > +PG_FUNCTION_INFO_V1(pg_get_wal_record_info); > > > +PG_FUNCTION_INFO_V1(pg_get_wal_records_info); > > > > > > I think we should allow all these functions to be executed in wait and > > > *nowait* mode. If a user specifies nowait mode, the function should > > > return if no WAL data is present, rather than waiting for new WAL data > > > to become available, default behaviour could be anything you like. > > > > Currently, pg_walinspect uses read_local_xlog_page which waits in the > > while(1) loop if a future LSN is specified. As read_local_xlog_page is > > an implementation of XLogPageReadCB, which doesn't have a wait/nowait > > parameter, if we really need a wait/nowait mode behaviour, we need to > > do extra things(either add a backend-level global wait variable, set > > before XLogReadRecord, if set, read_local_xlog_page can just exit > > without waiting and reset after the XLogReadRecord or add an extra > > bool wait variable to XLogReaderState and use it in > > read_local_xlog_page). > > > > I am not asking to do any changes in the backend code. Please check - > how pg_waldump does this when a user requests to stop once the endptr > has reached. If not for all functions at least for a few functions we > can do this if it is doable. I've added a new function read_local_xlog_page_2 (similar to read_local_xlog_page but works in wait and no wait mode) and the callers can specify whether to wait or not wait using private_data. Actually, I wanted to use the private_data structure of read_local_xlog_page but the logical decoding already has context as private_data, that is why I had to have a new function. I know it creates a bit of duplicate code, but its cleaner than using backend-local variables or additional flags in XLogReaderState or adding wait/no-wait boolean to page_read callback. Any other suggestions are welcome here. With this, I'm able to have wait/no wait versions for any functions. But for now, I'm having wait/no wait for two functions (pg_get_wal_records_info and pg_get_wal_stats) for which it makes more sense. > > > +Datum > > > +pg_get_wal_records_info(PG_FUNCTION_ARGS) > > > +{ > > > +#define PG_GET_WAL_RECORDS_INFO_COLS 10 > > > > > > > > > We could probably have another variant of this function that would > > > work even if the end pointer is not specified, in which case the > > > default end pointer would be the last WAL record in the WAL segment. > > > Currently it mandates the use of an end pointer which slightly reduces > > > flexibility. > > > > Last WAL record in the WAL segment may not be of much use(one can > > figure out the last valid WAL record in a wal file as mentioned > > above), but the WAL records info till the latest current flush LSN of > > the server would be a useful functionality. But that too, can be found > > using something like "select lsn, prev_lsn, resource_manager from > > pg_get_wal_records_info('0/8099568', pg_current_wal_lsn());" > > > > What if a user wants to inspect all the valid wal records from a > startptr (startlsn) and he doesn't know the endptr? Why should he/she > be mandated to get the endptr and supply it to this function? I don't > think we should force users to do that. I think this is again a very > basic ASK that can be done in this version itself. It is not at all > any advanced thing that we can think of doing in the future. Agreed. Added new functions that emits wal records info/stats till the end of the WAL at the moment. > > > + > > > +/* > > > + * Get the first valid raw WAL record lsn. > > > + */ > > > +Datum > > > +pg_get_first_valid_wal_record_lsn(PG_FUNCTION_ARGS) > > > > > > > > > I think this function should return a pointer to the nearest valid WAL > > > record which can be the previous WAL record to the LSN entered by the > > > user or the next WAL record. If a user unknowingly enters an lsn that > > > does not exist then in such cases we should probably return the lsn of > > > the previous WAL record instead of hanging or waiting for the new WAL > > > record to arrive. > > > > Is it useful? > > It is useful in the same way as returning the next valid wal pointer > is. Why should a user wait for the next valid wal pointer to be > available instead the function should identify the previous valid wal > record and return it and put an appropriate message to the user. > > If there's a strong reason, how about naming > > pg_get_next_valid_wal_record_lsn returning the next valid wal record > > LSN and pg_get_previous_valid_wal_record_lsn returning the previous > > valid wal record LSN ? If you think having two functions is too much, > > then, how about pg_get_first_valid_wal_record_lsn returning both the > > next valid wal record LSN and its previous wal record LSN? > > > > The latter one looks better. Modified. Attaching v5 patch set, please review it further. Regards, Bharath Rupireddy.
From 42e9f32a8579505adc38366a12c8c6d63f18fb83 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 25 Feb 2022 10:16:43 +0000 Subject: [PATCH v5] pg_walinspect --- contrib/Makefile | 1 + contrib/pg_walinspect/.gitignore | 4 + contrib/pg_walinspect/Makefile | 26 + contrib/pg_walinspect/pg_walinspect--1.0.sql | 146 +++ contrib/pg_walinspect/pg_walinspect.c | 1227 ++++++++++++++++++ contrib/pg_walinspect/pg_walinspect.control | 5 + src/backend/access/transam/xlogreader.c | 14 +- src/backend/access/transam/xlogutils.c | 142 ++ src/bin/pg_waldump/pg_waldump.c | 5 + src/common/relpath.c | 18 + src/include/access/xlog.h | 2 +- src/include/access/xlog_internal.h | 2 +- src/include/access/xlogreader.h | 2 - src/include/access/xlogutils.h | 13 + src/include/common/relpath.h | 1 + 15 files changed, 1597 insertions(+), 11 deletions(-) create mode 100644 contrib/pg_walinspect/.gitignore create mode 100644 contrib/pg_walinspect/Makefile create mode 100644 contrib/pg_walinspect/pg_walinspect--1.0.sql create mode 100644 contrib/pg_walinspect/pg_walinspect.c create mode 100644 contrib/pg_walinspect/pg_walinspect.control diff --git a/contrib/Makefile b/contrib/Makefile index e3e221308b..705c6fc36b 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -40,6 +40,7 @@ SUBDIRS = \ pgrowlocks \ pgstattuple \ pg_visibility \ + pg_walinspect \ postgres_fdw \ seg \ spi \ diff --git a/contrib/pg_walinspect/.gitignore b/contrib/pg_walinspect/.gitignore new file mode 100644 index 0000000000..5dcb3ff972 --- /dev/null +++ b/contrib/pg_walinspect/.gitignore @@ -0,0 +1,4 @@ +# Generated subdirectories +/log/ +/results/ +/tmp_check/ diff --git a/contrib/pg_walinspect/Makefile b/contrib/pg_walinspect/Makefile new file mode 100644 index 0000000000..c92a97447f --- /dev/null +++ b/contrib/pg_walinspect/Makefile @@ -0,0 +1,26 @@ +# contrib/pg_walinspect/Makefile + +MODULE_big = pg_walinspect +OBJS = \ + $(WIN32RES) \ + pg_walinspect.o +PGFILEDESC = "pg_walinspect - functions to inspect contents of PostgreSQL Write-Ahead Log" + +PG_CPPFLAGS = -I$(libpq_srcdir) +SHLIB_LINK_INTERNAL = $(libpq) + +EXTENSION = pg_walinspect +DATA = pg_walinspect--1.0.sql + +REGRESS = pg_walinspect + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/pg_walinspect +top_builddir = ../.. +include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/pg_walinspect/pg_walinspect--1.0.sql b/contrib/pg_walinspect/pg_walinspect--1.0.sql new file mode 100644 index 0000000000..d1828841f0 --- /dev/null +++ b/contrib/pg_walinspect/pg_walinspect--1.0.sql @@ -0,0 +1,146 @@ +/* contrib/pg_walinspect/pg_walinspect--1.0.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "CREATE EXTENSION pg_walinspect" to load this file. \quit + +-- +-- pg_get_raw_wal_record() +-- +CREATE FUNCTION pg_get_raw_wal_record(IN in_lsn pg_lsn, + OUT lsn pg_lsn, + OUT record bytea +) +AS 'MODULE_PATHNAME', 'pg_get_raw_wal_record' +LANGUAGE C STRICT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_raw_wal_record(pg_lsn) TO pg_monitor; + +-- +-- pg_get_first_valid_wal_record_lsn() +-- +CREATE FUNCTION pg_get_first_valid_wal_record_lsn(IN in_lsn pg_lsn, + OUT lsn pg_lsn, + OUT prev_lsn pg_lsn +) +AS 'MODULE_PATHNAME', 'pg_get_first_valid_wal_record_lsn' +LANGUAGE C STRICT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_first_valid_wal_record_lsn(pg_lsn) TO pg_monitor; + +-- +-- pg_get_first_and_last_valid_wal_record_lsn() +-- +CREATE FUNCTION pg_get_first_and_last_valid_wal_record_lsn(IN wal_file_name text, + OUT first_valid_lsn pg_lsn, + OUT last_valid_lsn pg_lsn +) +AS 'MODULE_PATHNAME', 'pg_get_first_and_last_valid_wal_record_lsn' +LANGUAGE C STRICT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_first_and_last_valid_wal_record_lsn(text) TO pg_monitor; + +-- +-- pg_get_wal_record_info() +-- +CREATE FUNCTION pg_get_wal_record_info(IN in_lsn pg_lsn, + OUT lsn pg_lsn, + OUT prev_lsn pg_lsn, + OUT xid xid, + OUT resource_manager text, + OUT length int4, + OUT total_length int4, + OUT description text, + OUT block_ref text, + OUT data bytea, + OUT data_len int4 +) +AS 'MODULE_PATHNAME', 'pg_get_wal_record_info' +LANGUAGE C STRICT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_wal_record_info(pg_lsn) TO pg_monitor; + +-- +-- pg_get_wal_records_info() +-- +CREATE FUNCTION pg_get_wal_records_info(IN start_lsn pg_lsn, + IN end_lsn pg_lsn, + IN wait_for_wal boolean DEFAULT false, + OUT lsn pg_lsn, + OUT prev_lsn pg_lsn, + OUT xid xid, + OUT resource_manager text, + OUT length int4, + OUT total_length int4, + OUT description text, + OUT block_ref text, + OUT data bytea, + OUT data_len int4 +) +RETURNS SETOF record +AS 'MODULE_PATHNAME', 'pg_get_wal_records_info' +LANGUAGE C STRICT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_wal_records_info(pg_lsn, pg_lsn, boolean) TO pg_monitor; + +-- +-- pg_get_wal_records_info_till_end_of_wal() +-- +CREATE FUNCTION pg_get_wal_records_info_till_end_of_wal(IN start_lsn pg_lsn, + OUT lsn pg_lsn, + OUT prev_lsn pg_lsn, + OUT xid xid, + OUT resource_manager text, + OUT length int4, + OUT total_length int4, + OUT description text, + OUT block_ref text, + OUT data bytea, + OUT data_len int4 +) +RETURNS SETOF record +AS 'MODULE_PATHNAME', 'pg_get_wal_records_info_till_end_of_wal' +LANGUAGE C STRICT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_wal_records_info_till_end_of_wal(pg_lsn) TO pg_monitor; + +-- +-- pg_get_wal_stats() +-- +CREATE FUNCTION pg_get_wal_stats(IN start_lsn pg_lsn, + IN end_lsn pg_lsn, + IN wait_for_wal boolean DEFAULT false, + OUT resource_manager text, + OUT count int8, + OUT count_percentage float4, + OUT record_size int8, + OUT record_size_percentage float4, + OUT fpi_size int8, + OUT fpi_size_percentage float4, + OUT combined_size int8, + OUT combined_size_percentage float4 +) +RETURNS SETOF record +AS 'MODULE_PATHNAME', 'pg_get_wal_stats' +LANGUAGE C STRICT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_wal_stats(pg_lsn, pg_lsn, boolean) TO pg_monitor; + +-- +-- pg_get_wal_stats_till_end_of_wal() +-- +CREATE FUNCTION pg_get_wal_stats_till_end_of_wal(IN start_lsn pg_lsn, + OUT resource_manager text, + OUT count int8, + OUT count_percentage float4, + OUT record_size int8, + OUT record_size_percentage float4, + OUT fpi_size int8, + OUT fpi_size_percentage float4, + OUT combined_size int8, + OUT combined_size_percentage float4 +) +RETURNS SETOF record +AS 'MODULE_PATHNAME', 'pg_get_wal_stats_till_end_of_wal' +LANGUAGE C STRICT PARALLEL SAFE; + +GRANT EXECUTE ON FUNCTION pg_get_wal_stats_till_end_of_wal(pg_lsn) TO pg_monitor; diff --git a/contrib/pg_walinspect/pg_walinspect.c b/contrib/pg_walinspect/pg_walinspect.c new file mode 100644 index 0000000000..02ea21c860 --- /dev/null +++ b/contrib/pg_walinspect/pg_walinspect.c @@ -0,0 +1,1227 @@ +/*------------------------------------------------------------------------- + * + * pg_walinspect.c + * Functions to inspect contents of PostgreSQL Write-Ahead Log + * + * Copyright (c) 2022, PostgreSQL Global Development Group + * + * IDENTIFICATION + * contrib/pg_walinspect/pg_walinspect.c + * + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "access/xlog.h" +#include "access/xlog_internal.h" +#include "access/xlogreader.h" +#include "access/xlogutils.h" +#include "funcapi.h" +#include "miscadmin.h" +#include "utils/builtins.h" +#include "utils/pg_lsn.h" + +/* + * NOTE: For any code change or issue fix here, it is highly recommended to + * give a thought about doing the same in pg_waldump tool as well. + */ + +PG_MODULE_MAGIC; + +#define MAX_XLINFO_TYPES 16 + +#define IsEndOfWALReached(state) \ + (state->private_data != NULL && \ + (((ReadLocalXLOGPage2Private *) xlogreader->private_data)->no_wait == true) && \ + (((ReadLocalXLOGPage2Private *) xlogreader->private_data)->reached_end_of_wal == true)) + +typedef struct Stats +{ + uint64 count; + uint64 rec_len; + uint64 fpi_len; +} Stats; + +typedef struct XLogRecStats +{ + uint64 count; + Stats rmgr_stats[RM_NEXT_ID]; + Stats record_stats[RM_NEXT_ID][MAX_XLINFO_TYPES]; +} XLogRecStats; + +extern void _PG_init(void); +extern void _PG_fini(void); + +PG_FUNCTION_INFO_V1(pg_get_raw_wal_record); +PG_FUNCTION_INFO_V1(pg_get_first_valid_wal_record_lsn); +PG_FUNCTION_INFO_V1(pg_get_wal_record_info); +PG_FUNCTION_INFO_V1(pg_get_wal_records_info); +PG_FUNCTION_INFO_V1(pg_get_wal_records_info_till_end_of_wal); +PG_FUNCTION_INFO_V1(pg_get_wal_stats); +PG_FUNCTION_INFO_V1(pg_get_wal_stats_till_end_of_wal); +PG_FUNCTION_INFO_V1(pg_get_first_and_last_valid_wal_record_lsn); + +static XLogReaderState *InitXLogReaderState(XLogRecPtr lsn, + XLogRecPtr *first_record, + bool warning, + bool wait_for_wal); +static XLogRecord *ReadNextXLogRecord(XLogReaderState *xlogreader, + XLogRecPtr first_record); +static void GetXLogRecordInfo(XLogReaderState *record, XLogRecPtr lsn, + Datum *values, bool *nulls, + uint32 ncols); +static void StoreXLogRecordStats(XLogRecStats * stats, + XLogReaderState *record); +static void GetXLogSummaryStats(XLogRecStats * stats, + Tuplestorestate *tupstore, TupleDesc tupdesc, + Datum *values, bool *nulls, + uint32 ncols); +static void FillXLogStatsRow(const char *name, uint64 n, uint64 total_count, + uint64 rec_len, uint64 total_rec_len, + uint64 fpi_len, uint64 total_fpi_len, + uint64 tot_len, uint64 total_len, + Datum *values, bool *nulls, uint32 ncols); +static void GetWalStatsInternal(FunctionCallInfo fcinfo, XLogRecPtr start_lsn, + XLogRecPtr end_lsn, bool wait_for_wal); +static void GetWALRecordsInfoInternal(FunctionCallInfo fcinfo, + XLogRecPtr start_lsn, XLogRecPtr end_lsn, + bool wait_for_wal); + +/* + * Module load callback. + */ +void +_PG_init(void) +{ + /* Define custom GUCs and install hooks here, if any. */ + + /* + * Have EmitWarningsOnPlaceholders("pg_walinspect"); if custom GUCs are + * defined. + */ + +} + +/* + * Module unload callback. + */ +void +_PG_fini(void) +{ + /* Uninstall hooks, if any. */ +} + +/* + * Intialize WAL reader and identify first valid LSN. + */ +static XLogReaderState * +InitXLogReaderState(XLogRecPtr lsn, XLogRecPtr *first_record, + bool warning, bool wait_for_wal) +{ + XLogReaderState *xlogreader; + ReadLocalXLOGPage2Private *private_data = NULL; + + /* + * Reading WAL below the first page of the first sgements isn't allowed. + * This is a bootstrap WAL page and the page_read callback fails to read + * it. + */ + if (lsn < XLOG_BLCKSZ) + ereport(ERROR, + (errmsg("could not read WAL at %X/%X", + LSN_FORMAT_ARGS(lsn)))); + + /* + * When told let's not wait for WAL to be available. Fill in the + * private_data so that the page_read callback can return whenever + * end-of-WAL is reached. + */ + if (wait_for_wal == false) + { + private_data = (ReadLocalXLOGPage2Private *) + palloc0(sizeof(ReadLocalXLOGPage2Private)); + + private_data->no_wait = true; + + /* + * page_read callback will set this to true when the end-of-WAL is + * reached. + */ + private_data->reached_end_of_wal = false; + } + + xlogreader = XLogReaderAllocate(wal_segment_size, NULL, + XL_ROUTINE(.page_read = &read_local_xlog_page_2, + .segment_open = &wal_segment_open, + .segment_close = &wal_segment_close), + (void *) private_data); + + if (xlogreader == NULL) + { + if (wait_for_wal == false) + pfree(private_data); + + ereport(ERROR, + (errcode(ERRCODE_OUT_OF_MEMORY), + errmsg("out of memory"), + errdetail("Failed while allocating a WAL reading processor."))); + } + + /* First find a valid recptr to start from. */ + *first_record = XLogFindNextRecord(xlogreader, lsn); + + if (XLogRecPtrIsInvalid(*first_record) && + IsEndOfWALReached(xlogreader)) + { + ereport(WARNING, + (errmsg("reached end of WAL"))); + + return xlogreader; + } + else if (XLogRecPtrIsInvalid(*first_record)) + { + ereport(ERROR, + (errmsg("could not find a valid record after %X/%X", + LSN_FORMAT_ARGS(lsn)))); + } + + /* + * Display a message that we're skipping data if the given LSN wasn't a + * pointer to the start of a record and also wasn't a pointer to the + * beginning of a segment (e.g. we were used in file mode). + */ + if (warning && *first_record != lsn && + XLogSegmentOffset(lsn, wal_segment_size) != 0) + { + ereport(WARNING, + (errmsg_plural("first record is after %X/%X, at %X/%X, skipping over %u byte", + "first record is after %X/%X, at %X/%X, skipping over %u bytes", + (*first_record - lsn), + LSN_FORMAT_ARGS(lsn), + LSN_FORMAT_ARGS(*first_record), + (uint32) (*first_record - lsn)))); + } + + return xlogreader; +} + +/* + * Read next WAL record. + */ +static XLogRecord * +ReadNextXLogRecord(XLogReaderState *xlogreader, XLogRecPtr first_record) +{ + char *errormsg; + XLogRecord *record; + + record = XLogReadRecord(xlogreader, &errormsg); + + if (record == NULL && + IsEndOfWALReached(xlogreader)) + { + ereport(WARNING, + (errmsg("reached end of WAL"))); + + return record; + } + else if (record == NULL) + { + if (errormsg) + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not read WAL at %X/%X: %s", + LSN_FORMAT_ARGS(first_record), errormsg))); + else + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not read WAL at %X/%X", + LSN_FORMAT_ARGS(first_record)))); + } + + return record; +} + +/* + * Get raw WAL record. + * + * XXX: Currently, this function will not wait if the future WAL LSN is + * provided. We can have a wait version of it by using + * ReadLocalXLOGPage2Private. + */ +Datum +pg_get_raw_wal_record(PG_FUNCTION_ARGS) +{ +#define PG_GET_RAW_WAL_RECORD_COLS 2 + XLogRecPtr lsn; + XLogRecord *record; + XLogRecPtr first_record; + XLogReaderState *xlogreader; + bytea *raw_record; + uint32 rec_len; + char *raw_record_data; + TupleDesc tupdesc; + Datum result; + HeapTuple tuple; + Datum values[PG_GET_RAW_WAL_RECORD_COLS]; + bool nulls[PG_GET_RAW_WAL_RECORD_COLS]; + int i = 0; + + lsn = PG_GETARG_LSN(0); + + /* Validate input. */ + if (XLogRecPtrIsInvalid(lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record LSN"))); + + /* Build a tuple descriptor for our result type. */ + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + elog(ERROR, "return type must be a row type"); + + xlogreader = InitXLogReaderState(lsn, &first_record, true, false); + + Assert(xlogreader); + + if (XLogRecPtrIsInvalid(first_record) && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + + PG_RETURN_NULL(); + } + + record = ReadNextXLogRecord(xlogreader, first_record); + + if (record == NULL && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + + PG_RETURN_NULL(); + } + + rec_len = XLogRecGetTotalLen(xlogreader); + + Assert(rec_len > 0); + + raw_record = (bytea *) palloc(rec_len + VARHDRSZ); + SET_VARSIZE(raw_record, rec_len + VARHDRSZ); + raw_record_data = VARDATA(raw_record); + + memcpy(raw_record_data, record, rec_len); + + if (xlogreader->private_data) + pfree(xlogreader->private_data); + + XLogReaderFree(xlogreader); + + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); + + values[i++] = LSNGetDatum(first_record); + values[i++] = PointerGetDatum(raw_record); + + Assert(i == PG_GET_RAW_WAL_RECORD_COLS); + + tuple = heap_form_tuple(tupdesc, values, nulls); + result = HeapTupleGetDatum(tuple); + + PG_RETURN_DATUM(result); +#undef PG_GET_RAW_WAL_RECORD_COLS +} + +/* + * Get first valid raw WAL record LSN following the given LSN. + * + * XXX: Currently, this function will not wait if the future WAL LSN is + * provided. We can have a wait version of it by using + * ReadLocalXLOGPage2Private. + */ +Datum +pg_get_first_valid_wal_record_lsn(PG_FUNCTION_ARGS) +{ +#define PG_GET_FIRST_VALID_WAL_RECORD_LSN_COLS 2 + XLogRecPtr lsn; + XLogRecord *record; + XLogRecPtr first_record; + XLogRecPtr prev_record; + XLogReaderState *xlogreader; + TupleDesc tupdesc; + Datum result; + HeapTuple tuple; + Datum values[PG_GET_FIRST_VALID_WAL_RECORD_LSN_COLS]; + bool nulls[PG_GET_FIRST_VALID_WAL_RECORD_LSN_COLS]; + int i = 0; + + lsn = PG_GETARG_LSN(0); + + /* Validate input. */ + if (XLogRecPtrIsInvalid(lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record LSN"))); + + /* Build a tuple descriptor for our result type. */ + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + elog(ERROR, "return type must be a row type"); + + xlogreader = InitXLogReaderState(lsn, &first_record, true, false); + + Assert(xlogreader); + + if (XLogRecPtrIsInvalid(first_record) && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + + PG_RETURN_NULL(); + } + + record = ReadNextXLogRecord(xlogreader, first_record); + + if (record == NULL && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + + PG_RETURN_NULL(); + } + + prev_record = XLogRecGetPrev(xlogreader); + + if (xlogreader->private_data) + pfree(xlogreader->private_data); + + XLogReaderFree(xlogreader); + + /* + * Previous valid WAL record must be at an LSN lower than next valid WAL + * record LSN. Otherwise, it is an indication of something wrong, so error + * out. + */ + if (prev_record >= first_record) + ereport(ERROR, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg("record with incorrect prev-link %X/%X at %X/%X", + LSN_FORMAT_ARGS(prev_record), + LSN_FORMAT_ARGS(first_record)))); + + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); + + values[i++] = LSNGetDatum(first_record); + values[i++] = LSNGetDatum(prev_record); + + Assert(i == PG_GET_FIRST_VALID_WAL_RECORD_LSN_COLS); + + tuple = heap_form_tuple(tupdesc, values, nulls); + result = HeapTupleGetDatum(tuple); + + PG_RETURN_DATUM(result); +#undef PG_GET_FIRST_VALID_WAL_RECORD_LSN_COLS +} + +/* + * Get first and last valid record LSN in a WAL file. + */ +Datum +pg_get_first_and_last_valid_wal_record_lsn(PG_FUNCTION_ARGS) +{ +#define PG_GET_FIRST_AND_LAST_VALID_WAL_RECORD_LSN_COLS 2 + char *fname; + uint32 tli; + XLogSegNo segno; + XLogRecPtr lsn; + XLogRecPtr first_record; + XLogRecPtr last_record = InvalidXLogRecPtr; + XLogReaderState *xlogreader; + TupleDesc tupdesc; + Datum result; + HeapTuple tuple; + Datum values[PG_GET_FIRST_AND_LAST_VALID_WAL_RECORD_LSN_COLS]; + bool nulls[PG_GET_FIRST_AND_LAST_VALID_WAL_RECORD_LSN_COLS]; + int i = 0; + + fname = text_to_cstring(PG_GETARG_TEXT_PP(0)); + + /* Validate input. */ + if (!IsXLogFileName(fname)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL file name"))); + + /* Looks like an xlog file. Parse its position. */ + XLogFromFileName(fname, &tli, &segno, wal_segment_size); + + XLogSegNoOffsetToRecPtr(segno, 0, wal_segment_size, lsn); + + /* Build a tuple descriptor for our result type. */ + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + elog(ERROR, "return type must be a row type"); + + xlogreader = InitXLogReaderState(lsn, &first_record, true, false); + + Assert(xlogreader); + + if (XLogRecPtrIsInvalid(first_record) && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + + PG_RETURN_NULL(); + } + + for (;;) + { + XLogRecord *record; + char cfname[MAXFNAMELEN]; + + record = ReadNextXLogRecord(xlogreader, first_record); + + /* Exit loop if end-of-WAL is reached. */ + if (record == NULL && + IsEndOfWALReached(xlogreader)) + break; + + XLByteToSeg(xlogreader->currRecPtr, segno, wal_segment_size); + XLogFileName(cfname, tli, segno, wal_segment_size); + + if (!IsXLogFileName(cfname)) + continue; + + if (strncmp(cfname, fname, XLOG_FNAME_LEN) == 0) + { + /* + * Given WAL file name and current record's WAL file name matches. + * The current record might be the last valid record. + */ + last_record = xlogreader->currRecPtr; + } + else + { + /* + * Given WAL file name and current record's WAL file name doesn't + * match, so exit the loop. last_record contains the last valid + * record in the given WAL file. + */ + break; + } + } + + if (xlogreader->private_data) + pfree(xlogreader->private_data); + + XLogReaderFree(xlogreader); + + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); + + values[i++] = LSNGetDatum(first_record); + values[i++] = LSNGetDatum(last_record); + + Assert(i == PG_GET_FIRST_AND_LAST_VALID_WAL_RECORD_LSN_COLS); + + tuple = heap_form_tuple(tupdesc, values, nulls); + result = HeapTupleGetDatum(tuple); + + PG_RETURN_DATUM(result); +#undef PG_GET_FIRST_AND_LAST_VALID_WAL_RECORD_LSN_COLS +} + +/* + * Calculate size of a record, split into !FPI and FPI parts. + */ +static void +GetXLogRecordLen(XLogReaderState *record, uint32 *rec_len, uint32 *fpi_len) +{ + int block_id; + + /* + * Calculate the amount of FPI data in the record. + * + * XXX: We peek into xlogreader's private decoded backup blocks for the + * bimg_len indicating the length of FPI data. It doesn't seem worth it to + * add an accessor macro for this. + */ + *fpi_len = 0; + for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + if (XLogRecHasBlockImage(record, block_id)) + *fpi_len += record->blocks[block_id].bimg_len; + } + + /* + * Calculate the length of the record as the total length - the length of + * all the block images. + */ + *rec_len = XLogRecGetTotalLen(record) - *fpi_len; +} + +/* + * Get WAL record info. + */ +static void +GetXLogRecordInfo(XLogReaderState *record, XLogRecPtr lsn, + Datum *values, bool *nulls, uint32 ncols) +{ + const char *id; + const RmgrData *desc; + uint32 rec_len; + uint32 fpi_len; + RelFileNode rnode; + ForkNumber forknum; + BlockNumber blk; + int block_id; + StringInfoData rec_desc; + StringInfoData rec_blk_ref; + StringInfoData temp; + bytea *data; + char *main_data; + uint32 main_data_len; + int i = 0; + + desc = &RmgrTable[XLogRecGetRmid(record)]; + + GetXLogRecordLen(record, &rec_len, &fpi_len); + + values[i++] = LSNGetDatum(lsn); + values[i++] = LSNGetDatum(XLogRecGetPrev(record)); + values[i++] = TransactionIdGetDatum(XLogRecGetXid(record)); + values[i++] = CStringGetTextDatum(desc->rm_name); + values[i++] = UInt32GetDatum(rec_len); + values[i++] = UInt32GetDatum(XLogRecGetTotalLen(record)); + + initStringInfo(&rec_desc); + + id = desc->rm_identify(XLogRecGetInfo(record)); + + if (id == NULL) + appendStringInfo(&rec_desc, "UNKNOWN (%x) ", XLogRecGetInfo(record) & ~XLR_INFO_MASK); + else + appendStringInfo(&rec_desc, "%s ", id); + + initStringInfo(&temp); + + desc->rm_desc(&temp, record); + appendStringInfo(&rec_desc, "%s", temp.data); + + values[i++] = CStringGetTextDatum(rec_desc.data); + + pfree(temp.data); + + initStringInfo(&rec_blk_ref); + + /* Block references (detailed format). */ + for (block_id = 0; block_id <= record->max_block_id; block_id++) + { + if (!XLogRecHasBlockRef(record, block_id)) + continue; + + XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blk); + + if (forknum != MAIN_FORKNUM) + { + appendStringInfo(&rec_blk_ref, + "blkref #%u: rel %u/%u/%u fork %s blk %u", + block_id, rnode.spcNode, rnode.dbNode, + rnode.relNode, get_forkname(forknum), blk); + } + else + { + appendStringInfo(&rec_blk_ref, + "blkref #%u: rel %u/%u/%u blk %u", + block_id, rnode.spcNode, rnode.dbNode, + rnode.relNode, blk); + } + + if (XLogRecHasBlockImage(record, block_id)) + { + uint8 bimg_info = record->blocks[block_id].bimg_info; + + if (BKPIMAGE_COMPRESSED(bimg_info)) + { + const char *method; + + if ((bimg_info & BKPIMAGE_COMPRESS_PGLZ) != 0) + method = "pglz"; + else if ((bimg_info & BKPIMAGE_COMPRESS_LZ4) != 0) + method = "lz4"; + else + method = "unknown"; + + appendStringInfo(&rec_blk_ref, " (FPW%s); hole: offset: %u, length: %u, " + "compression saved: %u, method: %s", + XLogRecBlockImageApply(record, block_id) ? + "" : " for WAL verification", + record->blocks[block_id].hole_offset, + record->blocks[block_id].hole_length, + BLCKSZ - + record->blocks[block_id].hole_length - + record->blocks[block_id].bimg_len, + method); + } + else + { + appendStringInfo(&rec_blk_ref, " (FPW%s); hole: offset: %u, length: %u", + XLogRecBlockImageApply(record, block_id) ? + "" : " for WAL verification", + record->blocks[block_id].hole_offset, + record->blocks[block_id].hole_length); + } + } + } + + values[i++] = CStringGetTextDatum(rec_blk_ref.data); + + main_data_len = XLogRecGetDataLen(record); + + data = (bytea *) palloc(main_data_len + VARHDRSZ); + SET_VARSIZE(data, main_data_len + VARHDRSZ); + main_data = VARDATA(data); + + memcpy(main_data, XLogRecGetData(record), main_data_len); + + values[i++] = PointerGetDatum(data); + values[i++] = UInt32GetDatum(main_data_len); + + Assert(i == ncols); +} + +/* + * Get the WAL record info and data. + * + * XXX: Currently, this function will not wait if the future WAL LSN is + * provided. We can have a wait version of it by using + * ReadLocalXLOGPage2Private. + */ +Datum +pg_get_wal_record_info(PG_FUNCTION_ARGS) +{ +#define PG_GET_WAL_RECORD_INFO_COLS 10 + XLogRecPtr lsn; + XLogRecord *record; + XLogRecPtr first_record; + XLogReaderState *xlogreader; + TupleDesc tupdesc; + Datum result; + HeapTuple tuple; + Datum values[PG_GET_WAL_RECORD_INFO_COLS]; + bool nulls[PG_GET_WAL_RECORD_INFO_COLS]; + + lsn = PG_GETARG_LSN(0); + + /* Validate input. */ + if (XLogRecPtrIsInvalid(lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record LSN"))); + + /* Build a tuple descriptor for our result type. */ + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + elog(ERROR, "return type must be a row type"); + + xlogreader = InitXLogReaderState(lsn, &first_record, true, false); + + Assert(xlogreader); + + if (XLogRecPtrIsInvalid(first_record) && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + + PG_RETURN_NULL(); + } + + record = ReadNextXLogRecord(xlogreader, first_record); + + if (record == NULL && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + + PG_RETURN_NULL(); + } + + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); + + GetXLogRecordInfo(xlogreader, first_record, values, nulls, + PG_GET_WAL_RECORD_INFO_COLS); + + if (xlogreader->private_data) + pfree(xlogreader->private_data); + + XLogReaderFree(xlogreader); + + tuple = heap_form_tuple(tupdesc, values, nulls); + result = HeapTupleGetDatum(tuple); + + PG_RETURN_DATUM(result); +#undef PG_GET_WAL_RECORD_INFO_COLS +} + +/* + * Get info and data of all WAL records between start LSN and end LSN. + */ +static void +GetWALRecordsInfoInternal(FunctionCallInfo fcinfo, XLogRecPtr start_lsn, + XLogRecPtr end_lsn, bool wait_for_wal) +{ +#define PG_GET_WAL_RECORDS_INFO_COLS 10 + XLogRecPtr first_record; + XLogReaderState *xlogreader; + ReturnSetInfo *rsinfo; + TupleDesc tupdesc; + Tuplestorestate *tupstore; + MemoryContext per_query_ctx; + MemoryContext oldcontext; + Datum values[PG_GET_WAL_RECORDS_INFO_COLS]; + bool nulls[PG_GET_WAL_RECORDS_INFO_COLS]; + + rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; + + /* Check to see if caller supports us returning a tuplestore. */ + if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("set-valued function called in context that cannot accept a set"))); + if (!(rsinfo->allowedModes & SFRM_Materialize)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("materialize mode required, but it is not allowed in this context"))); + + /* Build a tuple descriptor for our result type. */ + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + elog(ERROR, "return type must be a row type"); + + /* Build tuplestore to hold the result rows. */ + per_query_ctx = rsinfo->econtext->ecxt_per_query_memory; + oldcontext = MemoryContextSwitchTo(per_query_ctx); + + tupstore = tuplestore_begin_heap(true, false, work_mem); + rsinfo->returnMode = SFRM_Materialize; + rsinfo->setResult = tupstore; + rsinfo->setDesc = tupdesc; + + MemoryContextSwitchTo(oldcontext); + + xlogreader = InitXLogReaderState(start_lsn, &first_record, true, + wait_for_wal); + + Assert(xlogreader); + + if (wait_for_wal == false && + XLogRecPtrIsInvalid(first_record) && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + tuplestore_end(tupstore); + + return; + } + + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); + + for (;;) + { + XLogRecord *record; + + record = ReadNextXLogRecord(xlogreader, first_record); + + /* Exit loop if end-of-WAL is reached. */ + if (record == NULL && + IsEndOfWALReached(xlogreader)) + break; + + /* Exit loop if read up to end_lsn. */ + if (!XLogRecPtrIsInvalid(end_lsn) && + xlogreader->currRecPtr >= end_lsn) + break; + + GetXLogRecordInfo(xlogreader, xlogreader->currRecPtr, values, nulls, + PG_GET_WAL_RECORDS_INFO_COLS); + + tuplestore_putvalues(tupstore, tupdesc, values, nulls); + } + + if (xlogreader->private_data) + pfree(xlogreader->private_data); + + XLogReaderFree(xlogreader); + + /* Clean up and return the tuplestore. */ + tuplestore_donestoring(tupstore); +#undef PG_GET_WAL_RECORDS_INFO_COLS +} + +/* + * Get info and data of all WAL records between start LSN and end LSN. + * + * This function will wait if the future WAL LSN is provided and wait_for_wal + * is true. By default, it doesn't wait for the future WAL LSN, but emits a + * warning and returns the available info. + */ +Datum +pg_get_wal_records_info(PG_FUNCTION_ARGS) +{ + XLogRecPtr start_lsn; + XLogRecPtr end_lsn; + bool wait_for_wal; + + start_lsn = PG_GETARG_LSN(0); + end_lsn = PG_GETARG_LSN(1); + wait_for_wal = PG_GETARG_BOOL(2); + + /* Validate input. */ + if (XLogRecPtrIsInvalid(start_lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record start LSN"))); + + if (XLogRecPtrIsInvalid(end_lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record end LSN"))); + + if (start_lsn >= end_lsn) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("WAL record start LSN must be less than end LSN"))); + + GetWALRecordsInfoInternal(fcinfo, start_lsn, end_lsn, wait_for_wal); + + PG_RETURN_VOID(); +} + +/* + * Get info and data of all WAL records from start LSN till end-of-WAL. + * + * This function is similar to pg_get_wal_records_info except that it doesn't + * have end LSN as input and it doesn't wait for future WAL. + */ +Datum +pg_get_wal_records_info_till_end_of_wal(PG_FUNCTION_ARGS) +{ + XLogRecPtr start_lsn; + + start_lsn = PG_GETARG_LSN(0); + + /* Validate input. */ + if (XLogRecPtrIsInvalid(start_lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record start LSN"))); + + GetWALRecordsInfoInternal(fcinfo, start_lsn, InvalidXLogRecPtr, + false); + + PG_RETURN_VOID(); +} + +/* + * Store per-rmgr and per-record statistics for a given record. + */ +static void +StoreXLogRecordStats(XLogRecStats * stats, XLogReaderState *record) +{ + RmgrId rmid; + uint8 recid; + uint32 rec_len; + uint32 fpi_len; + + stats->count++; + + rmid = XLogRecGetRmid(record); + + GetXLogRecordLen(record, &rec_len, &fpi_len); + + /* Update per-rmgr statistics. */ + stats->rmgr_stats[rmid].count++; + stats->rmgr_stats[rmid].rec_len += rec_len; + stats->rmgr_stats[rmid].fpi_len += fpi_len; + + /* + * Update per-record statistics, where the record is identified by a + * combination of the RmgrId and the four bits of the xl_info field that + * are the rmgr's domain (resulting in sixteen possible entries per + * RmgrId). + */ + recid = XLogRecGetInfo(record) >> 4; + + /* + * XACT records need to be handled differently. Those records use the + * first bit of those four bits for an optional flag variable and the + * following three bits for the opcode. We filter opcode out of xl_info + * and use it as the identifier of the record. + */ + if (rmid == RM_XACT_ID) + recid &= 0x07; + + stats->record_stats[rmid][recid].count++; + stats->record_stats[rmid][recid].rec_len += rec_len; + stats->record_stats[rmid][recid].fpi_len += fpi_len; +} + +/* + * Fill single row of record counts and sizes for an rmgr or record. + */ +static void +FillXLogStatsRow(const char *name, + uint64 n, uint64 total_count, + uint64 rec_len, uint64 total_rec_len, + uint64 fpi_len, uint64 total_fpi_len, + uint64 tot_len, uint64 total_len, + Datum *values, bool *nulls, uint32 ncols) +{ + double n_pct; + double rec_len_pct; + double fpi_len_pct; + double tot_len_pct; + int i = 0; + + n_pct = 0; + if (total_count != 0) + n_pct = 100 * (double) n / total_count; + + rec_len_pct = 0; + if (total_rec_len != 0) + rec_len_pct = 100 * (double) rec_len / total_rec_len; + + fpi_len_pct = 0; + if (total_fpi_len != 0) + fpi_len_pct = 100 * (double) fpi_len / total_fpi_len; + + tot_len_pct = 0; + if (total_len != 0) + tot_len_pct = 100 * (double) tot_len / total_len; + + values[i++] = CStringGetTextDatum(name); + values[i++] = Int64GetDatum(n); + values[i++] = Float4GetDatum(n_pct); + values[i++] = Int64GetDatum(rec_len); + values[i++] = Float4GetDatum(rec_len_pct); + values[i++] = Int64GetDatum(fpi_len); + values[i++] = Float4GetDatum(fpi_len_pct); + values[i++] = Int64GetDatum(tot_len); + values[i++] = Float4GetDatum(tot_len_pct); + + Assert(i == ncols); +} + +/* + * Get summary statistics about the records seen so far. + */ +static void +GetXLogSummaryStats(XLogRecStats * stats, Tuplestorestate *tupstore, + TupleDesc tupdesc, Datum *values, bool *nulls, + uint32 ncols) +{ + int ri; + uint64 total_count = 0; + uint64 total_rec_len = 0; + uint64 total_fpi_len = 0; + uint64 total_len = 0; + + /* + * Each row shows its percentages of the total, so make a first pass to + * calculate column totals. + */ + for (ri = 0; ri < RM_NEXT_ID; ri++) + { + total_count += stats->rmgr_stats[ri].count; + total_rec_len += stats->rmgr_stats[ri].rec_len; + total_fpi_len += stats->rmgr_stats[ri].fpi_len; + } + total_len = total_rec_len + total_fpi_len; + + for (ri = 0; ri < RM_NEXT_ID; ri++) + { + uint64 count; + uint64 rec_len; + uint64 fpi_len; + uint64 tot_len; + const RmgrData *desc = &RmgrTable[ri]; + + count = stats->rmgr_stats[ri].count; + rec_len = stats->rmgr_stats[ri].rec_len; + fpi_len = stats->rmgr_stats[ri].fpi_len; + tot_len = rec_len + fpi_len; + + FillXLogStatsRow(desc->rm_name, count, total_count, rec_len, + total_rec_len, fpi_len, total_fpi_len, tot_len, + total_len, values, nulls, ncols); + + tuplestore_putvalues(tupstore, tupdesc, values, nulls); + } +} + +/* + * Get WAL stats between start LSN and end LSN. + */ +static void +GetWalStatsInternal(FunctionCallInfo fcinfo, XLogRecPtr start_lsn, + XLogRecPtr end_lsn, bool wait_for_wal) +{ +#define PG_GET_WAL_STATS_COLS 9 + XLogRecPtr first_record; + XLogReaderState *xlogreader; + XLogRecStats stats; + ReturnSetInfo *rsinfo; + TupleDesc tupdesc; + Tuplestorestate *tupstore; + MemoryContext per_query_ctx; + MemoryContext oldcontext; + Datum values[PG_GET_WAL_STATS_COLS]; + bool nulls[PG_GET_WAL_STATS_COLS]; + + rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; + + /* Check to see if caller supports us returning a tuplestore. */ + if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("set-valued function called in context that cannot accept a set"))); + if (!(rsinfo->allowedModes & SFRM_Materialize)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("materialize mode required, but it is not allowed in this context"))); + + /* Build a tuple descriptor for our result type. */ + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) + elog(ERROR, "return type must be a row type"); + + /* Build tuplestore to hold the result rows. */ + per_query_ctx = rsinfo->econtext->ecxt_per_query_memory; + oldcontext = MemoryContextSwitchTo(per_query_ctx); + + tupstore = tuplestore_begin_heap(true, false, work_mem); + rsinfo->returnMode = SFRM_Materialize; + rsinfo->setResult = tupstore; + rsinfo->setDesc = tupdesc; + + MemoryContextSwitchTo(oldcontext); + + xlogreader = InitXLogReaderState(start_lsn, &first_record, true, + wait_for_wal); + + Assert(xlogreader); + + if (wait_for_wal == false && + XLogRecPtrIsInvalid(first_record) && + IsEndOfWALReached(xlogreader)) + { + pfree(xlogreader->private_data); + XLogReaderFree(xlogreader); + tuplestore_end(tupstore); + + return; + } + + MemSet(&stats, 0, sizeof(stats)); + + for (;;) + { + XLogRecord *record; + + record = ReadNextXLogRecord(xlogreader, first_record); + + /* Exit loop if end-of-WAL is reached. */ + if (record == NULL && + IsEndOfWALReached(xlogreader)) + break; + + /* Exit loop if read up to end_lsn. */ + if (!XLogRecPtrIsInvalid(end_lsn) && + xlogreader->currRecPtr >= end_lsn) + break; + + StoreXLogRecordStats(&stats, xlogreader); + } + + if (xlogreader->private_data) + pfree(xlogreader->private_data); + + XLogReaderFree(xlogreader); + + MemSet(values, 0, sizeof(values)); + MemSet(nulls, 0, sizeof(nulls)); + + GetXLogSummaryStats(&stats, tupstore, tupdesc, values, nulls, + PG_GET_WAL_STATS_COLS); + + /* Clean up and return the tuplestore. */ + tuplestore_donestoring(tupstore); +#undef PG_GET_WAL_STATS_COLS +} + +/* + * Get WAL stats between start LSN and end LSN. + * + * This function will wait if the future WAL LSN is provided and wait_for_wal + * is true. By default, it doesn't wait for the future WAL LSN, but emits a + * warning and returns the available info. + */ +Datum +pg_get_wal_stats(PG_FUNCTION_ARGS) +{ + XLogRecPtr start_lsn; + XLogRecPtr end_lsn; + bool wait_for_wal; + + start_lsn = PG_GETARG_LSN(0); + end_lsn = PG_GETARG_LSN(1); + wait_for_wal = PG_GETARG_BOOL(2); + + /* Validate input. */ + if (XLogRecPtrIsInvalid(start_lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record start LSN"))); + + if (XLogRecPtrIsInvalid(end_lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record end LSN"))); + + if (start_lsn >= end_lsn) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("WAL record start LSN must be less than end LSN"))); + + GetWalStatsInternal(fcinfo, start_lsn, end_lsn, wait_for_wal); + + PG_RETURN_VOID(); +} + +/* + * Get WAL stats from start LSN till end-of-WAL. + * + * This function is similar to pg_get_wal_stats except that it doesn't have end + * LSN as input and it doesn't wait for future WAL. + */ +Datum +pg_get_wal_stats_till_end_of_wal(PG_FUNCTION_ARGS) +{ + XLogRecPtr start_lsn; + + start_lsn = PG_GETARG_LSN(0); + + /* Validate input. */ + if (XLogRecPtrIsInvalid(start_lsn)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid WAL record start LSN"))); + + GetWalStatsInternal(fcinfo, start_lsn, InvalidXLogRecPtr, false); + + PG_RETURN_VOID(); +} diff --git a/contrib/pg_walinspect/pg_walinspect.control b/contrib/pg_walinspect/pg_walinspect.control new file mode 100644 index 0000000000..017e56a2bb --- /dev/null +++ b/contrib/pg_walinspect/pg_walinspect.control @@ -0,0 +1,5 @@ +# pg_walinspect extension +comment = 'functions to inspect contents of PostgreSQL Write-Ahead Log' +default_version = '1.0' +module_pathname = '$libdir/pg_walinspect' +relocatable = true diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 35029cf97d..4d217cabbe 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -956,13 +956,6 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr, return true; } -#ifdef FRONTEND -/* - * Functions that are currently not needed in the backend, but are better - * implemented inside xlogreader.c because of the internal facilities available - * here. - */ - /* * Find the first record with an lsn >= RecPtr. * @@ -1080,6 +1073,13 @@ err: return InvalidXLogRecPtr; } +#ifdef FRONTEND +/* + * Functions that are currently not needed in the backend, but are better + * implemented inside xlogreader.c because of the internal facilities available + * here. + */ + #endif /* FRONTEND */ /* diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index 54d5f20734..4c33933249 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -968,6 +968,148 @@ read_local_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, return count; } +/* + * XLogReaderRoutine->page_read callback for reading local xlog files + * + * This function is same as read_local_xlog_page except that it works in both + * wait and no wait mode. The callers can specify about waiting in private_data + * of XLogReaderState. + */ +int +read_local_xlog_page_2(XLogReaderState *state, XLogRecPtr targetPagePtr, + int reqLen, XLogRecPtr targetRecPtr, char *cur_page) +{ + XLogRecPtr read_upto, + loc; + TimeLineID tli; + int count; + WALReadError errinfo; + TimeLineID currTLI; + + loc = targetPagePtr + reqLen; + + /* Loop waiting for xlog to be available if necessary */ + while (1) + { + /* + * Determine the limit of xlog we can currently read to, and what the + * most recent timeline is. + */ + if (!RecoveryInProgress()) + read_upto = GetFlushRecPtr(&currTLI); + else + read_upto = GetXLogReplayRecPtr(&currTLI); + tli = currTLI; + + /* + * Check which timeline to get the record from. + * + * We have to do it each time through the loop because if we're in + * recovery as a cascading standby, the current timeline might've + * become historical. We can't rely on RecoveryInProgress() because in + * a standby configuration like + * + * A => B => C + * + * if we're a logical decoding session on C, and B gets promoted, our + * timeline will change while we remain in recovery. + * + * We can't just keep reading from the old timeline as the last WAL + * archive in the timeline will get renamed to .partial by + * StartupXLOG(). + * + * If that happens after our caller determined the TLI but before + * we actually read the xlog page, we might still try to read from the + * old (now renamed) segment and fail. There's not much we can do + * about this, but it can only happen when we're a leaf of a cascading + * standby whose primary gets promoted while we're decoding, so a + * one-off ERROR isn't too bad. + */ + XLogReadDetermineTimeline(state, targetPagePtr, reqLen, tli); + + if (state->currTLI == currTLI) + { + + if (loc <= read_upto) + break; + + /* Let's not wait for WAL to be available if indicated */ + if (loc > read_upto && + state->private_data != NULL) + { + ReadLocalXLOGPage2Private *private; + + private = (ReadLocalXLOGPage2Private *) state->private_data; + + if (private->no_wait == true) + { + private->reached_end_of_wal = true; + break; + } + } + + CHECK_FOR_INTERRUPTS(); + pg_usleep(1000L); + } + else + { + /* + * We're on a historical timeline, so limit reading to the switch + * point where we moved to the next timeline. + * + * We don't need to GetFlushRecPtr or GetXLogReplayRecPtr. We know + * about the new timeline, so we must've received past the end of + * it. + */ + read_upto = state->currTLIValidUntil; + + /* + * Setting tli to our wanted record's TLI is slightly wrong; the + * page might begin on an older timeline if it contains a timeline + * switch, since its xlog segment will have been copied from the + * prior timeline. This is pretty harmless though, as nothing + * cares so long as the timeline doesn't go backwards. We should + * read the page header instead; FIXME someday. + */ + tli = state->currTLI; + + /* No need to wait on a historical timeline */ + break; + } + } + + if (targetPagePtr + XLOG_BLCKSZ <= read_upto) + { + /* + * more than one block available; read only that block, have caller + * come back if they need more. + */ + count = XLOG_BLCKSZ; + } + else if (targetPagePtr + reqLen > read_upto) + { + /* not enough data there */ + return -1; + } + else + { + /* enough bytes available to satisfy the request */ + count = read_upto - targetPagePtr; + } + + /* + * Even though we just determined how much of the page can be validly read + * as 'count', read the whole page anyway. It's guaranteed to be + * zero-padded up to the page boundary if it's incomplete. + */ + if (!WALRead(state, cur_page, targetPagePtr, XLOG_BLCKSZ, tli, + &errinfo)) + WALReadRaiseError(&errinfo); + + /* number of valid bytes in the buffer */ + return count; +} + /* * Backend-specific convenience code to handle read errors encountered by * WALRead(). diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index a6251e1a96..f8d61b56ab 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -26,6 +26,11 @@ #include "getopt_long.h" #include "rmgrdesc.h" +/* + * NOTE: For any code change or issue fix here, it is highly recommended to + * give a thought about doing the same in pg_walinspect contrib module as well. + */ + static const char *progname; static int WalSegSz; diff --git a/src/common/relpath.c b/src/common/relpath.c index 636c96efd3..e8e3c44eae 100644 --- a/src/common/relpath.c +++ b/src/common/relpath.c @@ -40,6 +40,24 @@ const char *const forkNames[] = { StaticAssertDecl(lengthof(forkNames) == (MAX_FORKNUM + 1), "array length mismatch"); +/* + * get_forkname - return fork name given fork number + * + * This function is defined with "extern PGDLLIMPORT ..." in the core here so + * that the loadable modules can access it. + */ +const char *const +get_forkname(ForkNumber num) +{ + /* + * As this function gets called by external modules, let's ensure that the + * fork number passed in is valid. + */ + Assert(num > InvalidForkNumber && num <= MAX_FORKNUM); + + return forkNames[num]; +} + /* * forkname_to_number - look up fork number by name * diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h index 4b45ac64db..f34f228563 100644 --- a/src/include/access/xlog.h +++ b/src/include/access/xlog.h @@ -31,7 +31,7 @@ extern XLogRecPtr XactLastRecEnd; extern PGDLLIMPORT XLogRecPtr XactLastCommitEnd; /* these variables are GUC parameters related to XLOG */ -extern int wal_segment_size; +extern PGDLLIMPORT int wal_segment_size; extern int min_wal_size_mb; extern int max_wal_size_mb; extern int wal_keep_size_mb; diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h index 849954a8e5..38fd51bafa 100644 --- a/src/include/access/xlog_internal.h +++ b/src/include/access/xlog_internal.h @@ -319,7 +319,7 @@ typedef struct RmgrData struct XLogRecordBuffer *buf); } RmgrData; -extern const RmgrData RmgrTable[]; +extern PGDLLIMPORT const RmgrData RmgrTable[]; /* * Exported to support xlog switching from checkpointer diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h index 477f0efe26..d62d6ce7f8 100644 --- a/src/include/access/xlogreader.h +++ b/src/include/access/xlogreader.h @@ -276,9 +276,7 @@ extern void XLogReaderFree(XLogReaderState *state); /* Position the XLogReader to given record */ extern void XLogBeginRead(XLogReaderState *state, XLogRecPtr RecPtr); -#ifdef FRONTEND extern XLogRecPtr XLogFindNextRecord(XLogReaderState *state, XLogRecPtr RecPtr); -#endif /* FRONTEND */ /* Read the next XLog record. Returns NULL on end-of-WAL or failure */ extern struct XLogRecord *XLogReadRecord(XLogReaderState *state, diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h index 64708949db..dd28640f07 100644 --- a/src/include/access/xlogutils.h +++ b/src/include/access/xlogutils.h @@ -75,6 +75,16 @@ typedef enum * need to be replayed) */ } XLogRedoAction; +/* + * private_data of read_local_xlog_page, that is, XLogReaderRoutine->page_read + * callback for reading local xlog files. + */ +typedef struct ReadLocalXLOGPage2Private +{ + bool no_wait; + bool reached_end_of_wal; +} ReadLocalXLOGPage2Private; + extern XLogRedoAction XLogReadBufferForRedo(XLogReaderState *record, uint8 buffer_id, Buffer *buf); extern Buffer XLogInitBufferForRedo(XLogReaderState *record, uint8 block_id); @@ -92,6 +102,9 @@ extern void FreeFakeRelcacheEntry(Relation fakerel); extern int read_local_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, int reqLen, XLogRecPtr targetRecPtr, char *cur_page); +extern int read_local_xlog_page_2(XLogReaderState *state, + XLogRecPtr targetPagePtr, int reqLen, + XLogRecPtr targetRecPtr, char *cur_page); extern void wal_segment_open(XLogReaderState *state, XLogSegNo nextSegNo, TimeLineID *tli_p); diff --git a/src/include/common/relpath.h b/src/include/common/relpath.h index a4b5dc853b..3743f2e505 100644 --- a/src/include/common/relpath.h +++ b/src/include/common/relpath.h @@ -57,6 +57,7 @@ typedef enum ForkNumber #define FORKNAMECHARS 4 /* max chars for a fork name */ extern const char *const forkNames[]; +extern PGDLLIMPORT const char *const get_forkname(ForkNumber num); extern ForkNumber forkname_to_number(const char *forkName); extern int forkname_chars(const char *str, ForkNumber *fork); -- 2.25.1
From 1d2c2fd51cf0ad8704782aab2f215dd8e37fcbc2 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 25 Feb 2022 06:49:42 +0000 Subject: [PATCH v5] pg_walinspect tests --- .../pg_walinspect/expected/pg_walinspect.out | 122 ++++++++++++++++++ contrib/pg_walinspect/sql/pg_walinspect.sql | 86 ++++++++++++ 2 files changed, 208 insertions(+) create mode 100644 contrib/pg_walinspect/expected/pg_walinspect.out create mode 100644 contrib/pg_walinspect/sql/pg_walinspect.sql diff --git a/contrib/pg_walinspect/expected/pg_walinspect.out b/contrib/pg_walinspect/expected/pg_walinspect.out new file mode 100644 index 0000000000..ba09c1caab --- /dev/null +++ b/contrib/pg_walinspect/expected/pg_walinspect.out @@ -0,0 +1,122 @@ +CREATE EXTENSION pg_walinspect; +SELECT pg_current_wal_lsn() AS wal_lsn1 \gset +CREATE TABLE sample_tbl(col1 int, col2 int); +INSERT INTO sample_tbl VALUES(1, 2); +SELECT pg_current_wal_lsn() AS wal_lsn2 \gset +INSERT INTO sample_tbl VALUES(2, 3); +-- =================================================================== +-- tests for input validation +-- =================================================================== +SELECT pg_get_raw_wal_record('0/0'); -- ERROR +ERROR: invalid WAL record LSN +SELECT pg_get_first_valid_wal_record_lsn('0/0'); -- ERROR +ERROR: invalid WAL record LSN +SELECT pg_get_wal_record_info('0/0'); -- ERROR +ERROR: invalid WAL record LSN +SELECT pg_get_wal_records_info('0/0', '0/0'); -- ERROR +ERROR: invalid WAL record start LSN +SELECT pg_get_wal_records_info(:'wal_lsn1', '0/0'); -- ERROR +ERROR: invalid WAL record end LSN +SELECT pg_get_wal_records_info(:'wal_lsn2', :'wal_lsn1'); -- start LSN >= end LSN, ERROR +ERROR: WAL record start LSN must be less than end LSN +SELECT pg_get_wal_stats('0/0', '0/0'); -- ERROR +ERROR: invalid WAL record start LSN +SELECT pg_get_wal_stats(:'wal_lsn1', '0/0'); -- ERROR +ERROR: invalid WAL record end LSN +SELECT pg_get_wal_stats(:'wal_lsn2', :'wal_lsn1'); -- start LSN >= end LSN, ERROR +ERROR: WAL record start LSN must be less than end LSN +-- =================================================================== +-- tests for all function executions +-- =================================================================== +SELECT COUNT(*) >= 0 AS ok FROM pg_get_first_valid_wal_record_lsn(:'wal_lsn1'); + ok +---- + t +(1 row) + +SELECT lsn AS valid_wal_lsn1 FROM pg_get_first_valid_wal_record_lsn(:'wal_lsn1') \gset +SELECT lsn AS valid_wal_lsn2 FROM pg_get_first_valid_wal_record_lsn(:'wal_lsn2') \gset +SELECT COUNT(*) >= 0 AS ok FROM pg_get_raw_wal_record(:'valid_wal_lsn1'); + ok +---- + t +(1 row) + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_record_info(:'valid_wal_lsn1'); + ok +---- + t +(1 row) + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_records_info(:'valid_wal_lsn1', :'valid_wal_lsn2'); + ok +---- + t +(1 row) + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_records_info_till_end_of_wal(:'valid_wal_lsn2'); +WARNING: reached end of WAL + ok +---- + t +(1 row) + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats(:'valid_wal_lsn1', :'valid_wal_lsn2'); + ok +---- + t +(1 row) + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'valid_wal_lsn2'); +WARNING: reached end of WAL + ok +---- + t +(1 row) + +SELECT file_name AS wal_file_name FROM pg_walfile_name_offset(:'valid_wal_lsn1') \gset +SELECT COUNT(*) = 1 AS ok FROM pg_switch_wal(); + ok +---- + t +(1 row) + +INSERT INTO sample_tbl VALUES(3, 4); +SELECT COUNT(*) >= 0 AS ok FROM pg_get_first_and_last_valid_wal_record_lsn(:'wal_file_name'); + ok +---- + t +(1 row) + +-- =================================================================== +-- tests for functions with future WAL +-- =================================================================== +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_records_info(:'valid_wal_lsn1', 'FFFFFFFF/FFFFFFFF'); -- WARNING +WARNING: reached end of WAL + ok +---- + t +(1 row) + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats(:'valid_wal_lsn1', 'FFFFFFFF/FFFFFFFF'); -- WARNING +WARNING: reached end of WAL + ok +---- + t +(1 row) + +-- =================================================================== +-- tests for filtering out WAL records of a particular table +-- =================================================================== +SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_records_info(:'valid_wal_lsn1', :'valid_wal_lsn2') + WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap'; + ok +---- + t +(1 row) + +-- =================================================================== +-- clean up +-- =================================================================== +DROP TABLE sample_tbl; diff --git a/contrib/pg_walinspect/sql/pg_walinspect.sql b/contrib/pg_walinspect/sql/pg_walinspect.sql new file mode 100644 index 0000000000..5c4253787d --- /dev/null +++ b/contrib/pg_walinspect/sql/pg_walinspect.sql @@ -0,0 +1,86 @@ +CREATE EXTENSION pg_walinspect; + +SELECT pg_current_wal_lsn() AS wal_lsn1 \gset + +CREATE TABLE sample_tbl(col1 int, col2 int); + +INSERT INTO sample_tbl VALUES(1, 2); + +SELECT pg_current_wal_lsn() AS wal_lsn2 \gset + +INSERT INTO sample_tbl VALUES(2, 3); + +-- =================================================================== +-- tests for input validation +-- =================================================================== + +SELECT pg_get_raw_wal_record('0/0'); -- ERROR + +SELECT pg_get_first_valid_wal_record_lsn('0/0'); -- ERROR + +SELECT pg_get_wal_record_info('0/0'); -- ERROR + +SELECT pg_get_wal_records_info('0/0', '0/0'); -- ERROR + +SELECT pg_get_wal_records_info(:'wal_lsn1', '0/0'); -- ERROR + +SELECT pg_get_wal_records_info(:'wal_lsn2', :'wal_lsn1'); -- start LSN >= end LSN, ERROR + +SELECT pg_get_wal_stats('0/0', '0/0'); -- ERROR + +SELECT pg_get_wal_stats(:'wal_lsn1', '0/0'); -- ERROR + +SELECT pg_get_wal_stats(:'wal_lsn2', :'wal_lsn1'); -- start LSN >= end LSN, ERROR + +-- =================================================================== +-- tests for all function executions +-- =================================================================== + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_first_valid_wal_record_lsn(:'wal_lsn1'); + +SELECT lsn AS valid_wal_lsn1 FROM pg_get_first_valid_wal_record_lsn(:'wal_lsn1') \gset + +SELECT lsn AS valid_wal_lsn2 FROM pg_get_first_valid_wal_record_lsn(:'wal_lsn2') \gset + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_raw_wal_record(:'valid_wal_lsn1'); + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_record_info(:'valid_wal_lsn1'); + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_records_info(:'valid_wal_lsn1', :'valid_wal_lsn2'); + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_records_info_till_end_of_wal(:'valid_wal_lsn2'); + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats(:'valid_wal_lsn1', :'valid_wal_lsn2'); + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats_till_end_of_wal(:'valid_wal_lsn2'); + +SELECT file_name AS wal_file_name FROM pg_walfile_name_offset(:'valid_wal_lsn1') \gset + +SELECT COUNT(*) = 1 AS ok FROM pg_switch_wal(); + +INSERT INTO sample_tbl VALUES(3, 4); + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_first_and_last_valid_wal_record_lsn(:'wal_file_name'); + +-- =================================================================== +-- tests for functions with future WAL +-- =================================================================== + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_records_info(:'valid_wal_lsn1', 'FFFFFFFF/FFFFFFFF'); -- WARNING + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_stats(:'valid_wal_lsn1', 'FFFFFFFF/FFFFFFFF'); -- WARNING + +-- =================================================================== +-- tests for filtering out WAL records of a particular table +-- =================================================================== + +SELECT oid AS sample_tbl_oid FROM pg_class WHERE relname = 'sample_tbl' \gset + +SELECT COUNT(*) >= 0 AS ok FROM pg_get_wal_records_info(:'valid_wal_lsn1', :'valid_wal_lsn2') + WHERE block_ref LIKE concat('%', :'sample_tbl_oid', '%') AND resource_manager = 'Heap'; + +-- =================================================================== +-- clean up +-- =================================================================== + +DROP TABLE sample_tbl; -- 2.25.1
From d1f3c3a295b842fbf6a3a9a040cfbd33e52d7333 Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Date: Fri, 25 Feb 2022 07:30:49 +0000 Subject: [PATCH v5] pg_walinspect docs --- doc/src/sgml/contrib.sgml | 1 + doc/src/sgml/filelist.sgml | 1 + doc/src/sgml/pgwalinspect.sgml | 165 +++++++++++++++++++++++++++++++++ 3 files changed, 167 insertions(+) create mode 100644 doc/src/sgml/pgwalinspect.sgml diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml index be9711c6f2..19614a42e1 100644 --- a/doc/src/sgml/contrib.sgml +++ b/doc/src/sgml/contrib.sgml @@ -130,6 +130,7 @@ CREATE EXTENSION <replaceable>module_name</replaceable>; &pgsurgery; &pgtrgm; &pgvisibility; + &pgwalinspect; &postgres-fdw; &seg; &sepgsql; diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index 328cd1f378..a2e8fd4a08 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -146,6 +146,7 @@ <!ENTITY pgsurgery SYSTEM "pgsurgery.sgml"> <!ENTITY pgtrgm SYSTEM "pgtrgm.sgml"> <!ENTITY pgvisibility SYSTEM "pgvisibility.sgml"> +<!ENTITY pgwalinspect SYSTEM "pgwalinspect.sgml"> <!ENTITY postgres-fdw SYSTEM "postgres-fdw.sgml"> <!ENTITY seg SYSTEM "seg.sgml"> <!ENTITY contrib-spi SYSTEM "contrib-spi.sgml"> diff --git a/doc/src/sgml/pgwalinspect.sgml b/doc/src/sgml/pgwalinspect.sgml new file mode 100644 index 0000000000..fa9f1d9b6c --- /dev/null +++ b/doc/src/sgml/pgwalinspect.sgml @@ -0,0 +1,165 @@ +<!-- doc/src/sgml/pgwalinspect.sgml --> + +<sect1 id="pgwalinspect" xreflabel="pg_walinspect"> + <title>pg_walinspect</title> + + <indexterm zone="pgwalinspect"> + <primary>pg_walinspect</primary> + </indexterm> + + <para> + The <filename>pg_walinspect</filename> module provides functions that allow + you to inspect the contents of write-ahead log of <productname>PostgreSQL</productname> + database cluster at a low level, which is useful for debugging and analytical + purposes. + </para> + + <para> + By default, use of these functions is restricted to superusers and members of + the <literal>pg_monitor</literal> role. Access may be granted to others using + <command>GRANT</command>. + </para> + + <sect2> + <title>General Functions</title> + + <variablelist> + <varlistentry> + <term> + <function>pg_get_raw_wal_record(in_lsn pg_lsn, lsn OUT pg_lsn, record OUT bytea)</function> + </term> + + <listitem> + <para> + Gets raw WAL record data of a given LSN. Issues a warning if the given + LSN wasn't a pointer to the start of a record and also wasn't a pointer + to the beginning of a WAL segment file. This function will not wait if + the future WAL LSN is provided, instead emits a warning and returns an + empty row. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <function>pg_get_first_valid_wal_record_lsn(in_lsn pg_lsn, lsn OUT pg_lsn, prev_lsn OUT pg_lsn)</function> + </term> + + <listitem> + <para> + Gets first and previous valid WAL record LSNs of the given LSN. Issues + a warning if the given LSN wasn't a pointer to the start of a record and + also wasn't a pointer to the beginning of a WAL segment file. This + function will not wait if the future WAL LSN is provided, instead emits + a warning and returns an empty row. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <function>pg_get_first_and_last_valid_wal_record_lsn(wal_file_name text, first_valid_lsn OUT pg_lsn, last_valid_lsn OUT pg_lsn)</function> + </term> + + <listitem> + <para> + Gets first and last valid WAL record LSNs of the given WAL file. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <function>pg_get_wal_record_info(in_lsn pg_lsn, lsn OUT pg_lsn, prev_lsn OUT pg_lsn, xid OUT xid, resource_manager OUT text, length OUT int4, total_length OUT int4, description OUT text, block_ref OUT text, data OUT bytea, data_len OUT int4)</function> + </term> + + <listitem> + <para> + Gets WAL record information of the given LSN. Issues a warning if the + given LSN wasn't a pointer to the start of a record and also wasn't a + pointer to the beginning of a WAL segment file. This function will not + wait if the future WAL LSN is provided, instead emits a warning and + returns an empty row. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <function>pg_get_wal_records_info(start_lsn pg_lsn, end_lsn pg_lsn, wait_for_wal boolean DEFAULT false, lsn OUT pg_lsn, prev_lsn OUT pg_lsn, xid OUT xid, resource_manager OUT text, length OUT int4, total_length OUT int4, description OUT text, block_ref OUT text, data OUT bytea, data_len OUT int4)</function> + </term> + + <listitem> + <para> + Gets information of all the valid WAL records between + <replaceable>start_lsn</replaceable> and <replaceable>end_lsn</replaceable>. + Returns one row per each valid WAL record. Issues a warning if the given + <replaceable>start_lsn</replaceable> wasn't a pointer to the start of a + record and also wasn't a pointer to the beginning of a WAL segment file. + This function will wait if the future WAL LSN is provided when + <replaceable>wait_for_wal</replaceable> is passed as <literal>true</literal>. + If <replaceable>wait_for_wal</replaceable> is <literal>false</literal> + (default value), it will not wait for the future WAL, instead emits a + warning and returns rows for whatever available WAL records. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <function>pg_get_wal_records_info_till_end_of_wal(start_lsn pg_lsn, lsn OUT pg_lsn, prev_lsn OUT pg_lsn, xid OUT xid, resource_manager OUT text, length OUT int4, total_length OUT int4, description OUT text, block_ref OUT text, data OUT bytea, data_len OUT int4)</function> + </term> + + <listitem> + <para> + Gets information of all the valid WAL records from + <replaceable>start_lsn</replaceable> till end of WAL. This function is + similar to <literal>pg_get_wal_records_info</literal> except that it + doesn't have <replaceable>end_lsn</replaceable> as input and it doesn't + wait for the future WAL. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <function>pg_get_wal_stats(start_lsn pg_lsn, end_lsn pg_lsn, wait_for_wal boolean DEFAULT false, resource_manager OUT text, count OUT int8, count_percentage OUT float4, record_size OUT int8, record_size_percentage OUT float4, fpi_size OUT int8, fpi_size_percentage OUT float4, combined_size OUT int8, combined_size_percentage OUT float4)</function> + </term> + + <listitem> + <para> + Gets statistics of all the valid WAL records between + <replaceable>start_lsn</replaceable> and <replaceable>end_lsn</replaceable>. + Returns one row per each <replaceable>resource_manager</replaceable> + type. Issues a warning if the given <replaceable>start_lsn</replaceable> + wasn't a pointer to the start of a record and also wasn't a pointer to + the beginning of a WAL segment file. This function will wait if the + future WAL LSN is provided when <replaceable>wait_for_wal</replaceable> + is passed as <literal>true</literal>. If <replaceable>wait_for_wal</replaceable> + is <literal>false</literal> (default value), it will not wait for the + future WAL, instead emits a warning and returns rows for whatever + available WAL records. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <function>pg_get_wal_stats_till_end_of_wal(start_lsn pg_lsn, end_lsn pg_lsn, resource_manager OUT text, count OUT int8, count_percentage OUT float4, record_size OUT int8, record_size_percentage OUT float4, fpi_size OUT int8, fpi_size_percentage OUT float4, combined_size OUT int8, combined_size_percentage OUT float4)</function> + </term> + + <listitem> + <para> + Gets statistics of all the valid WAL records from + <replaceable>start_lsn</replaceable> till end of WAL. This function is + similar to <literal>pg_get_wal_stats</literal> except that it doesn't + have <replaceable>end_lsn</replaceable> as input and it doesn't wait for + the future WAL. + </para> + </listitem> + </varlistentry> + + </variablelist> + </sect2> + +</sect1> -- 2.25.1