On Fri, Dec 12, 2025 at 11:58:54AM +0900, Michael Paquier wrote:
> FWIW, I have begun putting my hands on your patch, editing it at some
> degree.  I am not sure that I will be able to finish that today, but
> I'm working towards getting something done.

Well, I have been able to do enough progress to have something to
share, and I'm getting pretty happy about how things are shaping.  As
you will notice, I have edited quite a few things..  In details:
- Less fwrite() and fread(), more read_chunk() and write_chunk().  We
are exposing these APIs, let's use them.
- Comments, much more comments and documentation.
- The callbacks are renamed, to be more generic: "finish" for the
end-of-operation actions and to/from_serialized_data.
- The format of the extra data in the main pgstats file and the
secondary file was a bit strange.  Mainly, why adding the length to
the main file and not the secondary file?  I have extended that a
little bit:
-- Addition of a magic number in the main file, to provide an extra
layer of safety in the read callback, letting the callback know that
it needs to read some data.
-- The offset of the secondary file follows immeditely.
-- The secondary file includes at the offset a copy of the hash key,
the description length, and the description.
- Reorganization of the read/write flow for the callbacks in the
modules, tracking the offset at write more precisely.  The handling of
the empty descriptions becomes simpler than what you have proposed
previously.

This way, we can make sure that the main stats file is OK with the
magic number, and we have a sanity check in the secondary file based
on the hash key whose copy is in the main stats file.

Regarding the error state that could be sent to the "end" callback, I
think that you are right.  We are not gaining much with that as by
design we are already pretty loose on the write side, hoping for the
best, relying on the read side to enforce all sanity checks.  So a
status in the "from" callback sounds like a good enough balance.

At the end of the day, I'm feeling pretty much OK with the core
changes and the layer we have here.  The module changes need an extra
round of lookup (did as well some tests with corrupted and empty
secondary files to test the stability at reload), and I'm pretty tired
so I may have missed something there.  The patch needs to be split in
two parts: one for the backend changes, and one for the module itself.
The backend changes are feeling pretty good, the module changes feel
better.
--
Michael
From cafc8251e6f56e113ad4ad19ffb72fa9ad312def Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Fri, 12 Dec 2025 18:38:05 +0900
Subject: [PATCH v7] Allow cumulative statistics to serialize auxiliary data to
 disk.

Cumulative Statistics kinds can now write additional per-entry data to
the statistics file that doesn't fit in shared memory. This is useful
for statistics with variable-length auxiliary data.

Three new optional callbacks are added to PgStat_KindInfo:

* to_serialized_data: writes auxiliary data for an entry
* from_serialized_data: reads auxiliary data for an entry
* finish: performs actions after read/write/discard operations.

The finish callback is invoked after processing all entries of a kind,
allowing extensions to close file handles and clean up resources.

Tests are also added to test_custom_stats.pl

Discussion: https://postgr.es/m/caa5rz0s9sdou+z6veojchwk+kdetktatc-ky9fq9z6bjddu...@mail.gmail.com
---
 src/include/executor/executor.h               |   2 +-
 src/include/libpq/libpq-be-fe-helpers.h       |   2 +-
 src/include/utils/pgstat_internal.h           |  47 ++
 src/backend/libpq/pqformat.c                  |   3 +-
 src/backend/rewrite/rewriteHandler.c          |   2 +-
 src/backend/storage/file/fileset.c            |   3 +-
 src/backend/utils/activity/pgstat.c           |  50 ++-
 src/backend/utils/adt/numutils.c              |   2 +-
 src/backend/utils/cache/relcache.c            |   2 +-
 src/backend/utils/mb/conv.c                   |   4 +-
 src/backend/utils/mb/mbutils.c                |   6 +-
 src/backend/utils/misc/guc.c                  |   2 +-
 src/backend/utils/mmgr/alignedalloc.c         |   4 +-
 src/backend/utils/mmgr/portalmem.c            |   3 +-
 src/backend/utils/sort/tuplesortvariants.c    |   1 -
 .../test_custom_stats/t/001_custom_stats.pl   |  39 +-
 .../test_custom_var_stats--1.0.sql            |   7 +-
 .../test_custom_stats/test_custom_var_stats.c | 400 +++++++++++++++++-
 src/tools/pgindent/typedefs.list              |   1 +
 19 files changed, 542 insertions(+), 38 deletions(-)

diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index fa2b657fb2ff..7cd6a49309f0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -404,7 +404,7 @@ ExecEvalExpr(ExprState *state,
  * Like ExecEvalExpr(), but for cases where no return value is expected,
  * because the side-effects of expression evaluation are what's desired. This
  * is e.g. used for projection and aggregate transition computation.
-
+ *
  * Evaluate expression identified by "state" in the execution context
  * given by "econtext".
  *
diff --git a/src/include/libpq/libpq-be-fe-helpers.h b/src/include/libpq/libpq-be-fe-helpers.h
index 1c4a342090c3..d2f6b3b13484 100644
--- a/src/include/libpq/libpq-be-fe-helpers.h
+++ b/src/include/libpq/libpq-be-fe-helpers.h
@@ -69,7 +69,7 @@ libpqsrv_connect(const char *conninfo, uint32 wait_event_info)
 /*
  * Like libpqsrv_connect(), except that this is a wrapper for
  * PQconnectdbParams().
-  */
+ */
 static inline PGconn *
 libpqsrv_connect_params(const char *const *keywords,
 						const char *const *values,
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index ca1ba6420ca1..031fd34ad788 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -63,6 +63,21 @@ typedef struct PgStat_HashKey
 								 * identifier. */
 } PgStat_HashKey;
 
+/*
+ * Tracks if the stats file is being read, written or discarded, used in
+ * combination with the finish callback.
+ *
+ * These states allow plugins that create extra statistics files
+ * to determine the current operation and perform any necessary
+ * file cleanup.
+ */
+typedef enum PgStat_StatsFileOp
+{
+	STATS_WRITE,
+	STATS_READ,
+	STATS_DISCARD,
+} PgStat_StatsFileOp;
+
 /*
  * PgStat_HashKey should not have any padding.  Checking that the structure
  * size matches with the sum of each field is a check simple enough to
@@ -303,6 +318,38 @@ typedef struct PgStat_KindInfo
 									   const PgStatShared_Common *header, NameData *name);
 	bool		(*from_serialized_name) (const NameData *name, PgStat_HashKey *key);
 
+	/*
+	 * For variable-numbered stats: read or write additional data related to a
+	 * given entry, in the stats file or optionally in a different file.
+	 * Optional.
+	 *
+	 * to_serialized_data: write extra data for an entry.
+	 *
+	 * from_serialized_data: read extra data for an entry.  Returns true on
+	 * success, false on read error.
+	 *
+	 * "statfile" is a pointer to the on-disk stats file, named
+	 * PGSTAT_STAT_PERMANENT_FILENAME.  "key" is the hash key of the entry
+	 * just written or read.  "header" is a pointer to the stats data.
+	 */
+	void		(*to_serialized_data) (const PgStat_HashKey *key,
+									   const PgStatShared_Common *header,
+									   FILE *statfile);
+	bool		(*from_serialized_data) (const PgStat_HashKey *key,
+										 const PgStatShared_Common *header,
+										 FILE *statfile);
+
+	/*
+	 * For fixed-numbered or variable-numbered statistics.
+	 *
+	 * Perform custom actions when done processing the on-disk stats file
+	 * after all the stats entries have been processed.  Optional.
+	 *
+	 * "status" tracks the operation done for the on-disk stats file (read,
+	 * write, discard).
+	 */
+	void		(*finish) (PgStat_StatsFileOp status);
+
 	/*
 	 * For fixed-numbered statistics: Initialize shared memory state.
 	 *
diff --git a/src/backend/libpq/pqformat.c b/src/backend/libpq/pqformat.c
index 1cc126772f7c..67bdd3d93d05 100644
--- a/src/backend/libpq/pqformat.c
+++ b/src/backend/libpq/pqformat.c
@@ -307,9 +307,8 @@ pq_endmessage(StringInfo buf)
  *
  * The data buffer is *not* freed, allowing to reuse the buffer with
  * pq_beginmessage_reuse.
- --------------------------------
+ * --------------------------------
  */
-
 void
 pq_endmessage_reuse(StringInfo buf)
 {
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 688dcd09ca6e..0852322cc588 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -2620,7 +2620,7 @@ view_col_is_auto_updatable(RangeTblRef *rtr, TargetEntry *tle)
  * view_query_is_auto_updatable - test whether the specified view definition
  * represents an auto-updatable view. Returns NULL (if the view can be updated)
  * or a message string giving the reason that it cannot be.
-
+ *
  * The returned string has not been translated; if it is shown as an error
  * message, the caller should apply _() to translate it.
  *
diff --git a/src/backend/storage/file/fileset.c b/src/backend/storage/file/fileset.c
index 4d5ee353fd7a..2061aa44e773 100644
--- a/src/backend/storage/file/fileset.c
+++ b/src/backend/storage/file/fileset.c
@@ -114,7 +114,8 @@ FileSetCreate(FileSet *fileset, const char *name)
 }
 
 /*
- * Open a file that was created with FileSetCreate() */
+ * Open a file that was created with FileSetCreate()
+ */
 File
 FileSetOpen(FileSet *fileset, const char *name, int mode)
 {
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 8713c7a04834..647fe9b111a4 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -523,6 +523,7 @@ pgstat_discard_stats(void)
 
 	/* NB: this needs to be done even in single user mode */
 
+	/* First, cleanup the main stats file, PGSTAT_STAT_PERMANENT_FILENAME */
 	ret = unlink(PGSTAT_STAT_PERMANENT_FILENAME);
 	if (ret != 0)
 	{
@@ -544,6 +545,15 @@ pgstat_discard_stats(void)
 								 PGSTAT_STAT_PERMANENT_FILENAME)));
 	}
 
+	/* Finish callbacks, if required */
+	for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+	{
+		const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+		if (kind_info && kind_info->finish)
+			kind_info->finish(STATS_DISCARD);
+	}
+
 	/*
 	 * Reset stats contents. This will set reset timestamps of fixed-numbered
 	 * stats to the current time (no variable stats exist).
@@ -1702,6 +1712,10 @@ pgstat_write_statsfile(void)
 		pgstat_write_chunk(fpout,
 						   pgstat_get_entry_data(ps->key.kind, shstats),
 						   pgstat_get_entry_len(ps->key.kind));
+
+		/* Write more data for the entry, if required */
+		if (kind_info->to_serialized_data)
+			kind_info->to_serialized_data(&ps->key, shstats, fpout);
 	}
 	dshash_seq_term(&hstat);
 
@@ -1734,6 +1748,15 @@ pgstat_write_statsfile(void)
 		/* durable_rename already emitted log message */
 		unlink(tmpfile);
 	}
+
+	/* Now, allow stats kinds to finalize the data writes */
+	for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+	{
+		const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+		if (kind_info && kind_info->finish)
+			kind_info->finish(STATS_WRITE);
+	}
 }
 
 /* helper for pgstat_read_statsfile() */
@@ -1871,6 +1894,7 @@ pgstat_read_statsfile(void)
 					PgStat_HashKey key;
 					PgStatShared_HashEntry *p;
 					PgStatShared_Common *header;
+					const PgStat_KindInfo *kind_info = NULL;
 
 					CHECK_FOR_INTERRUPTS();
 
@@ -1891,7 +1915,8 @@ pgstat_read_statsfile(void)
 							goto error;
 						}
 
-						if (!pgstat_get_kind_info(key.kind))
+						kind_info = pgstat_get_kind_info(key.kind);
+						if (!kind_info)
 						{
 							elog(WARNING, "could not find information of kind for entry %u/%u/%" PRIu64 " of type %c",
 								 key.kind, key.dboid,
@@ -1902,7 +1927,6 @@ pgstat_read_statsfile(void)
 					else
 					{
 						/* stats entry identified by name on disk (e.g. slots) */
-						const PgStat_KindInfo *kind_info = NULL;
 						PgStat_Kind kind;
 						NameData	name;
 
@@ -1996,6 +2020,18 @@ pgstat_read_statsfile(void)
 						goto error;
 					}
 
+					/* read more data for the entry, if required */
+					if (kind_info->from_serialized_data)
+					{
+						if (!kind_info->from_serialized_data(&key, header, fpin))
+						{
+							elog(WARNING, "could not read extra stats for entry %u/%u/%" PRIu64 " of type %c",
+								 key.kind, key.dboid,
+								 key.objid, t);
+							goto error;
+						}
+					}
+
 					break;
 				}
 			case PGSTAT_FILE_ENTRY_END:
@@ -2019,11 +2055,21 @@ pgstat_read_statsfile(void)
 	}
 
 done:
+	/* First, cleanup the main stats file, PGSTAT_STAT_PERMANENT_FILENAME */
 	FreeFile(fpin);
 
 	elog(DEBUG2, "removing permanent stats file \"%s\"", statfile);
 	unlink(statfile);
 
+	/* Let each stats kind run its finish callback, if required */
+	for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
+	{
+		const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);
+
+		if (kind_info && kind_info->finish)
+			kind_info->finish(STATS_READ);
+	}
+
 	return;
 
 error:
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 3bf30774a0c9..254c5cf82e4b 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -113,7 +113,7 @@ static const int8 hexlookup[128] = {
  * pg_strtoint16() will throw ereport() upon bad input format or overflow;
  * while pg_strtoint16_safe() instead returns such complaints in *escontext,
  * if it's an ErrorSaveContext.
-*
+ *
  * NB: Accumulate input as an unsigned number, to deal with two's complement
  * representation of the most negative number, which can't be represented as a
  * signed positive number.
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index a4dc1cbe5aec..2d0cb7bcfd4a 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -5643,7 +5643,7 @@ RelationGetIdentityKeyBitmap(Relation relation)
  * This should be called only for an index that is known to have an associated
  * exclusion constraint or primary key/unique constraint using WITHOUT
  * OVERLAPS.
-
+ *
  * It returns arrays (palloc'd in caller's context) of the exclusion operator
  * OIDs, their underlying functions' OIDs, and their strategy numbers in the
  * index's opclasses.  We cache all this information since it requires a fair
diff --git a/src/backend/utils/mb/conv.c b/src/backend/utils/mb/conv.c
index 4a312ab429b6..d53e885b067e 100644
--- a/src/backend/utils/mb/conv.c
+++ b/src/backend/utils/mb/conv.c
@@ -484,7 +484,7 @@ pg_mb_radix_conv(const pg_mb_radix_tree *rt,
  * utf: input string in UTF8 encoding (need not be null-terminated)
  * len: length of input string (in bytes)
  * iso: pointer to the output area (must be large enough!)
-		  (output string will be null-terminated)
+ *		  (output string will be null-terminated)
  * map: conversion map for single characters
  * cmap: conversion map for combined characters
  *		  (optional, pass NULL if none)
@@ -694,7 +694,7 @@ UtfToLocal(const unsigned char *utf, int len,
  * iso: input string in local encoding (need not be null-terminated)
  * len: length of input string (in bytes)
  * utf: pointer to the output area (must be large enough!)
-		  (output string will be null-terminated)
+ *		  (output string will be null-terminated)
  * map: conversion map for single characters
  * cmap: conversion map for combined characters
  *		  (optional, pass NULL if none)
diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index dbce0e61812b..d1701d69b16e 100644
--- a/src/backend/utils/mb/mbutils.c
+++ b/src/backend/utils/mb/mbutils.c
@@ -497,7 +497,8 @@ pg_do_encoding_conversion_buf(Oid proc,
  * Convert string to encoding encoding_name. The source
  * encoding is the DB encoding.
  *
- * BYTEA convert_to(TEXT string, NAME encoding_name) */
+ * BYTEA convert_to(TEXT string, NAME encoding_name)
+ */
 Datum
 pg_convert_to(PG_FUNCTION_ARGS)
 {
@@ -522,7 +523,8 @@ pg_convert_to(PG_FUNCTION_ARGS)
  * Convert string from encoding encoding_name. The destination
  * encoding is the DB encoding.
  *
- * TEXT convert_from(BYTEA string, NAME encoding_name) */
+ * TEXT convert_from(BYTEA string, NAME encoding_name)
+ */
 Datum
 pg_convert_from(PG_FUNCTION_ARGS)
 {
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index f7d63e04c046..935c235e1b39 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -5449,7 +5449,7 @@ ShowGUCOption(const struct config_generic *record, bool use_units)
  *		variable sourceline, integer
  *		variable source, integer
  *		variable scontext, integer
-*		variable srole, OID
+ *		variable srole, OID
  */
 static void
 write_one_nondefault_variable(FILE *fp, struct config_generic *gconf)
diff --git a/src/backend/utils/mmgr/alignedalloc.c b/src/backend/utils/mmgr/alignedalloc.c
index b1be74269149..daee3fc80a1c 100644
--- a/src/backend/utils/mmgr/alignedalloc.c
+++ b/src/backend/utils/mmgr/alignedalloc.c
@@ -23,8 +23,8 @@
 
 /*
  * AlignedAllocFree
-*		Frees allocated memory; memory is removed from its owning context.
-*/
+ *		Frees allocated memory; memory is removed from its owning context.
+ */
 void
 AlignedAllocFree(void *pointer)
 {
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 943da087c9f5..1f2a423f38a6 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -853,7 +853,8 @@ AtAbort_Portals(void)
 /*
  * Post-abort cleanup for portals.
  *
- * Delete all portals not held over from prior transactions.  */
+ * Delete all portals not held over from prior transactions.
+ */
 void
 AtCleanup_Portals(void)
 {
diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c
index 079a51c474d0..a1f5c19ee976 100644
--- a/src/backend/utils/sort/tuplesortvariants.c
+++ b/src/backend/utils/sort/tuplesortvariants.c
@@ -1132,7 +1132,6 @@ tuplesort_getgintuple(Tuplesortstate *state, Size *len, bool forward)
  * efficient, but only safe for callers that are prepared to have any
  * subsequent manipulation of the tuplesort's state invalidate slot contents.
  * For byval Datums, the value of the 'copy' parameter has no effect.
-
  */
 bool
 tuplesort_getdatum(Tuplesortstate *state, bool forward, bool copy,
diff --git a/src/test/modules/test_custom_stats/t/001_custom_stats.pl b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
index e528595cfb0c..378ec22bbdf7 100644
--- a/src/test/modules/test_custom_stats/t/001_custom_stats.pl
+++ b/src/test/modules/test_custom_stats/t/001_custom_stats.pl
@@ -29,13 +29,13 @@ $node->safe_psql('postgres', q(CREATE EXTENSION test_custom_fixed_stats));
 
 # Create entries for variable-sized stats.
 $node->safe_psql('postgres',
-	q(select test_custom_stats_var_create('entry1')));
+	q(select test_custom_stats_var_create('entry1', 'Test entry 1')));
 $node->safe_psql('postgres',
-	q(select test_custom_stats_var_create('entry2')));
+	q(select test_custom_stats_var_create('entry2', 'Test entry 2')));
 $node->safe_psql('postgres',
-	q(select test_custom_stats_var_create('entry3')));
+	q(select test_custom_stats_var_create('entry3', 'Test entry 3')));
 $node->safe_psql('postgres',
-	q(select test_custom_stats_var_create('entry4')));
+	q(select test_custom_stats_var_create('entry4', 'Test entry 4')));
 
 # Update counters: entry1=2, entry2=3, entry3=2, entry4=3, fixed=3
 $node->safe_psql('postgres',
@@ -65,16 +65,28 @@ $node->safe_psql('postgres', q(select test_custom_stats_fixed_update()));
 # Test data reports.
 my $result = $node->safe_psql('postgres',
 	q(select * from test_custom_stats_var_report('entry1')));
-is($result, "entry1|2", "report for variable-sized data of entry1");
+is( $result,
+	"entry1|2|Test entry 1",
+	"report for variable-sized data of entry1");
+
 $result = $node->safe_psql('postgres',
 	q(select * from test_custom_stats_var_report('entry2')));
-is($result, "entry2|3", "report for variable-sized data of entry2");
+is( $result,
+	"entry2|3|Test entry 2",
+	"report for variable-sized data of entry2");
+
 $result = $node->safe_psql('postgres',
 	q(select * from test_custom_stats_var_report('entry3')));
-is($result, "entry3|2", "report for variable-sized data of entry3");
+is( $result,
+	"entry3|2|Test entry 3",
+	"report for variable-sized data of entry3");
+
 $result = $node->safe_psql('postgres',
 	q(select * from test_custom_stats_var_report('entry4')));
-is($result, "entry4|3", "report for variable-sized data of entry4");
+is( $result,
+	"entry4|3|Test entry 4",
+	"report for variable-sized data of entry4");
+
 $result = $node->safe_psql('postgres',
 	q(select * from test_custom_stats_fixed_report()));
 is($result, "3|", "report for fixed-sized stats");
@@ -97,7 +109,16 @@ $node->start();
 
 $result = $node->safe_psql('postgres',
 	q(select * from test_custom_stats_var_report('entry1')));
-is($result, "entry1|2", "variable-sized stats persist after clean restart");
+is( $result,
+	"entry1|2|Test entry 1",
+	"variable-sized stats persist after clean restart");
+
+$result = $node->safe_psql('postgres',
+	q(select * from test_custom_stats_var_report('entry2')));
+is( $result,
+	"entry2|3|Test entry 2",
+	"variable-sized stats persist after clean restart");
+
 $result = $node->safe_psql('postgres',
 	q(select * from test_custom_stats_fixed_report()));
 is($result, "3|", "fixed-sized stats persist after clean restart");
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
index d5f82b5d546e..5ed8cfc2dcf1 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats--1.0.sql
@@ -3,7 +3,7 @@
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION test_custom_var_stats" to load this file. \quit
 
-CREATE FUNCTION test_custom_stats_var_create(IN name TEXT)
+CREATE FUNCTION test_custom_stats_var_create(IN name TEXT, in description TEXT)
 RETURNS void
 AS 'MODULE_PATHNAME', 'test_custom_stats_var_create'
 LANGUAGE C STRICT PARALLEL UNSAFE;
@@ -18,8 +18,9 @@ RETURNS void
 AS 'MODULE_PATHNAME', 'test_custom_stats_var_drop'
 LANGUAGE C STRICT PARALLEL UNSAFE;
 
-
-CREATE FUNCTION test_custom_stats_var_report(INOUT name TEXT, OUT calls BIGINT)
+CREATE FUNCTION test_custom_stats_var_report(INOUT name TEXT,
+                                             OUT calls BIGINT,
+                                             OUT description TEXT)
 RETURNS SETOF record
 AS 'MODULE_PATHNAME', 'test_custom_stats_var_report'
 LANGUAGE C STRICT PARALLEL UNSAFE;
diff --git a/src/test/modules/test_custom_stats/test_custom_var_stats.c b/src/test/modules/test_custom_stats/test_custom_var_stats.c
index d4905ab4ee99..9cc9ff8cd9f9 100644
--- a/src/test/modules/test_custom_stats/test_custom_var_stats.c
+++ b/src/test/modules/test_custom_stats/test_custom_var_stats.c
@@ -14,6 +14,7 @@
 
 #include "common/hashfn.h"
 #include "funcapi.h"
+#include "storage/dsm_registry.h"
 #include "utils/builtins.h"
 #include "utils/pgstat_internal.h"
 
@@ -22,6 +23,8 @@ PG_MODULE_MAGIC_EXT(
 					.version = PG_VERSION
 );
 
+#define TEST_CUSTOM_VAR_MAGIC_NUMBER (0xBEEFBEEF)
+
 /*--------------------------------------------------------------------------
  * Macros and constants
  *--------------------------------------------------------------------------
@@ -32,6 +35,9 @@ PG_MODULE_MAGIC_EXT(
  */
 #define PGSTAT_KIND_TEST_CUSTOM_VAR_STATS 25
 
+/* File paths for extra statistics data serialization */
+#define TEST_CUSTOM_EXTRA_DATA_DESC "pg_stat/test_custom_var_stats_desc.stats"
+
 /*
  * Hash statistic name to generate entry index for pgstat lookup.
  */
@@ -53,8 +59,23 @@ typedef struct PgStatShared_CustomVarEntry
 {
 	PgStatShared_Common header; /* standard pgstat entry header */
 	PgStat_StatCustomVarEntry stats;	/* custom statistics data */
+	dsa_pointer description;	/* extra statistics data */
 } PgStatShared_CustomVarEntry;
 
+/*--------------------------------------------------------------------------
+ * Global Variables
+ *--------------------------------------------------------------------------
+ */
+
+/* File handle for extra statistics data serialization */
+static FILE *fd_description = NULL;
+
+/* Current write offset in fd_description file */
+static long fd_description_offset = 0;
+
+/* DSA area for storing variable-length description strings */
+static dsa_area *custom_stats_description_dsa = NULL;
+
 /*--------------------------------------------------------------------------
  * Function prototypes
  *--------------------------------------------------------------------------
@@ -64,6 +85,19 @@ typedef struct PgStatShared_CustomVarEntry
 static bool test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref,
 												   bool nowait);
 
+/* Serialization callback: serialize extra statistics data */
+static void test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
+													 const PgStatShared_Common *header,
+													 FILE *statfile);
+
+/* Deserialization callback: deserialize extra statistics data */
+static bool test_custom_stats_var_from_serialized_data(const PgStat_HashKey *key,
+													   const PgStatShared_Common *header,
+													   FILE *statfile);
+
+/* Cleanup callback: end of statistics file operations */
+static void test_custom_stats_var_finish(PgStat_StatsFileOp status);
+
 /*--------------------------------------------------------------------------
  * Custom kind configuration
  *--------------------------------------------------------------------------
@@ -80,6 +114,9 @@ static const PgStat_KindInfo custom_stats = {
 	.shared_data_len = sizeof(((PgStatShared_CustomVarEntry *) 0)->stats),
 	.pending_size = sizeof(PgStat_StatCustomVarEntry),
 	.flush_pending_cb = test_custom_stats_var_flush_pending_cb,
+	.to_serialized_data = test_custom_stats_var_to_serialized_data,
+	.from_serialized_data = test_custom_stats_var_from_serialized_data,
+	.finish = test_custom_stats_var_finish,
 };
 
 /*--------------------------------------------------------------------------
@@ -132,6 +169,309 @@ test_custom_stats_var_flush_pending_cb(PgStat_EntryRef *entry_ref, bool nowait)
 	return true;
 }
 
+/*
+ * test_custom_stats_var_to_serialized_data() -
+ *
+ * Serialize extra data (descriptions) for custom statistics entries to
+ * the statistics file.  Called during statistics file writing to preserve
+ * description strings across restarts.
+ *
+ * This callback writes a mix of data within the main pgstats file and a
+ * secondary file.  The following data is written to the main file for
+ * each entry:
+ * - An arbitrary magic number.
+ * - An offset.  This is used to know the location we need to look at
+ * to retrieve the information from the second file.
+ *
+ * The following data is written to the secondary file:
+ * - The entry key, cross-checked with the data from the main file
+ * when reloaded.
+ * - The length of the description.
+ * - The description data itself.
+ */
+static void
+test_custom_stats_var_to_serialized_data(const PgStat_HashKey *key,
+										 const PgStatShared_Common *header,
+										 FILE *statfile)
+{
+	char	   *description;
+	size_t		len;
+	PgStatShared_CustomVarEntry *entry = (PgStatShared_CustomVarEntry *) header;
+	bool		found;
+	uint32		magic_number = TEST_CUSTOM_VAR_MAGIC_NUMBER;
+
+	/*
+	 * First mark the main file with a magic number, keeping a trace that some
+	 * extra data will exist in the secondary file.
+	 */
+	pgstat_write_chunk_s(statfile, &magic_number);
+
+	/* Open statistics file for writing. */
+	if (!fd_description)
+	{
+		fd_description = AllocateFile(TEST_CUSTOM_EXTRA_DATA_DESC, PG_BINARY_W);
+		if (fd_description == NULL)
+		{
+			ereport(LOG,
+					(errcode_for_file_access(),
+					 errmsg("could not open statistics file \"%s\" for writing: %m",
+							TEST_CUSTOM_EXTRA_DATA_DESC)));
+			return;
+		}
+
+		/* Initialize offset for secondary file. */
+		fd_description_offset = 0;
+	}
+
+	/* Write offset to the main data file */
+	pgstat_write_chunk_s(statfile, &fd_description_offset);
+
+	/*
+	 * First write the entry key to the secondary file.  This will be
+	 * cross-checked with the key read from main stats file at loading time.
+	 */
+	pgstat_write_chunk_s(fd_description, (PgStat_HashKey *) key);
+	fd_description_offset += sizeof(PgStat_HashKey);
+
+	if (!custom_stats_description_dsa)
+		custom_stats_description_dsa = GetNamedDSA("test_custom_stat_dsa", &found);
+
+	/* Handle entries without descriptions */
+	if (!DsaPointerIsValid(entry->description) || !custom_stats_description_dsa)
+	{
+		/* length to description file */
+		len = 0;
+		pgstat_write_chunk_s(fd_description, &len);
+		fd_description_offset += sizeof(size_t);
+		return;
+	}
+
+	/*
+	 * Retrieve description from DSA, then write the length followed by the
+	 * description.
+	 */
+	description = dsa_get_address(custom_stats_description_dsa,
+								  entry->description);
+	len = strlen(description) + 1;
+	pgstat_write_chunk_s(fd_description, &len);
+	pgstat_write_chunk(fd_description, description, len);
+
+	/*
+	 * Update offset for next entry, counting for the length (size_t) of the
+	 * description and the description contents.
+	 */
+	fd_description_offset += len + sizeof(size_t);
+}
+
+/*
+ * test_custom_stats_var_from_serialized_data() -
+ *
+ * Read extra data (descriptions) for custom statistics entries from
+ * the statistics file.  This is called while loading the statistics
+ * at startup.
+ *
+ * See the top of test_custom_stats_var_to_serialized_data() for a
+ * detailed description of the data layout read here.
+ */
+static bool
+test_custom_stats_var_from_serialized_data(const PgStat_HashKey *key,
+										   const PgStatShared_Common *header,
+										   FILE *statfile)
+{
+	PgStatShared_CustomVarEntry *entry;
+	dsa_pointer dp;
+	size_t		len;
+	long		offset;
+	char	   *buffer;
+	bool		found;
+	uint32		magic_number = 0;
+	PgStat_HashKey file_key;
+
+	/* Check the magic number first, in the main file. */
+	if (!pgstat_read_chunk_s(statfile, &magic_number))
+	{
+		elog(WARNING, "failed to read magic number from statistics file");
+		return false;
+	}
+
+	if (magic_number != TEST_CUSTOM_VAR_MAGIC_NUMBER)
+	{
+		elog(WARNING, "found magic number %u from statistics file, should be %u",
+			 magic_number, TEST_CUSTOM_VAR_MAGIC_NUMBER);
+		return false;
+	}
+
+	/*
+	 * Read the offset from the main stats file, to be able to read the extra
+	 * data from the secondary file.
+	 */
+	if (!pgstat_read_chunk_s(statfile, &offset))
+	{
+		elog(WARNING, "failed to read metadata offset from statistics file");
+		return false;
+	}
+
+	/* Open statistics file for reading if not already open */
+	if (!fd_description)
+	{
+		fd_description = AllocateFile(TEST_CUSTOM_EXTRA_DATA_DESC, PG_BINARY_R);
+		if (fd_description == NULL)
+		{
+			if (errno != ENOENT)
+				ereport(LOG,
+						(errcode_for_file_access(),
+						 errmsg("could not open statistics file \"%s\" for reading: %m",
+								TEST_CUSTOM_EXTRA_DATA_DESC)));
+			pgstat_reset_of_kind(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS);
+			return false;
+		}
+	}
+
+	/* Read data from the secondary file, at the specified offset */
+	if (fseek(fd_description, offset, SEEK_SET) != 0)
+	{
+		elog(WARNING, "failed to seek to offset %ld in description file", offset);
+		return false;
+	}
+
+	/* Read the hash key from the secondary file */
+	if (!pgstat_read_chunk_s(fd_description, &file_key))
+	{
+		elog(WARNING, "failed to read hash key from file");
+		return false;
+	}
+
+	/* Check key consistency */
+	if (file_key.kind != key->kind ||
+		file_key.dboid != key->dboid ||
+		file_key.objid != key->objid)
+	{
+		elog(WARNING, "found entry key %u/%u/%" PRIu64 " not matching with %u/%u/%" PRIu64,
+			 file_key.kind, file_key.dboid, file_key.objid,
+			 key->kind, key->dboid, key->objid);
+		return false;
+	}
+
+	entry = (PgStatShared_CustomVarEntry *) header;
+
+	/* Read the description length and its data */
+	if (!pgstat_read_chunk_s(fd_description, &len))
+	{
+		elog(WARNING, "failed to read metadata length from statistics file");
+		return false;
+	}
+
+	/* Handle empty descriptions */
+	if (len == 0)
+	{
+		entry->description = InvalidDsaPointer;
+		return true;
+	}
+
+	/* Initialize DSA if needed */
+	if (!custom_stats_description_dsa)
+		custom_stats_description_dsa = GetNamedDSA("test_custom_stat_dsa", &found);
+
+	if (!custom_stats_description_dsa)
+	{
+		elog(WARNING, "could not access DSA for custom statistics descriptions");
+		return false;
+	}
+
+	buffer = palloc(len);
+	if (!pgstat_read_chunk(fd_description, buffer, len))
+	{
+		pfree(buffer);
+		elog(WARNING, "failed to read description from file");
+		return false;
+	}
+
+	/* Allocate space in DSA and copy the description */
+	dp = dsa_allocate(custom_stats_description_dsa, len);
+	memcpy(dsa_get_address(custom_stats_description_dsa, dp), buffer, len);
+	entry->description = dp;
+	pfree(buffer);
+
+	return true;
+}
+
+/*
+ * test_custom_stats_var_finish() -
+ *
+ * Cleanup function called at the end of statistics file operations.
+ * Handles closing files and cleanup based on the operation type.
+ */
+static void
+test_custom_stats_var_finish(PgStat_StatsFileOp status)
+{
+	switch (status)
+	{
+		case STATS_WRITE:
+			if (!fd_description)
+				return;
+
+			fd_description_offset = 0;
+
+			/* Check for write errors and cleanup if necessary */
+			if (ferror(fd_description))
+			{
+				ereport(LOG,
+						(errcode_for_file_access(),
+						 errmsg("could not write statistics file \"%s\": %m",
+								TEST_CUSTOM_EXTRA_DATA_DESC)));
+				FreeFile(fd_description);
+				unlink(TEST_CUSTOM_EXTRA_DATA_DESC);
+			}
+			else if (FreeFile(fd_description) < 0)
+			{
+				ereport(LOG,
+						(errcode_for_file_access(),
+						 errmsg("could not close statistics file \"%s\": %m",
+								TEST_CUSTOM_EXTRA_DATA_DESC)));
+				unlink(TEST_CUSTOM_EXTRA_DATA_DESC);
+			}
+			break;
+
+		case STATS_READ:
+			if (fd_description)
+				FreeFile(fd_description);
+
+			/* Remove the temporary statistics file after reading */
+			elog(DEBUG2, "removing statistics file \"%s\"", TEST_CUSTOM_EXTRA_DATA_DESC);
+			unlink(TEST_CUSTOM_EXTRA_DATA_DESC);
+			break;
+
+		case STATS_DISCARD:
+			{
+				int			ret;
+
+				/* Attempt to remove the statistics file */
+				ret = unlink(TEST_CUSTOM_EXTRA_DATA_DESC);
+				if (ret != 0)
+				{
+					if (errno == ENOENT)
+						elog(LOG,
+							 "didn't need to unlink permanent stats file \"%s\" - didn't exist",
+							 TEST_CUSTOM_EXTRA_DATA_DESC);
+					else
+						ereport(LOG,
+								(errcode_for_file_access(),
+								 errmsg("could not unlink permanent statistics file \"%s\": %m",
+										TEST_CUSTOM_EXTRA_DATA_DESC)));
+				}
+				else
+				{
+					ereport(LOG,
+							(errmsg_internal("unlinked permanent statistics file \"%s\"",
+											 TEST_CUSTOM_EXTRA_DATA_DESC)));
+				}
+			}
+			break;
+	}
+
+	fd_description = NULL;
+}
+
 /*--------------------------------------------------------------------------
  * Helper functions
  *--------------------------------------------------------------------------
@@ -162,8 +502,7 @@ test_custom_stats_var_fetch_entry(const char *stat_name)
  * test_custom_stats_var_create
  *		Create new custom statistic entry
  *
- * Initializes a zero-valued statistics entry in shared memory.
- * Validates name length against NAMEDATALEN limit.
+ * Initializes a statistics entry with the given name and description.
  */
 PG_FUNCTION_INFO_V1(test_custom_stats_var_create);
 Datum
@@ -172,6 +511,9 @@ test_custom_stats_var_create(PG_FUNCTION_ARGS)
 	PgStat_EntryRef *entry_ref;
 	PgStatShared_CustomVarEntry *shared_entry;
 	char	   *stat_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	char	   *description = text_to_cstring(PG_GETARG_TEXT_PP(1));
+	dsa_pointer dp = InvalidDsaPointer;
+	bool		found;
 
 	/* Validate name length first */
 	if (strlen(stat_name) >= NAMEDATALEN)
@@ -180,6 +522,20 @@ test_custom_stats_var_create(PG_FUNCTION_ARGS)
 				 errmsg("custom statistic name \"%s\" is too long", stat_name),
 				 errdetail("Name must be less than %d characters.", NAMEDATALEN)));
 
+	/* Initialize DSA and description provided */
+	if (!custom_stats_description_dsa)
+		custom_stats_description_dsa = GetNamedDSA("test_custom_stat_dsa", &found);
+
+	if (!custom_stats_description_dsa)
+		ereport(ERROR,
+				(errmsg("could not access DSA for custom statistics descriptions")));
+
+	/* Allocate space in DSA and copy description */
+	dp = dsa_allocate(custom_stats_description_dsa, strlen(description) + 1);
+	memcpy(dsa_get_address(custom_stats_description_dsa, dp),
+		   description,
+		   strlen(description) + 1);
+
 	/* Create or get existing entry */
 	entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS, InvalidOid,
 											PGSTAT_CUSTOM_VAR_STATS_IDX(stat_name), true);
@@ -192,6 +548,9 @@ test_custom_stats_var_create(PG_FUNCTION_ARGS)
 	/* Zero-initialize statistics */
 	memset(&shared_entry->stats, 0, sizeof(shared_entry->stats));
 
+	/* Store description pointer */
+	shared_entry->description = dp;
+
 	pgstat_unlock_entry(entry_ref);
 
 	PG_RETURN_VOID();
@@ -226,8 +585,7 @@ test_custom_stats_var_update(PG_FUNCTION_ARGS)
  * test_custom_stats_var_drop
  *		Remove custom statistic entry
  *
- * Drops the named statistic from shared memory and requests
- * garbage collection if needed.
+ * Drops the named statistic from shared memory.
  */
 PG_FUNCTION_INFO_V1(test_custom_stats_var_drop);
 Datum
@@ -247,7 +605,7 @@ test_custom_stats_var_drop(PG_FUNCTION_ARGS)
  * test_custom_stats_var_report
  *		Retrieve custom statistic values
  *
- * Returns single row with statistic name and call count if the
+ * Returns single row with statistic name, call count, and description if the
  * statistic exists, otherwise returns no rows.
  */
 PG_FUNCTION_INFO_V1(test_custom_stats_var_report);
@@ -281,9 +639,13 @@ test_custom_stats_var_report(PG_FUNCTION_ARGS)
 
 	if (funcctx->call_cntr < funcctx->max_calls)
 	{
-		Datum		values[2];
-		bool		nulls[2] = {false, false};
+		Datum		values[3];
+		bool		nulls[3] = {false, false, false};
 		HeapTuple	tuple;
+		PgStat_EntryRef *entry_ref;
+		PgStatShared_CustomVarEntry *shared_entry;
+		char	   *description = NULL;
+		bool		found;
 
 		stat_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
 		stat_entry = test_custom_stats_var_fetch_entry(stat_name);
@@ -291,9 +653,33 @@ test_custom_stats_var_report(PG_FUNCTION_ARGS)
 		/* Return row only if entry exists */
 		if (stat_entry)
 		{
+			/* Get entry ref to access shared entry */
+			entry_ref = pgstat_get_entry_ref(PGSTAT_KIND_TEST_CUSTOM_VAR_STATS, InvalidOid,
+											 PGSTAT_CUSTOM_VAR_STATS_IDX(stat_name), false, NULL);
+
+			if (entry_ref)
+			{
+				shared_entry = (PgStatShared_CustomVarEntry *) entry_ref->shared_stats;
+
+				/* Get description from DSA if available */
+				if (DsaPointerIsValid(shared_entry->description))
+				{
+					if (!custom_stats_description_dsa)
+						custom_stats_description_dsa = GetNamedDSA("test_custom_stat_dsa", &found);
+
+					if (custom_stats_description_dsa)
+						description = dsa_get_address(custom_stats_description_dsa, shared_entry->description);
+				}
+			}
+
 			values[0] = PointerGetDatum(cstring_to_text(stat_name));
 			values[1] = Int64GetDatum(stat_entry->numcalls);
 
+			if (description)
+				values[2] = PointerGetDatum(cstring_to_text(description));
+			else
+				nulls[2] = true;
+
 			tuple = heap_form_tuple(funcctx->tuple_desc, values, nulls);
 			SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
 		}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 9dd65b102544..14f7f8217491 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2269,6 +2269,7 @@ PgStat_StatFuncEntry
 PgStat_StatReplSlotEntry
 PgStat_StatSubEntry
 PgStat_StatTabEntry
+PgStat_StatsFileOp
 PgStat_SubXactStatus
 PgStat_TableCounts
 PgStat_TableStatus
-- 
2.51.0

Attachment: signature.asc
Description: PGP signature

Reply via email to