On Thu, Dec 21, 2023 at 3:58 PM Melanie Plageman
<melanieplage...@gmail.com> wrote:
>
> On Wed, Dec 13, 2023 at 12:24 PM Robert Haas <robertmh...@gmail.com> wrote:
> > On Sat, Dec 9, 2023 at 5:12 AM Melanie Plageman
> > <melanieplage...@gmail.com> wrote:
> > > The goal is to keep pages frozen for at least target_freeze_duration.
> > > target_freeze_duration is in seconds and pages only have a last
> > > modification LSN, so target_freeze_duration must be converted to LSNs.
> > > To accomplish this, I've added an LSNTimeline data structure, containing
> > > XLogRecPtr, TimestampTz pairs stored with decreasing precision as they
> > > age. When we need to translate the guc value to LSNs, we linearly
> > > interpolate it on this timeline. For the time being, the global
> > > LSNTimeline is in PgStat_WalStats and is only updated by vacuum. There
> > > is no reason it can't be updated with some other cadence and/or by some
> > > other process (nothing about it is inherently tied to vacuum). The
> > > cached translated value of target_freeze_duration is stored in each
> > > table's stats. This is arbitrary as it is not a table-level stat.
> > > However, it needs to be located somewhere that is accessible on
> > > update/delete. We may want to recalculate it more often than once per
> > > table vacuum, especially in case of long-running vacuums.
> >
> > This part sounds like it isn't quite baked yet. The idea of the data
> > structure seems fine, but updating it once per vacuum sounds fairly
> > unprincipled to me? Don't we want the updates to happen on a somewhat
> > regular wall clock cadence?
>
> Yes, this part was not fully baked. I actually discussed this with
> Andres at PGConf EU last week and he suggested that background writer
> update the LSNTimeline. He also suggested I propose the LSNTimeline in
> a new thread. I could add a pageinspect function returning the
> estimated time of last page modification given the page LSN (so it is
> proposed with a user).

I've rebased this over top of the revised LSNTimeline functionality
proposed separately in [1].
It is also registered for the current commitfest.

I plan to add the decay logic and benchmark it this week.

- Melanie

[1] 
https://www.postgresql.org/message-id/CAAKRu_Z7tR7D1=DR=xwqdefyk_nu_gxgw88x0htxn6ask-8...@mail.gmail.com
From 6fe9f26ebe6bd773e631bcdf0b7ea928a4ca8e27 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 27 Dec 2023 16:40:27 -0500
Subject: [PATCH v3 02/12] Add LSNTimeline for converting LSN <-> time

Add a new structure, LSNTimeline, consisting of LSNTimes -- each an LSN,
time pair. Each LSNTime can represent multiple logical LSN, time pairs,
referred to as members. LSN <-> time conversions can be done using
linear interpolation with two LSNTimes on the LSNTimeline.

This commit does not add a global instance of LSNTimeline. It adds the
structures and functions needed to maintain and access such a timeline.
---
 src/backend/utils/activity/pgstat_wal.c | 199 ++++++++++++++++++++++++
 src/include/pgstat.h                    |  34 ++++
 src/tools/pgindent/typedefs.list        |   2 +
 3 files changed, 235 insertions(+)

diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index 6a81b781357..ba40aad258a 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -17,8 +17,11 @@
 
 #include "postgres.h"
 
+#include "access/xlog.h"
 #include "utils/pgstat_internal.h"
 #include "executor/instrument.h"
+#include "utils/builtins.h"
+#include "utils/timestamp.h"
 
 
 PgStat_PendingWalStats PendingWalStats = {0};
@@ -32,6 +35,12 @@ PgStat_PendingWalStats PendingWalStats = {0};
 static WalUsage prevWalUsage;
 
 
+static void lsntime_absorb(LSNTime *a, const LSNTime *b);
+void lsntime_insert(LSNTimeline *timeline, TimestampTz time, XLogRecPtr lsn);
+
+XLogRecPtr estimate_lsn_at_time(const LSNTimeline *timeline, TimestampTz time);
+TimestampTz estimate_time_at_lsn(const LSNTimeline *timeline, XLogRecPtr lsn);
+
 /*
  * Calculate how much WAL usage counters have increased and update
  * shared WAL and IO statistics.
@@ -184,3 +193,193 @@ pgstat_wal_snapshot_cb(void)
 		   sizeof(pgStatLocal.snapshot.wal));
 	LWLockRelease(&stats_shmem->lock);
 }
+
+/*
+ * Set *a to be the earlier of *a or *b.
+ */
+static void
+lsntime_absorb(LSNTime *a, const LSNTime *b)
+{
+	LSNTime		result;
+	int			new_members = a->members + b->members;
+
+	if (a->time < b->time)
+		result = *a;
+	else if (b->time < a->time)
+		result = *b;
+	else if (a->lsn < b->lsn)
+		result = *a;
+	else if (b->lsn < a->lsn)
+		result = *b;
+	else
+		result = *a;
+
+	*a = result;
+	a->members = new_members;
+}
+
+/*
+ * Insert a new LSNTime into the LSNTimeline in the first element with spare
+ * capacity.
+ */
+void
+lsntime_insert(LSNTimeline *timeline, TimestampTz time,
+			   XLogRecPtr lsn)
+{
+	LSNTime		temp;
+	LSNTime		carry = {.lsn = lsn,.time = time,.members = 1};
+
+	for (int i = 0; i < timeline->length; i++)
+	{
+		bool		full;
+		LSNTime    *cur = &timeline->data[i];
+
+		/*
+		 * An array element's capacity to represent members is 2 ^ its
+		 * position in the array.
+		 */
+		full = cur->members >= (1 << i);
+
+		/*
+		 * If the current element is not yet at capacity, then insert the
+		 * passed-in LSNTime into this element by taking the smaller of the it
+		 * and the current LSNTime element. This is required to ensure that
+		 * time moves forward on the timeline.
+		 */
+		if (!full)
+		{
+			Assert(cur->members == carry.members);
+			Assert(cur->members + carry.members <= 1 << i);
+			lsntime_absorb(cur, &carry);
+			return;
+		}
+
+		/*
+		 * If the current element is full, ensure that the inserting LSNTime
+		 * is larger than the current element. This must be true for time to
+		 * move forward on the timeline.
+		 */
+		Assert(carry.lsn >= cur->lsn || carry.time >= cur->time);
+
+		/*
+		 * If the element is at capacity, swap the element with the carry and
+		 * continue on to find an element with space to represent the new
+		 * member.
+		 */
+		temp = *cur;
+		*cur = carry;
+		carry = temp;
+	}
+
+	/*
+	 * Time to use another element in the array -- and increase the length in
+	 * the process
+	 */
+	timeline->data[timeline->length] = carry;
+	timeline->length++;
+}
+
+
+/*
+ * Translate time to a LSN using the provided timeline. The timeline will not
+ * be modified.
+ */
+XLogRecPtr
+estimate_lsn_at_time(const LSNTimeline *timeline, TimestampTz time)
+{
+	TimestampTz time_elapsed;
+	XLogRecPtr	lsns_elapsed;
+	double		result;
+
+	LSNTime		start = {.time = PgStartTime,.lsn = PgStartLSN};
+	LSNTime		end = {.time = GetCurrentTimestamp(),.lsn = GetXLogInsertRecPtr()};
+
+	/*
+	 * If the target time is after the current time, our best estimate of the
+	 * LSN is the current insert LSN.
+	 */
+	if (time >= end.time)
+		return end.lsn;
+
+	for (int i = 0; i < timeline->length; i++)
+	{
+		/* Pass times more recent than our target time */
+		if (timeline->data[i].time > time)
+			continue;
+
+		/* Found the first element before our target time */
+		start = timeline->data[i];
+
+		/*
+		 * If there is only one element in the array, use the current time as
+		 * the end of the range. Otherwise it is the element preceding our
+		 * start.
+		 */
+		if (i > 0)
+			end = timeline->data[i - 1];
+		break;
+	}
+
+	time_elapsed = end.time - start.time;
+	Assert(time_elapsed != 0);
+
+	lsns_elapsed = end.lsn - start.lsn;
+	Assert(lsns_elapsed != 0);
+
+	result = (double) (time - start.time) / time_elapsed * lsns_elapsed + start.lsn;
+	if (result < 0)
+		return InvalidXLogRecPtr;
+	return result;
+}
+
+/*
+ * Translate lsn to a time using the provided timeline. The timeline will not
+ * be modified.
+ */
+TimestampTz
+estimate_time_at_lsn(const LSNTimeline *timeline, XLogRecPtr lsn)
+{
+	TimestampTz time_elapsed;
+	XLogRecPtr	lsns_elapsed;
+	TimestampTz result;
+
+	LSNTime		start = {.time = PgStartTime,.lsn = PgStartLSN};
+	LSNTime		end = {.time = GetCurrentTimestamp(),.lsn = GetXLogInsertRecPtr()};
+
+	/*
+	 * If the target LSN is after the current insert LSN, the current time is
+	 * our best estimate.
+	 */
+	if (lsn >= end.lsn)
+		return end.time;
+
+	for (int i = 0; i < timeline->length; i++)
+	{
+		/* Pass LSNs more recent than our target LSN */
+		if (timeline->data[i].lsn > lsn)
+			continue;
+
+		/* Found the first element before our target LSN */
+		start = timeline->data[i];
+
+		/*
+		 * If there is only one element in the array, use the current LSN and
+		 * time as the end of the range. Otherwise, use the preceding element
+		 * (the first element occuring before our target LSN in the timeline).
+		 */
+		if (i > 0)
+			end = timeline->data[i - 1];
+		break;
+	}
+
+	time_elapsed = end.time - start.time;
+	Assert(time_elapsed != 0);
+
+	lsns_elapsed = end.lsn - start.lsn;
+	Assert(lsns_elapsed != 0);
+
+	result = (lsn - start.lsn) / lsns_elapsed * time_elapsed + start.time;
+	if (result < 0)
+		return 0;
+	return result;
+}
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ab91b3b367d..ddbe320bf3e 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -11,6 +11,7 @@
 #ifndef PGSTAT_H
 #define PGSTAT_H
 
+#include "access/xlogdefs.h"
 #include "datatype/timestamp.h"
 #include "portability/instr_time.h"
 #include "postmaster/pgarch.h"	/* for MAX_XFN_CHARS */
@@ -428,6 +429,39 @@ typedef struct PgStat_StatTabEntry
 	PgStat_Counter autoanalyze_count;
 } PgStat_StatTabEntry;
 
+/*
+ * The elements of an LSNTimeline. Each LSNTime represents one or more time,
+ * LSN pairs. The LSN is typically the insert LSN recorded at the time. Members
+ * is the number of logical members -- each a time, LSN pair -- represented in
+ * the LSNTime.
+ */
+typedef struct LSNTime
+{
+	TimestampTz time;
+	XLogRecPtr	lsn;
+	uint64		members;
+} LSNTime;
+
+/*
+ * A timeline consists of LSNTimes from most to least recent. Each element of
+ * the array in the timeline may represent 2^array index logical members --
+ * meaning that each element's capacity is twice that of the preceding element.
+ * This gives more recent times greater precision than less recent ones. An
+ * array of size 64 should provide sufficient capacity without accounting for
+ * what to do when all elements of the array are at capacity.
+ *
+ * When LSNTimes are inserted into the timeline, they are absorbed into the
+ * first array element with spare capacity -- with the new combined element
+ * having the lesser of the two values. The timeline's length is the highest
+ * array index representing one or more logical members. Use the timeline for
+ * LSN <-> time conversion using linear interpolation.
+ */
+typedef struct LSNTimeline
+{
+	int			length;
+	LSNTime		data[64];
+} LSNTimeline;
+
 typedef struct PgStat_WalStats
 {
 	PgStat_Counter wal_records;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5f1a017d2dc..49014d0367e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1520,6 +1520,8 @@ LogicalTapeSet
 LsnReadQueue
 LsnReadQueueNextFun
 LsnReadQueueNextStatus
+LSNTime
+LSNTimeline
 LtreeGistOptions
 LtreeSignature
 MAGIC
-- 
2.37.2

From c28334f588cbe8535bf45e272295a52850b2ff95 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 27 Dec 2023 16:41:47 -0500
Subject: [PATCH v3 03/12] Add LSNTimeline to PgStat_WalStats

Add a globally maintained instance of the new LSNTimeline to
PgStat_WalStats and add utility functions for maintaining and accessing
it. This commit does not insert new values to the timeline or use the
helpers to access it.
---
 src/backend/utils/activity/pgstat_wal.c | 48 +++++++++++++++++++++----
 src/include/pgstat.h                    |  6 ++++
 2 files changed, 48 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index ba40aad258a..594185acb97 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -36,10 +36,10 @@ static WalUsage prevWalUsage;
 
 
 static void lsntime_absorb(LSNTime *a, const LSNTime *b);
-void lsntime_insert(LSNTimeline *timeline, TimestampTz time, XLogRecPtr lsn);
+static void lsntime_insert(LSNTimeline *timeline, TimestampTz time, XLogRecPtr lsn);
 
-XLogRecPtr estimate_lsn_at_time(const LSNTimeline *timeline, TimestampTz time);
-TimestampTz estimate_time_at_lsn(const LSNTimeline *timeline, XLogRecPtr lsn);
+static XLogRecPtr estimate_lsn_at_time(const LSNTimeline *timeline, TimestampTz time);
+static TimestampTz estimate_time_at_lsn(const LSNTimeline *timeline, XLogRecPtr lsn);
 
 /*
  * Calculate how much WAL usage counters have increased and update
@@ -222,7 +222,7 @@ lsntime_absorb(LSNTime *a, const LSNTime *b)
  * Insert a new LSNTime into the LSNTimeline in the first element with spare
  * capacity.
  */
-void
+static void
 lsntime_insert(LSNTimeline *timeline, TimestampTz time,
 			   XLogRecPtr lsn)
 {
@@ -284,7 +284,7 @@ lsntime_insert(LSNTimeline *timeline, TimestampTz time,
  * Translate time to a LSN using the provided timeline. The timeline will not
  * be modified.
  */
-XLogRecPtr
+static XLogRecPtr
 estimate_lsn_at_time(const LSNTimeline *timeline, TimestampTz time)
 {
 	TimestampTz time_elapsed;
@@ -336,7 +336,7 @@ estimate_lsn_at_time(const LSNTimeline *timeline, TimestampTz time)
  * Translate lsn to a time using the provided timeline. The timeline will not
  * be modified.
  */
-TimestampTz
+static TimestampTz
 estimate_time_at_lsn(const LSNTimeline *timeline, XLogRecPtr lsn)
 {
 	TimestampTz time_elapsed;
@@ -383,3 +383,39 @@ estimate_time_at_lsn(const LSNTimeline *timeline, XLogRecPtr lsn)
 		return 0;
 	return result;
 }
+
+XLogRecPtr
+pgstat_wal_estimate_lsn_at_time(TimestampTz time)
+{
+	XLogRecPtr	result;
+	PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
+
+	LWLockAcquire(&stats_shmem->lock, LW_SHARED);
+	result = estimate_lsn_at_time(&stats_shmem->stats.timeline, time);
+	LWLockRelease(&stats_shmem->lock);
+
+	return result;
+}
+
+TimestampTz
+pgstat_wal_estimate_time_at_lsn(XLogRecPtr lsn)
+{
+	TimestampTz result;
+	PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
+
+	LWLockAcquire(&stats_shmem->lock, LW_SHARED);
+	result = estimate_time_at_lsn(&stats_shmem->stats.timeline, lsn);
+	LWLockRelease(&stats_shmem->lock);
+
+	return result;
+}
+
+void
+pgstat_wal_update_lsntimeline(TimestampTz time, XLogRecPtr lsn)
+{
+	PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
+
+	LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
+	lsntime_insert(&stats_shmem->stats.timeline, time, lsn);
+	LWLockRelease(&stats_shmem->lock);
+}
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index ddbe320bf3e..dd914e606eb 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -472,6 +472,7 @@ typedef struct PgStat_WalStats
 	PgStat_Counter wal_sync;
 	PgStat_Counter wal_write_time;
 	PgStat_Counter wal_sync_time;
+	LSNTimeline timeline;
 	TimestampTz stat_reset_timestamp;
 } PgStat_WalStats;
 
@@ -754,6 +755,11 @@ extern void pgstat_execute_transactional_drops(int ndrops, struct xl_xact_stats_
 extern void pgstat_report_wal(bool force);
 extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
 
+/* Helpers for maintaining the LSNTimeline */
+extern XLogRecPtr pgstat_wal_estimate_lsn_at_time(TimestampTz time);
+extern TimestampTz pgstat_wal_estimate_time_at_lsn(XLogRecPtr lsn);
+extern void pgstat_wal_update_lsntimeline(TimestampTz time, XLogRecPtr lsn);
+
 
 /*
  * Variables in pgstat.c
-- 
2.37.2

From b56f47674313fb423d2faf610cdad41f5874fa87 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Tue, 5 Dec 2023 07:21:34 -0500
Subject: [PATCH v3 05/12] visibilitymap_set/clear() return previous vm bits

Modify visibilitymap_set() and visibilitymap_clear() to return the state
of the block's all frozen and all visible bits prior to modification.
Previously, visibilitymap_set() returned only whether or not provided
bits were cleared and visibilitymap_clear() returned nothing.
It can be useful to have the status of both of the VM bits prior to
modification.
---
 src/backend/access/heap/heapam.c        | 28 +++++++++++++------------
 src/backend/access/heap/visibilitymap.c | 18 +++++++++-------
 src/include/access/visibilitymap.h      | 10 ++++-----
 3 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index f9387153595..fa21a5a09a0 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -3561,10 +3561,11 @@ l2:
 		 * overhead would be unchanged, that doesn't seem necessarily
 		 * worthwhile.
 		 */
-		if (PageIsAllVisible(page) &&
-			visibilitymap_clear(relation, block, vmbuffer,
-								VISIBILITYMAP_ALL_FROZEN))
-			cleared_all_frozen = true;
+		if (PageIsAllVisible(page))
+		{
+			cleared_all_frozen = visibilitymap_clear(relation, block, vmbuffer,
+								VISIBILITYMAP_ALL_FROZEN) & VISIBILITYMAP_ALL_FROZEN;
+		}
 
 		MarkBufferDirty(buffer);
 
@@ -4758,11 +4759,11 @@ failed:
 		tuple->t_data->t_ctid = *tid;
 
 	/* Clear only the all-frozen bit on visibility map if needed */
-	if (PageIsAllVisible(page) &&
-		visibilitymap_clear(relation, block, vmbuffer,
-							VISIBILITYMAP_ALL_FROZEN))
-		cleared_all_frozen = true;
-
+	if (PageIsAllVisible(page))
+	{
+		cleared_all_frozen = visibilitymap_clear(relation, block, vmbuffer,
+							VISIBILITYMAP_ALL_FROZEN) & VISIBILITYMAP_ALL_FROZEN;
+	}
 
 	MarkBufferDirty(*buffer);
 
@@ -5512,10 +5513,11 @@ l4:
 								  xid, mode, false,
 								  &new_xmax, &new_infomask, &new_infomask2);
 
-		if (PageIsAllVisible(BufferGetPage(buf)) &&
-			visibilitymap_clear(rel, block, vmbuffer,
-								VISIBILITYMAP_ALL_FROZEN))
-			cleared_all_frozen = true;
+		if (PageIsAllVisible(BufferGetPage(buf)))
+		{
+			cleared_all_frozen = visibilitymap_clear(rel, block, vmbuffer,
+								VISIBILITYMAP_ALL_FROZEN) & VISIBILITYMAP_ALL_FROZEN;
+		}
 
 		START_CRIT_SECTION();
 
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 2e18cd88bcf..5586b727fde 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -134,9 +134,9 @@ static Buffer vm_extend(Relation rel, BlockNumber vm_nblocks);
  *
  * You must pass a buffer containing the correct map page to this function.
  * Call visibilitymap_pin first to pin the right one. This function doesn't do
- * any I/O.  Returns true if any bits have been cleared and false otherwise.
+ * any I/O.  Returns the visibility map status before clearing the bits.
  */
-bool
+uint8
 visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer vmbuf, uint8 flags)
 {
 	BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk);
@@ -144,7 +144,7 @@ visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer vmbuf, uint8 flags
 	int			mapOffset = HEAPBLK_TO_OFFSET(heapBlk);
 	uint8		mask = flags << mapOffset;
 	char	   *map;
-	bool		cleared = false;
+	uint8		status;
 
 	/* Must never clear all_visible bit while leaving all_frozen bit set */
 	Assert(flags & VISIBILITYMAP_VALID_BITS);
@@ -160,17 +160,18 @@ visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer vmbuf, uint8 flags
 	LockBuffer(vmbuf, BUFFER_LOCK_EXCLUSIVE);
 	map = PageGetContents(BufferGetPage(vmbuf));
 
+	status = ((map[mapByte] >> mapOffset) & VISIBILITYMAP_VALID_BITS);
+
 	if (map[mapByte] & mask)
 	{
 		map[mapByte] &= ~mask;
 
 		MarkBufferDirty(vmbuf);
-		cleared = true;
 	}
 
 	LockBuffer(vmbuf, BUFFER_LOCK_UNLOCK);
 
-	return cleared;
+	return status;
 }
 
 /*
@@ -240,9 +241,9 @@ visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf)
  *
  * You must pass a buffer containing the correct map page to this function.
  * Call visibilitymap_pin first to pin the right one. This function doesn't do
- * any I/O.
+ * any I/O. Returns the visibility map status before setting the bits.
  */
-void
+uint8
 visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
 				  XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid,
 				  uint8 flags)
@@ -252,6 +253,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
 	uint8		mapOffset = HEAPBLK_TO_OFFSET(heapBlk);
 	Page		page;
 	uint8	   *map;
+	uint8		status;
 
 #ifdef TRACE_VISIBILITYMAP
 	elog(DEBUG1, "vm_set %s %d", RelationGetRelationName(rel), heapBlk);
@@ -276,6 +278,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
 	map = (uint8 *) PageGetContents(page);
 	LockBuffer(vmBuf, BUFFER_LOCK_EXCLUSIVE);
 
+	status = ((map[mapByte] >> mapOffset) & VISIBILITYMAP_VALID_BITS);
 	if (flags != (map[mapByte] >> mapOffset & VISIBILITYMAP_VALID_BITS))
 	{
 		START_CRIT_SECTION();
@@ -313,6 +316,7 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
 	}
 
 	LockBuffer(vmBuf, BUFFER_LOCK_UNLOCK);
+	return status;
 }
 
 /*
diff --git a/src/include/access/visibilitymap.h b/src/include/access/visibilitymap.h
index daaa01a2578..29608d4a7ab 100644
--- a/src/include/access/visibilitymap.h
+++ b/src/include/access/visibilitymap.h
@@ -26,14 +26,14 @@
 #define VM_ALL_FROZEN(r, b, v) \
 	((visibilitymap_get_status((r), (b), (v)) & VISIBILITYMAP_ALL_FROZEN) != 0)
 
-extern bool visibilitymap_clear(Relation rel, BlockNumber heapBlk,
-								Buffer vmbuf, uint8 flags);
+extern uint8 visibilitymap_clear(Relation rel, BlockNumber heapBlk,
+								 Buffer vmbuf, uint8 flags);
 extern void visibilitymap_pin(Relation rel, BlockNumber heapBlk,
 							  Buffer *vmbuf);
 extern bool visibilitymap_pin_ok(BlockNumber heapBlk, Buffer vmbuf);
-extern void visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
-							  XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid,
-							  uint8 flags);
+extern uint8 visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
+							   XLogRecPtr recptr, Buffer vmBuf, TransactionId cutoff_xid,
+							   uint8 flags);
 extern uint8 visibilitymap_get_status(Relation rel, BlockNumber heapBlk, Buffer *vmbuf);
 extern void visibilitymap_count(Relation rel, BlockNumber *all_visible, BlockNumber *all_frozen);
 extern BlockNumber visibilitymap_prepare_truncate(Relation rel,
-- 
2.37.2

From 6090a0a34d001a0a5238676aa935d20fd6d2365c Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 27 Dec 2023 16:32:40 -0500
Subject: [PATCH v3 04/12] Bgwriter maintains global LSNTimeline

Insert new LSN, time pairs to the global LSNTimeline stored in
PgStat_WalStats in the background writer's main loop. This ensures that
new values are added to the timeline in a regular manner.
---
 src/backend/postmaster/bgwriter.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index d02dc17b9c1..9a2ef869b53 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -277,6 +277,7 @@ BackgroundWriterMain(void)
 		{
 			TimestampTz timeout = 0;
 			TimestampTz now = GetCurrentTimestamp();
+			XLogRecPtr	current_lsn = GetLastImportantRecPtr();
 
 			timeout = TimestampTzPlusMilliseconds(last_snapshot_ts,
 												  LOG_SNAPSHOT_INTERVAL_MS);
@@ -289,10 +290,11 @@ BackgroundWriterMain(void)
 			 * the end of the record.
 			 */
 			if (now >= timeout &&
-				last_snapshot_lsn <= GetLastImportantRecPtr())
+				last_snapshot_lsn <= current_lsn)
 			{
 				last_snapshot_lsn = LogStandbySnapshot();
 				last_snapshot_ts = now;
+				pgstat_wal_update_lsntimeline(now, current_lsn);
 			}
 		}
 
-- 
2.37.2

From af55fbdbd91c71e643e4a5c6b1ea86a3b304a95d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Tue, 5 Dec 2023 07:29:39 -0500
Subject: [PATCH v3 01/12] Record LSN at postmaster startup

The insert_lsn at postmaster startup can be used along with PgStartTime
as seed values for a timeline mapping LSNs to time. Future commits will
add such a structure for LSN <-> time conversions. A start LSN allows
for such conversions before even inserting a value into the timeline.
The current time and current insert LSN can be used along with
PgStartTime and PgStartLSN.

This is WIP, as I'm not sure if I did this in the right place.
---
 src/backend/access/transam/xlog.c   | 2 ++
 src/backend/postmaster/postmaster.c | 1 +
 src/include/utils/builtins.h        | 3 +++
 3 files changed, 6 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1264849883b..aa71e502e4b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -146,6 +146,8 @@ bool		XLOG_DEBUG = false;
 
 int			wal_segment_size = DEFAULT_XLOG_SEG_SIZE;
 
+XLogRecPtr	PgStartLSN = InvalidXLogRecPtr;
+
 /*
  * Number of WAL insertion locks to use. A higher value allows more insertions
  * to happen concurrently, but adds some CPU overhead to flushing the WAL,
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index fb04e4dde31..d0a6a7106af 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1448,6 +1448,7 @@ PostmasterMain(int argc, char *argv[])
 	 * Remember postmaster startup time
 	 */
 	PgStartTime = GetCurrentTimestamp();
+	PgStartLSN = GetXLogInsertRecPtr();
 
 	/*
 	 * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 2f8b46d6da3..0cb24e10e61 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -17,6 +17,7 @@
 #include "fmgr.h"
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
+#include "access/xlogdefs.h"
 
 /* Sign + the most decimal digits an 8-byte number could have */
 #define MAXINT8LEN 20
@@ -82,6 +83,8 @@ extern void generate_operator_clause(fmStringInfo buf,
 									 Oid opoid,
 									 const char *rightop, Oid rightoptype);
 
+extern PGDLLIMPORT XLogRecPtr PgStartLSN;
+
 /* varchar.c */
 extern int	bpchartruelen(char *s, int len);
 
-- 
2.37.2

From cd535fd571cf56a459d7689d7832c1c47d02df16 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Fri, 8 Dec 2023 14:48:54 -0500
Subject: [PATCH v3 08/12] Count table modification VM clears

When a page formerly marked all visible in the visibility map is
modified, check if it remained unmodified for at least
target_freeze_duration. If it was modified sooner than
target_freeze_duration, it is an early unset. We want to keep track of
the number and age of pages which are modified before
target_freeze_duration so that vacuum can predict whether or not a page
it is considering opportunistically freezing is likely to be modified
before target_freeze_duration has elapsed.

This commit adds only the tracking of unsets upon page modification, not
the use of these stats by vacuum. The target_freeze_duration is
specified in seconds and must be translated to LSNs. This is done at the
beginning of each table vacuum. The translated target_freeze_duration in
LSNs is cached in the the table-level stats. Since the LSN consumption
rate is global, there is no reason to store the translated value in
table-level stats. It should be moved somewhere else.
---
 src/backend/access/heap/heapam.c             |  86 +++++++---
 src/backend/access/heap/vacuumlazy.c         |   2 +
 src/backend/utils/activity/pgstat_relation.c | 159 +++++++++++++++++++
 src/include/pgstat.h                         |  50 ++++++
 src/tools/pgindent/typedefs.list             |   1 +
 5 files changed, 277 insertions(+), 21 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index fa21a5a09a0..f50379b96ca 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2532,6 +2532,9 @@ heap_delete(Relation relation, ItemPointer tid,
 	bool		have_tuple_lock = false;
 	bool		iscombo;
 	bool		all_visible_cleared = false;
+	uint8		old_vmbits = 0;
+	XLogRecPtr	insert_lsn = InvalidXLogRecPtr;
+	XLogRecPtr	page_lsn = InvalidXLogRecPtr;
 	HeapTuple	old_key_tuple = NULL;	/* replica identity of the tuple */
 	bool		old_key_copied = false;
 
@@ -2793,8 +2796,8 @@ l1:
 	{
 		all_visible_cleared = true;
 		PageClearAllVisible(page);
-		visibilitymap_clear(relation, BufferGetBlockNumber(buffer),
-							vmbuffer, VISIBILITYMAP_VALID_BITS);
+		old_vmbits = visibilitymap_clear(relation, BufferGetBlockNumber(buffer),
+										 vmbuffer, VISIBILITYMAP_VALID_BITS);
 	}
 
 	/* store transaction information of xact deleting the tuple */
@@ -2875,7 +2878,8 @@ l1:
 		/* filtering by origin on a row level is much more efficient */
 		XLogSetRecordFlags(XLOG_INCLUDE_ORIGIN);
 
-		recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_DELETE);
+		insert_lsn = recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_DELETE);
+		page_lsn = PageGetLSN(page);
 
 		PageSetLSN(page, recptr);
 	}
@@ -2920,6 +2924,9 @@ l1:
 
 	pgstat_count_heap_delete(relation);
 
+	if (old_vmbits & VISIBILITYMAP_ALL_VISIBLE)
+		pgstat_count_vm_unset(relation, page_lsn, insert_lsn, old_vmbits);
+
 	if (old_key_tuple != NULL && old_key_copied)
 		heap_freetuple(old_key_tuple);
 
@@ -3026,6 +3033,12 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 				infomask_new_tuple,
 				infomask2_new_tuple;
 
+	uint8		old_page_old_vmbits = 0;
+	uint8		new_page_old_vmbits = 0;
+	XLogRecPtr	old_page_lsn = InvalidXLogRecPtr;
+	XLogRecPtr	new_page_lsn = InvalidXLogRecPtr;
+	XLogRecPtr	insert_lsn = InvalidXLogRecPtr;
+
 	Assert(ItemPointerIsValid(otid));
 
 	/* Cheap, simplistic check that the tuple matches the rel's rowtype. */
@@ -3563,8 +3576,9 @@ l2:
 		 */
 		if (PageIsAllVisible(page))
 		{
-			cleared_all_frozen = visibilitymap_clear(relation, block, vmbuffer,
-								VISIBILITYMAP_ALL_FROZEN) & VISIBILITYMAP_ALL_FROZEN;
+			old_page_old_vmbits = visibilitymap_clear(relation, block, vmbuffer,
+													  VISIBILITYMAP_ALL_FROZEN);
+			cleared_all_frozen = old_page_old_vmbits & VISIBILITYMAP_ALL_FROZEN;
 		}
 
 		MarkBufferDirty(buffer);
@@ -3584,7 +3598,8 @@ l2:
 			xlrec.flags =
 				cleared_all_frozen ? XLH_LOCK_ALL_FROZEN_CLEARED : 0;
 			XLogRegisterData((char *) &xlrec, SizeOfHeapLock);
-			recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_LOCK);
+			insert_lsn = recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_LOCK);
+			old_page_lsn = PageGetLSN(page);
 			PageSetLSN(page, recptr);
 		}
 
@@ -3793,15 +3808,15 @@ l2:
 	{
 		all_visible_cleared = true;
 		PageClearAllVisible(BufferGetPage(buffer));
-		visibilitymap_clear(relation, BufferGetBlockNumber(buffer),
-							vmbuffer, VISIBILITYMAP_VALID_BITS);
+		old_page_old_vmbits = visibilitymap_clear(relation, BufferGetBlockNumber(buffer),
+												  vmbuffer, VISIBILITYMAP_VALID_BITS);
 	}
 	if (newbuf != buffer && PageIsAllVisible(BufferGetPage(newbuf)))
 	{
 		all_visible_cleared_new = true;
 		PageClearAllVisible(BufferGetPage(newbuf));
-		visibilitymap_clear(relation, BufferGetBlockNumber(newbuf),
-							vmbuffer_new, VISIBILITYMAP_VALID_BITS);
+		new_page_old_vmbits = visibilitymap_clear(relation, BufferGetBlockNumber(newbuf),
+												  vmbuffer_new, VISIBILITYMAP_VALID_BITS);
 	}
 
 	if (newbuf != buffer)
@@ -3823,15 +3838,18 @@ l2:
 			log_heap_new_cid(relation, heaptup);
 		}
 
-		recptr = log_heap_update(relation, buffer,
-								 newbuf, &oldtup, heaptup,
-								 old_key_tuple,
-								 all_visible_cleared,
-								 all_visible_cleared_new);
+		insert_lsn = recptr = log_heap_update(relation, buffer,
+											  newbuf, &oldtup, heaptup,
+											  old_key_tuple,
+											  all_visible_cleared,
+											  all_visible_cleared_new);
 		if (newbuf != buffer)
 		{
+			new_page_lsn = PageGetLSN(BufferGetPage(newbuf));
 			PageSetLSN(BufferGetPage(newbuf), recptr);
 		}
+
+		old_page_lsn = PageGetLSN(BufferGetPage(buffer));
 		PageSetLSN(BufferGetPage(buffer), recptr);
 	}
 
@@ -3866,6 +3884,15 @@ l2:
 	if (have_tuple_lock)
 		UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode);
 
+	if (old_page_old_vmbits & VISIBILITYMAP_ALL_VISIBLE)
+		pgstat_count_vm_unset(relation, old_page_lsn,
+							  insert_lsn, old_page_old_vmbits);
+
+	/* MTODO: figure out if we whether or not to count unfreezing new page */
+	if (newbuf != buffer &&
+		new_page_old_vmbits & VISIBILITYMAP_ALL_VISIBLE)
+		pgstat_count_vm_unset(relation, new_page_lsn, insert_lsn, new_page_old_vmbits);
+
 	pgstat_count_heap_update(relation, use_hot_update, newbuf != buffer);
 
 	/*
@@ -4162,6 +4189,10 @@ heap_lock_tuple(Relation relation, HeapTuple tuple,
 	bool		skip_tuple_lock = false;
 	bool		have_tuple_lock = false;
 	bool		cleared_all_frozen = false;
+	XLogRecPtr	insert_lsn = InvalidXLogRecPtr;
+	XLogRecPtr	page_lsn = InvalidXLogRecPtr;
+	uint8		old_vmbits = 0;
+
 
 	*buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));
 	block = ItemPointerGetBlockNumber(tid);
@@ -4761,8 +4792,9 @@ failed:
 	/* Clear only the all-frozen bit on visibility map if needed */
 	if (PageIsAllVisible(page))
 	{
-		cleared_all_frozen = visibilitymap_clear(relation, block, vmbuffer,
-							VISIBILITYMAP_ALL_FROZEN) & VISIBILITYMAP_ALL_FROZEN;
+		old_vmbits = visibilitymap_clear(relation, block, vmbuffer,
+										 VISIBILITYMAP_ALL_FROZEN);
+		cleared_all_frozen = old_vmbits & VISIBILITYMAP_ALL_FROZEN;
 	}
 
 	MarkBufferDirty(*buffer);
@@ -4796,7 +4828,8 @@ failed:
 
 		/* we don't decode row locks atm, so no need to log the origin */
 
-		recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_LOCK);
+		insert_lsn = recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_LOCK);
+		page_lsn = PageGetLSN(page);
 
 		PageSetLSN(page, recptr);
 	}
@@ -4824,6 +4857,9 @@ out_unlocked:
 	if (have_tuple_lock)
 		UnlockTupleTuplock(relation, tid, mode);
 
+	if (old_vmbits & VISIBILITYMAP_ALL_VISIBLE)
+		pgstat_count_vm_unset(relation, page_lsn, insert_lsn, old_vmbits);
+
 	return result;
 }
 
@@ -5276,6 +5312,9 @@ heap_lock_updated_tuple_rec(Relation rel, ItemPointer tid, TransactionId xid,
 				new_xmax;
 	TransactionId priorXmax = InvalidTransactionId;
 	bool		cleared_all_frozen = false;
+	uint8		old_vmbits = 0;
+	XLogRecPtr	page_lsn = InvalidXLogRecPtr;
+	XLogRecPtr	insert_lsn = InvalidXLogRecPtr;
 	bool		pinned_desired_page;
 	Buffer		vmbuffer = InvalidBuffer;
 	BlockNumber block;
@@ -5515,8 +5554,9 @@ l4:
 
 		if (PageIsAllVisible(BufferGetPage(buf)))
 		{
-			cleared_all_frozen = visibilitymap_clear(rel, block, vmbuffer,
-								VISIBILITYMAP_ALL_FROZEN) & VISIBILITYMAP_ALL_FROZEN;
+			old_vmbits = visibilitymap_clear(rel, block, vmbuffer,
+											 VISIBILITYMAP_ALL_FROZEN);
+			cleared_all_frozen = old_vmbits & VISIBILITYMAP_ALL_FROZEN;
 		}
 
 		START_CRIT_SECTION();
@@ -5548,7 +5588,8 @@ l4:
 
 			XLogRegisterData((char *) &xlrec, SizeOfHeapLockUpdated);
 
-			recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED);
+			insert_lsn = recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_LOCK_UPDATED);
+			page_lsn = PageGetLSN(page);
 
 			PageSetLSN(page, recptr);
 		}
@@ -5581,6 +5622,9 @@ out_unlocked:
 	if (vmbuffer != InvalidBuffer)
 		ReleaseBuffer(vmbuffer);
 
+	if (old_vmbits & VISIBILITYMAP_ALL_VISIBLE)
+		pgstat_count_vm_unset(rel, page_lsn, insert_lsn, old_vmbits);
+
 	return result;
 }
 
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3b9299b8924..c788e7e5ca4 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -489,6 +489,8 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 							vacrel->relname)));
 	}
 
+	pgstat_refresh_frz_dur(RelationGetRelid(rel), rel->rd_rel->relisshared);
+
 	/*
 	 * Allocate dead_items array memory using dead_items_alloc.  This handles
 	 * parallel VACUUM initialization as part of allocating shared memory
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index f5d726e2921..5048112408b 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -18,6 +18,7 @@
 #include "postgres.h"
 
 #include "access/twophase_rmgr.h"
+#include "access/visibilitymapdefs.h"
 #include "access/xact.h"
 #include "catalog/partition.h"
 #include "postmaster/autovacuum.h"
@@ -205,6 +206,162 @@ pgstat_drop_relation(Relation rel)
 	}
 }
 
+
+/*
+ * The first time a page is modified after having been set all visible, we
+ * check the duration it was unmodified against the target_freeze_duration. The
+ * page has only an LSN, not a timestmap, so we must translate the page LSN to
+ * time using the LSNTimeline. Because the LSN consumption rate can change, we
+ * want to refresh this translated value periodically. Doing so at the start of
+ * each table vacuum is convenient.
+ */
+void
+pgstat_refresh_frz_dur(Oid tableoid, bool shared)
+{
+	PgStat_EntryRef *entry_ref;
+	PgStat_StatTabEntry *tabentry;
+	TimestampTz cur_time;
+	XLogRecPtr	cur_lsn;
+	TimestampTz target_time;
+	XLogRecPtr	target_lsn;
+	uint64		target_dur_usecs;
+	Oid			dboid = (shared ? InvalidOid : MyDatabaseId);
+
+	if (!pgstat_track_counts)
+		return;
+
+	target_dur_usecs = target_freeze_duration * USECS_PER_SEC;
+
+	cur_time = GetCurrentTimestamp();
+
+	/*
+	 * We can afford to acquire exact (not approximate) insert LSN at the
+	 * start of each relation vacuum. The translation of the GUC value to time
+	 * will be more accurate.
+	 */
+	cur_lsn = GetXLogInsertRecPtr();
+
+	/*
+	 * How long ago would a page have to have been set all visible for it to
+	 * qualify as having remained unmodified for target_freeze_duration. It
+	 * shouldn't happen that current time - target_freeze_duration is less
+	 * than zero, but TimestampTz is signed, so we better do this check.
+	 */
+	target_time = target_dur_usecs >= cur_time ? 0 : cur_time - target_dur_usecs;
+
+	/*
+	 * Use the global LSNTimeline stored in WAL statistics to translate the
+	 * target_time into an LSN based on our LSN consumption over that period.
+	 */
+	target_lsn = pgstat_wal_estimate_lsn_at_time(target_time);
+
+	/* Now get the table-level stats */
+	entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_RELATION,
+											dboid, tableoid, false);
+
+	Assert(entry_ref != NULL && entry_ref->shared_stats != NULL);
+
+	tabentry = &((PgStatShared_Relation *) entry_ref->shared_stats)->stats;
+
+	/*
+	 * Update the translated value of target_freeze_duration so that table
+	 * modifications use a fresh value when determining whether or not a page
+	 * was modified sooner than target_freeze_duration after having been set
+	 * all visible. There is no reason for this to be cached at the table
+	 * level, but it is easiest to keep it here for now.
+	 */
+	tabentry->target_frz_dur_lsns = cur_lsn - target_lsn;
+
+	pgstat_unlock_entry(entry_ref);
+
+	/*
+	 * MTODO: would like to flush table stats so future unsets use
+	 * target_frz_dur_lsns new value. However pgstat_report_stat() can't be
+	 * called here due to being in a transaction. Is there some other way to
+	 * do this?
+	 */
+
+	return;
+}
+
+
+/*
+ * Upon update, delete, or tuple lock, if the page being modified was
+ * previously set all visible in the visibility map, check and record whether
+ * or not the page has remained unmodified for longer than
+ * target_freeze_duration. Record both the page age and the page's former
+ * status in the VM. The distribution of ages can be used to predict whether or
+ * not a given page is likely to remain unmodified for longer than
+ * target_freeze_duration.
+ *
+ * Note that we do not count inserts as unsets even when they are modifying a
+ * formerly all visible page. This is because vacuum updates the freespace map
+ * after pruning, freezing, and reaping dead tuples. The next insert is likely
+ * to be to this page, so the page's age at insert is unrelated to the page
+ * modification pattern and will only reflect that vacuum made space available
+ * on the page.
+ */
+void
+pgstat_count_vm_unset(Relation relation, XLogRecPtr page_lsn,
+					  XLogRecPtr current_lsn, uint8 old_vmbits)
+{
+	PgStat_StatTabEntry *tabentry;
+	XLogRecPtr	target_frz_duration;
+	XLogRecPtr	vm_duration_lsns;
+	PgStat_TableCounts *tabcounts;
+
+	/*
+	 * Can't be all frozen without being all visible and we shouldn't call
+	 * this function if all bits were unset
+	 */
+	Assert(old_vmbits & VISIBILITYMAP_ALL_VISIBLE);
+
+	if (!pgstat_track_counts)
+		return;
+
+	tabentry = pgstat_fetch_stat_tabentry_ext(relation->rd_rel->relisshared,
+											  RelationGetRelid(relation));
+
+	/*
+	 * MTODO: Where can we cache this such that it is easy to get here but not
+	 * table-level?
+	 */
+	target_frz_duration = tabentry->target_frz_dur_lsns;
+
+	vm_duration_lsns = current_lsn - page_lsn;
+	vm_duration_lsns = Max(vm_duration_lsns, 0);
+
+	tabcounts = &relation->pgstat_info->counts;
+
+	tabcounts->unsets.unvis++;
+
+	/*
+	 * If the page is being modified before target_freeze_duration, count it
+	 * as an unset for vacuum freeze statistics. We want to determine the
+	 * likelihood that a page being vacuumed will be modified before that
+	 * amount of time has elapsed, irrespective of whether or not we got it
+	 * right last vacuum.
+	 */
+	if (vm_duration_lsns < target_frz_duration)
+		accumulator_insert(&tabcounts->unsets.early_unsets, vm_duration_lsns);
+
+	/*
+	 * If it was frozen and modified before the target duration, it is an
+	 * early unfreeze. If it was not frozen and remained unmodified for longer
+	 * than the target duration, it is a missed opportunity to freeze.
+	 */
+	if (old_vmbits & VISIBILITYMAP_ALL_FROZEN)
+	{
+		tabcounts->unsets.vm_unfreezes++;
+
+		if (vm_duration_lsns < target_frz_duration)
+			tabcounts->unsets.early_unfreezes++;
+	}
+	else if (vm_duration_lsns >= target_frz_duration)
+		tabcounts->unsets.missed_freezes++;
+}
+
+
 /*
  * Report that the table was just vacuumed and flush IO statistics.
  */
@@ -845,6 +1002,8 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 	tabentry->tuples_hot_updated += lstats->counts.tuples_hot_updated;
 	tabentry->tuples_newpage_updated += lstats->counts.tuples_newpage_updated;
 
+	pgstat_unset_absorb(&tabentry->vm_unset, &lstats->counts.unsets);
+
 	/*
 	 * If table was truncated/dropped, first reset the live/dead counters.
 	 */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index e5cceb0237b..a1945e6a5e8 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -202,6 +202,44 @@ accumulator_calculate(PgStat_Accumulator *accumulator, double *mean,
 	*stddev = sqrt((accumulator->q - pow(accumulator->s, 2) / accumulator->n) / accumulator->n);
 }
 
+typedef struct PgStat_VMUnset
+{
+	/* times a page marked frozen in the VM was modified */
+	int64		vm_unfreezes;
+	/* times a page was unfrozen before target_freeze_duration elapsed */
+	int64		early_unfreezes;
+	/* times a page marked all visible in the VM was modified */
+	int64		unvis;
+
+	/*
+	 * times a page only marked all visible and not all frozen in the VM
+	 * remained unmodified for longer than target_freeze_duration
+	 */
+	int64		missed_freezes;
+
+	/*
+	 * times that pages marked either all visible or all visible and all
+	 * frozen in the VM were modified before target_freeze_duration elapsed.
+	 * The accumulator tracks their ages as well as occurrences. We include
+	 * pages which were marked all visible but not all frozen because we care
+	 * about how long pages remain unmodified in general. If we only counted
+	 * the ages of early unfreezes, it would skew our data based on our own
+	 * failure to freeze the right pages.
+	 */
+	PgStat_Accumulator early_unsets;
+} PgStat_VMUnset;
+
+static inline void
+pgstat_unset_absorb(PgStat_VMUnset *target, PgStat_VMUnset *source)
+{
+	target->vm_unfreezes += source->vm_unfreezes;
+	target->early_unfreezes += source->early_unfreezes;
+	target->unvis += source->unvis;
+	target->missed_freezes += source->missed_freezes;
+
+	accumulator_absorb(&target->early_unsets, &source->early_unsets);
+}
+
 /* ----------
  * PgStat_TableCounts			The actual per-table counts kept by a backend
  *
@@ -236,6 +274,8 @@ typedef struct PgStat_TableCounts
 	PgStat_Counter tuples_newpage_updated;
 	bool		truncdropped;
 
+	PgStat_VMUnset unsets;
+
 	PgStat_Counter delta_live_tuples;
 	PgStat_Counter delta_dead_tuples;
 	PgStat_Counter changed_tuples;
@@ -491,6 +531,11 @@ typedef struct PgStat_StatTabEntry
 	PgStat_Counter analyze_count;
 	TimestampTz last_autoanalyze_time;	/* autovacuum initiated */
 	PgStat_Counter autoanalyze_count;
+
+	/* calculated at vac start and used upon unset */
+	XLogRecPtr	target_frz_dur_lsns;
+	/* updated upon VM unset */
+	PgStat_VMUnset vm_unset;
 } PgStat_StatTabEntry;
 
 /*
@@ -695,6 +740,11 @@ extern void pgstat_report_analyze(Relation rel,
 								  PgStat_Counter livetuples, PgStat_Counter deadtuples,
 								  bool resetcounter);
 
+extern void pgstat_refresh_frz_dur(Oid tableoid, bool shared);
+
+extern void pgstat_count_vm_unset(Relation relation, XLogRecPtr page_lsn,
+								  XLogRecPtr current_lsn, uint8 old_vmbits);
+
 /*
  * If stats are enabled, but pending data hasn't been prepared yet, call
  * pgstat_assoc_relation() to do so. See its comment for why this is done
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cf8086616b2..890a091d426 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2091,6 +2091,7 @@ PgStat_SubXactStatus
 PgStat_TableCounts
 PgStat_TableStatus
 PgStat_TableXactStatus
+PgStat_VMUnset
 PgStat_WalStats
 PgXmlErrorContext
 PgXmlStrictness
-- 
2.37.2

From b385acc271f88b0cd9fd901a82a211fca39ed224 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Tue, 5 Dec 2023 07:41:06 -0500
Subject: [PATCH v3 07/12] Add guc target_freeze_duration

Add target_freeze_duration, a guc specifying the minimum time in
seconds a page should stay frozen. This will be used to measure and
control vacuum's opportunistic page freezing behavior in future commits.
---
 src/backend/utils/init/globals.c              |  1 +
 src/backend/utils/misc/guc_tables.c           | 11 +++++++++++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/miscadmin.h                       |  1 +
 4 files changed, 14 insertions(+)

diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 60bc1217fb4..89bad737203 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -149,6 +149,7 @@ int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
 int			VacuumCostLimit = 200;
 double		VacuumCostDelay = 0;
+int			target_freeze_duration = 1;
 
 int64		VacuumPageHit = 0;
 int64		VacuumPageMiss = 0;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3945a92dddd..2c40e420e53 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2488,6 +2488,17 @@ struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"target_freeze_duration", PGC_USERSET, AUTOVACUUM,
+			gettext_noop("minimum amount of time in seconds that a page should stay frozen."),
+			NULL,
+			GUC_UNIT_S
+		},
+		&target_freeze_duration,
+		1, 0, 10000000,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"max_files_per_process", PGC_POSTMASTER, RESOURCES_KERNEL,
 			gettext_noop("Sets the maximum number of simultaneously open files for each server process."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b2809c711a1..9e8e0c1a9db 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -669,6 +669,7 @@
 #autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for
 					# autovacuum, -1 means use
 					# vacuum_cost_limit
+#target_freeze_duration = 1 # desired time for page to stay frozen in seconds
 
 
 #------------------------------------------------------------------------------
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 74bc2f97cbd..b76c2015376 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -279,6 +279,7 @@ extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
 extern PGDLLIMPORT int VacuumCostLimit;
 extern PGDLLIMPORT double VacuumCostDelay;
+extern PGDLLIMPORT int target_freeze_duration;
 
 extern PGDLLIMPORT int64 VacuumPageHit;
 extern PGDLLIMPORT int64 VacuumPageMiss;
-- 
2.37.2

From 7b676650bafc7062eac7240b32254562e666a428 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Fri, 8 Dec 2023 14:39:17 -0500
Subject: [PATCH v3 06/12] Add accumulator to calculate normal dist online

Add PgStat_Accumulator which tracks the information required to
calculate a mean and standard deviation online. This is useful for
storing stats whose values are normally distributed.
---
 src/include/pgstat.h             | 64 ++++++++++++++++++++++++++++++++
 src/tools/pgindent/typedefs.list |  1 +
 2 files changed, 65 insertions(+)

diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index dd914e606eb..e5cceb0237b 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -11,6 +11,8 @@
 #ifndef PGSTAT_H
 #define PGSTAT_H
 
+#include <math.h>
+
 #include "access/xlogdefs.h"
 #include "datatype/timestamp.h"
 #include "portability/instr_time.h"
@@ -138,6 +140,68 @@ typedef struct PgStat_BackendSubEntry
 	PgStat_Counter sync_error_count;
 } PgStat_BackendSubEntry;
 
+/*
+ * Used both in backend local and shared memory, this accumulator keeps track of
+ * the counters needed to calculate a mean and standard deviation online.
+ */
+typedef struct PgStat_Accumulator
+{
+	/* Number of values in this accumulator */
+	uint64		n;
+
+	/* Sum of values */
+	double		s;
+
+	/* Sum of squared values */
+	double		q;
+} PgStat_Accumulator;
+
+static inline void
+accumulator_insert(PgStat_Accumulator *accumulator, double v)
+{
+	accumulator->n++;
+	accumulator->s += v;
+	accumulator->q += pow(v, 2);
+}
+
+static inline double
+accumulator_remove(PgStat_Accumulator *accumulator)
+{
+	double		result;
+
+	Assert(accumulator->n > 0);
+
+	result = accumulator->s / accumulator->n;
+
+	accumulator->n--;
+	accumulator->s -= result;
+	accumulator->q -= pow(result, 2);
+
+	return result;
+}
+
+static inline void
+accumulator_absorb(PgStat_Accumulator *target, PgStat_Accumulator *source)
+{
+	target->n += source->n;
+	target->s += source->s;
+	target->q += source->q;
+}
+
+static inline void
+accumulator_calculate(PgStat_Accumulator *accumulator, double *mean,
+					  double *stddev)
+{
+	*mean = NAN;
+	*stddev = INFINITY;
+
+	if (accumulator->n == 0)
+		return;
+
+	*mean = accumulator->s / accumulator->n;
+	*stddev = sqrt((accumulator->q - pow(accumulator->s, 2) / accumulator->n) / accumulator->n);
+}
+
 /* ----------
  * PgStat_TableCounts			The actual per-table counts kept by a backend
  *
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 49014d0367e..cf8086616b2 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2058,6 +2058,7 @@ PgStatShared_ReplSlot
 PgStatShared_SLRU
 PgStatShared_Subscription
 PgStatShared_Wal
+PgStat_Accumulator
 PgStat_ArchiverStats
 PgStat_BackendSubEntry
 PgStat_BgWriterStats
-- 
2.37.2

From 300d8bc3df9354cf7cbd29c47988d413856ae398 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Fri, 8 Dec 2023 16:32:46 -0500
Subject: [PATCH v3 09/12] Opportunistically freeze pages unlikely to be
 modified

While vacuuming, we will opportunistically freeze a page if its age
means it is unlikely to be modified in the future before
target_freeze_duration.

Each time an all-visible page is modified, i.e. its all-visible bit is
unset, and that modification is considered "early", the duration (in
LSNs) that that page spent all-visible is entered into the early unsets
accumulator. At the beginning of vacuuming a relation, we will extract
the mean and standard deviation from this accumulated data. We then
calculate the youngest a page can be and still have a 95% likelihood of
remaining unmodified for target_freeze_duration. Pages older than this
are frozen by vacuum.

This commit includes a guc, opp_freeze_algo, which is for development
only and lets us compare different heuristics.
---
 src/backend/access/heap/vacuumlazy.c          | 58 ++++++++++++++--
 src/backend/utils/activity/pgstat_relation.c  | 68 +++++++++++++++++--
 src/backend/utils/init/globals.c              |  1 +
 src/backend/utils/misc/guc_tables.c           | 10 +++
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/miscadmin.h                       |  3 +
 src/include/pgstat.h                          |  2 +-
 7 files changed, 134 insertions(+), 9 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c788e7e5ca4..6bd64b15996 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -210,6 +210,14 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/*
+	 * The youngest page we predict will stay unmodified for
+	 * target_freeze_duration. We will not opportunistically freeze pages
+	 * younger than this threshold. This is calculated at the beginning of
+	 * vacuuming a relation.
+	 */
+	XLogRecPtr	frz_threshold_min;
 } LVRelState;
 
 /*
@@ -250,6 +258,7 @@ static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   bool sharelock, Buffer vmbuffer);
 static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
+							XLogRecPtr page_lsn,
 							LVPagePruneState *prunestate);
 static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 							  BlockNumber blkno, Page page,
@@ -287,6 +296,9 @@ static void update_vacuum_error_info(LVRelState *vacrel,
 static void restore_vacuum_error_info(LVRelState *vacrel,
 									  const LVSavedErrInfo *saved_vacrel);
 
+static bool vacuum_opp_freeze(LVRelState *vacrel, XLogRecPtr page_lsn,
+							  bool all_visible_all_frozen,
+							  bool prune_emitted_fpi);
 
 /*
  *	heap_vacuum_rel() -- perform VACUUM for one heap relation
@@ -489,7 +501,8 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 							vacrel->relname)));
 	}
 
-	pgstat_refresh_frz_dur(RelationGetRelid(rel), rel->rd_rel->relisshared);
+	vacrel->frz_threshold_min = pgstat_refresh_frz_stats(RelationGetRelid(rel),
+														 rel->rd_rel->relisshared);
 
 	/*
 	 * Allocate dead_items array memory using dead_items_alloc.  This handles
@@ -855,6 +868,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	{
 		Buffer		buf;
 		Page		page;
+		XLogRecPtr	page_lsn = InvalidXLogRecPtr;
 		bool		all_visible_according_to_vm;
 		LVPagePruneState prunestate;
 
@@ -959,6 +973,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
 								 vacrel->bstrategy);
 		page = BufferGetPage(buf);
+		page_lsn = PageGetLSN(page);
 		if (!ConditionalLockBufferForCleanup(buf))
 		{
 			bool		hastup,
@@ -1021,7 +1036,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * were pruned some time earlier.  Also considers freezing XIDs in the
 		 * tuple headers of remaining items with storage.
 		 */
-		lazy_scan_prune(vacrel, buf, blkno, page, &prunestate);
+		lazy_scan_prune(vacrel, buf, blkno, page, page_lsn, &prunestate);
 
 		Assert(!prunestate.all_visible || !prunestate.has_lpdead_items);
 
@@ -1545,6 +1560,7 @@ lazy_scan_prune(LVRelState *vacrel,
 				Buffer buf,
 				BlockNumber blkno,
 				Page page,
+				XLogRecPtr page_lsn,
 				LVPagePruneState *prunestate)
 {
 	Relation	rel = vacrel->rel;
@@ -1786,8 +1802,9 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * page all-frozen afterwards (might not happen until final heap pass).
 	 */
 	if (pagefrz.freeze_required || tuples_frozen == 0 ||
-		(prunestate->all_visible && prunestate->all_frozen &&
-		 fpi_before != pgWalUsage.wal_fpi))
+		vacuum_opp_freeze(vacrel, page_lsn,
+						  prunestate->all_visible && prunestate->all_frozen,
+						  fpi_before != pgWalUsage.wal_fpi))
 	{
 		/*
 		 * We're freezing the page.  Our final NewRelfrozenXid doesn't need to
@@ -3509,3 +3526,36 @@ restore_vacuum_error_info(LVRelState *vacrel,
 	vacrel->offnum = saved_vacrel->offnum;
 	vacrel->phase = saved_vacrel->phase;
 }
+
+/*
+ * Determine whether or not vacuum should opportunistically freeze a page.
+ * Given freeze statistics about the relation contained in LVRelState, whether
+ * or not the page will be able to be marked all visible and all frozen, and
+ * whether or not pruning emitted an FPI, return whether or not the page should
+ * be frozen. The LVRelState should not be modified.
+ */
+static bool
+vacuum_opp_freeze(LVRelState *vacrel, XLogRecPtr page_lsn,
+				  bool all_visible_all_frozen,
+				  bool prune_emitted_fpi)
+{
+	int64		page_age;
+
+	if (!all_visible_all_frozen)
+		return false;
+
+	page_age = GetInsertRecPtr() - page_lsn;
+	page_age = Max(page_age, 0);
+
+	if (opp_freeze_algo == 0)
+		return prune_emitted_fpi;
+
+	if (opp_freeze_algo == 4)
+	{
+		if (vacrel->frz_threshold_min == InvalidXLogRecPtr)
+			return true;
+		return page_age > vacrel->frz_threshold_min;
+	}
+
+	return false;
+}
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index 5048112408b..ef137466932 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -44,6 +44,16 @@ typedef struct TwoPhasePgStatRecord
 	bool		truncdropped;	/* was the relation truncated/dropped? */
 } TwoPhasePgStatRecord;
 
+/*
+ * The Z-score used to calculate the freeze threshold from the distribution of
+ * early unsets. See:
+ *
+ * https://en.wikipedia.org/wiki/Standard_normal_table#Cumulative_(less_than_Z)
+ *
+ * This Z-score has a cumulative probability (from negative infinity) of
+ * approximately 0.94950, or 94.950%.
+ */
+static const double FRZ_THRESHOLD_ZSCORE = 1.64;
 
 static PgStat_TableStatus *pgstat_prep_relation_pending(Oid rel_id, bool isshared);
 static void add_tabstat_xact_level(PgStat_TableStatus *pgstat_info, int nest_level);
@@ -215,8 +225,8 @@ pgstat_drop_relation(Relation rel)
  * want to refresh this translated value periodically. Doing so at the start of
  * each table vacuum is convenient.
  */
-void
-pgstat_refresh_frz_dur(Oid tableoid, bool shared)
+XLogRecPtr
+pgstat_refresh_frz_stats(Oid tableoid, bool shared)
 {
 	PgStat_EntryRef *entry_ref;
 	PgStat_StatTabEntry *tabentry;
@@ -225,10 +235,14 @@ pgstat_refresh_frz_dur(Oid tableoid, bool shared)
 	TimestampTz target_time;
 	XLogRecPtr	target_lsn;
 	uint64		target_dur_usecs;
+	double		mean;
+	double		stddev;
+	double		n;
 	Oid			dboid = (shared ? InvalidOid : MyDatabaseId);
+	XLogRecPtr	frz_threshold_min;
 
 	if (!pgstat_track_counts)
-		return;
+		return InvalidXLogRecPtr;
 
 	target_dur_usecs = target_freeze_duration * USECS_PER_SEC;
 
@@ -272,6 +286,52 @@ pgstat_refresh_frz_dur(Oid tableoid, bool shared)
 	 */
 	tabentry->target_frz_dur_lsns = cur_lsn - target_lsn;
 
+	/*
+	 * Calculate the mean and standard deviation of the distribution of early
+	 * unsets.
+	 *
+	 * Each time an all-visible page is modified, i.e. its all-visible bit is
+	 * unset, and that modification is considered "early", the duration (in
+	 * LSNs) that that page spent all-visible is entered into the early unsets
+	 * accumulator. Here, the data collected in that accumulator is extracted
+	 * into the parameters of a normal distribution (mean and standard
+	 * deviation).
+	 */
+	accumulator_calculate(&tabentry->vm_unset.early_unsets, &mean, &stddev);
+
+	/*
+	 * Calculate the age of the youngest page that should be opportunistically
+	 * frozen.
+	 *
+	 * We'll opportunistically freeze a page if the probability that it will
+	 * be early unset is less than approximately 5%. This threshold occurs
+	 * when the cumulative distribution function of the early unsets
+	 * distribution exceeds 95%. We assume that if a page has survived past
+	 * the age when 95% of early unsets have occurred, then it's safe to
+	 * freeze.
+	 *
+	 * If we couldn't produce a distribution from the accumulator, or the
+	 * standard deviation of that distribution is infinite, then err on the
+	 * side of freezing everything.
+	 */
+	n = mean + FRZ_THRESHOLD_ZSCORE * stddev;
+	if (isnan(n) || isinf(n))
+		frz_threshold_min = InvalidXLogRecPtr;
+	else
+		frz_threshold_min = n;
+
+	/*
+	 * If the number of entries in the accumulator is small, then the mean and
+	 * standard deviation extracted from it may be unreliable. We can probably
+	 * devise a way to represent low confidence using a modifier. For example,
+	 * we could skew the mean and standard deviation to favor more freezing
+	 * (perhaps using standard error). The internet says that a sample size >=
+	 * 30ish is required for the central limit theorem to hold. So, before we
+	 * have 30 unsets, just freeze everything on the given vacuum.
+	 */
+	if (tabentry->vm_unset.early_unsets.n < 30)
+		frz_threshold_min = InvalidXLogRecPtr;
+
 	pgstat_unlock_entry(entry_ref);
 
 	/*
@@ -281,7 +341,7 @@ pgstat_refresh_frz_dur(Oid tableoid, bool shared)
 	 * do this?
 	 */
 
-	return;
+	return frz_threshold_min;
 }
 
 
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 89bad737203..0f5e5077da6 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -149,6 +149,7 @@ int			VacuumCostPageMiss = 2;
 int			VacuumCostPageDirty = 20;
 int			VacuumCostLimit = 200;
 double		VacuumCostDelay = 0;
+int			opp_freeze_algo = 0;
 int			target_freeze_duration = 1;
 
 int64		VacuumPageHit = 0;
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 2c40e420e53..4e6825a7afd 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2488,6 +2488,16 @@ struct config_int ConfigureNamesInt[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"opp_freeze_algo", PGC_USERSET, AUTOVACUUM,
+			gettext_noop("algorithm used to determine whether or not to freeze a page during vacuum"),
+			NULL
+		},
+		&opp_freeze_algo,
+		0, 0, 10000,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"target_freeze_duration", PGC_USERSET, AUTOVACUUM,
 			gettext_noop("minimum amount of time in seconds that a page should stay frozen."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9e8e0c1a9db..cd6684a2912 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -669,6 +669,7 @@
 #autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for
 					# autovacuum, -1 means use
 					# vacuum_cost_limit
+#opp_freeze_algo = 0 # default opp_freeze_algo is 0 which means master
 #target_freeze_duration = 1 # desired time for page to stay frozen in seconds
 
 
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index b76c2015376..406ea16a4f1 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -279,6 +279,9 @@ extern PGDLLIMPORT int VacuumCostPageMiss;
 extern PGDLLIMPORT int VacuumCostPageDirty;
 extern PGDLLIMPORT int VacuumCostLimit;
 extern PGDLLIMPORT double VacuumCostDelay;
+
+/* opp_freeze_algo is only used for development */
+extern PGDLLIMPORT int opp_freeze_algo;
 extern PGDLLIMPORT int target_freeze_duration;
 
 extern PGDLLIMPORT int64 VacuumPageHit;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index a1945e6a5e8..6f959d9cf74 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -740,7 +740,7 @@ extern void pgstat_report_analyze(Relation rel,
 								  PgStat_Counter livetuples, PgStat_Counter deadtuples,
 								  bool resetcounter);
 
-extern void pgstat_refresh_frz_dur(Oid tableoid, bool shared);
+extern XLogRecPtr pgstat_refresh_frz_stats(Oid tableoid, bool shared);
 
 extern void pgstat_count_vm_unset(Relation relation, XLogRecPtr page_lsn,
 								  XLogRecPtr current_lsn, uint8 old_vmbits);
-- 
2.37.2

From 9881f84e67e6e5009ec959da2359e94211c5732d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Fri, 8 Dec 2023 16:30:59 -0500
Subject: [PATCH v3 10/12] Track VM sets by vacuum

Earlier commits adding tracking for when the all visible and all frozen
bits were cleared in the visibility map. Add tracking for when they are
set by vacuum so that we can observe how many frozen pages are unfrozen.
---
 src/backend/access/heap/vacuumlazy.c         | 71 ++++++++++++++++++--
 src/backend/utils/activity/pgstat_relation.c | 26 +++++++
 src/include/pgstat.h                         | 19 ++++++
 src/tools/pgindent/typedefs.list             |  1 +
 4 files changed, 111 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 6bd64b15996..c37d57d154c 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -211,6 +211,9 @@ typedef struct LVRelState
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
 
+	/* VM updates by vacuum. Used by stats */
+	PgStat_VMSet vmsets;
+
 	/*
 	 * The youngest page we predict will stay unmodified for
 	 * target_freeze_duration. We will not opportunistically freeze pages
@@ -501,6 +504,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 							vacrel->relname)));
 	}
 
+	vacrel->vmsets = (PgStat_VMSet) {0};
 	vacrel->frz_threshold_min = pgstat_refresh_frz_stats(RelationGetRelid(rel),
 														 rel->rd_rel->relisshared);
 
@@ -613,6 +617,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 						 Max(vacrel->new_live_tuples, 0),
 						 vacrel->recently_dead_tuples +
 						 vacrel->missed_dead_tuples);
+
+	pgstat_report_heap_vacfrz(RelationGetRelid(rel),
+							  rel->rd_rel->relisshared,
+							  &vacrel->vmsets);
+
 	pgstat_progress_end_command();
 
 	if (instrument)
@@ -1114,6 +1123,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		if (!all_visible_according_to_vm && prunestate.all_visible)
 		{
 			uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
+			uint8		previous_flags;
 
 			if (prunestate.all_frozen)
 			{
@@ -1136,9 +1146,23 @@ lazy_scan_heap(LVRelState *vacrel)
 			 */
 			PageSetAllVisible(page);
 			MarkBufferDirty(buf);
-			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, prunestate.visibility_cutoff_xid,
-							  flags);
+			previous_flags = visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
+											   vmbuffer, prunestate.visibility_cutoff_xid,
+											   flags);
+
+			/*
+			 * If the page isn't empty after vacuuming and we newly marked it
+			 * all visible and all frozen, record that in stats.
+			 */
+			if (!PageIsEmpty(page))
+			{
+				if (!(previous_flags & VISIBILITYMAP_ALL_VISIBLE))
+					vacrel->vmsets.vis++;
+
+				if (prunestate.all_frozen &&
+					!(previous_flags & VISIBILITYMAP_ALL_FROZEN))
+					vacrel->vmsets.vm_freezes++;
+			}
 		}
 
 		/*
@@ -1190,6 +1214,8 @@ lazy_scan_heap(LVRelState *vacrel)
 				 prunestate.all_frozen &&
 				 !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
 		{
+			uint8		previous_flags;
+
 			/*
 			 * Avoid relying on all_visible_according_to_vm as a proxy for the
 			 * page-level PD_ALL_VISIBLE bit being set, since it might have
@@ -1209,10 +1235,21 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * safe for REDO was logged when the page's tuples were frozen.
 			 */
 			Assert(!TransactionIdIsValid(prunestate.visibility_cutoff_xid));
-			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
+			previous_flags = visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
 							  vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE |
 							  VISIBILITYMAP_ALL_FROZEN);
+
+			/*
+			 * If the page isn't empty after vacuuming and we newly marked it
+			 * all visible and all frozen, record that in stats.
+			 */
+			if (!PageIsEmpty(page))
+			{
+				vacrel->vmsets.vm_freezes++;
+				if (!(previous_flags & VISIBILITYMAP_ALL_VISIBLE))
+					vacrel->vmsets.vis++;
+			}
 		}
 
 		/*
@@ -1834,6 +1871,10 @@ lazy_scan_prune(LVRelState *vacrel,
 			TransactionId snapshotConflictHorizon;
 
 			vacrel->frozen_pages++;
+			fpi_before = pgWalUsage.wal_fpi;
+
+			if (!PageIsEmpty(page))
+				vacrel->vmsets.page_freezes++;
 
 			/*
 			 * We can use visibility_cutoff_xid as our cutoff for conflicts
@@ -1858,6 +1899,9 @@ lazy_scan_prune(LVRelState *vacrel,
 			heap_freeze_execute_prepared(vacrel->rel, buf,
 										 snapshotConflictHorizon,
 										 frozen, tuples_frozen);
+
+			if (pgWalUsage.wal_fpi > fpi_before)
+				vacrel->vmsets.freeze_fpis++;
 		}
 	}
 	else
@@ -2610,6 +2654,7 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 	if (heap_page_is_all_visible(vacrel, buffer, &visibility_cutoff_xid,
 								 &all_frozen))
 	{
+		uint8		previous_flags;
 		uint8		flags = VISIBILITYMAP_ALL_VISIBLE;
 
 		if (all_frozen)
@@ -2619,8 +2664,22 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 		}
 
 		PageSetAllVisible(page);
-		visibilitymap_set(vacrel->rel, blkno, buffer, InvalidXLogRecPtr,
-						  vmbuffer, visibility_cutoff_xid, flags);
+		previous_flags = visibilitymap_set(vacrel->rel, blkno, buffer, InvalidXLogRecPtr,
+										   vmbuffer, visibility_cutoff_xid, flags);
+
+		/*
+		 * If the page isn't empty after vacuuming and we newly marked it all
+		 * visible and all frozen, record that in stats.
+		 */
+		if (!PageIsEmpty(page))
+		{
+			if (!(previous_flags & VISIBILITYMAP_ALL_VISIBLE))
+				vacrel->vmsets.vis++;
+
+			if (all_frozen &&
+				!(previous_flags & VISIBILITYMAP_ALL_FROZEN))
+				vacrel->vmsets.vm_freezes++;
+		}
 	}
 
 	/* Revert to the previous phase information for error traceback */
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index ef137466932..1606c3e2c26 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -485,6 +485,32 @@ pgstat_report_vacuum(Oid tableoid, bool shared,
 	pgstat_flush_io(false);
 }
 
+void
+pgstat_report_heap_vacfrz(Oid tableoid, bool shared, PgStat_VMSet *vmsets)
+{
+	PgStat_EntryRef *entry_ref;
+	PgStat_StatTabEntry *tabentry;
+	Oid			dboid = (shared ? InvalidOid : MyDatabaseId);
+
+	if (!pgstat_track_counts)
+		return;
+
+
+	/* block acquiring lock for the same reason as pgstat_report_autovac() */
+	entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_RELATION,
+											dboid, tableoid, false);
+
+	tabentry = &((PgStatShared_Relation *) entry_ref->shared_stats)->stats;
+
+	tabentry->vm_set.vis += vmsets->vis;
+	tabentry->vm_set.page_freezes += vmsets->page_freezes;
+	tabentry->vm_set.vm_freezes += vmsets->vm_freezes;
+	tabentry->vm_set.freeze_fpis += vmsets->freeze_fpis;
+
+	pgstat_unlock_entry(entry_ref);
+}
+
+
 /*
  * Report that the table was just analyzed and flush IO statistics.
  *
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 6f959d9cf74..e035fcc53da 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -501,6 +501,21 @@ typedef struct PgStat_StatSubEntry
 	TimestampTz stat_reset_timestamp;
 } PgStat_StatSubEntry;
 
+
+typedef struct PgStat_VMSet
+{
+	/* number of pages set all visible in the VM */
+	int64		vis;
+	/* number of pages newly marked frozen in the visibility map by vacuum */
+	int64		vm_freezes;
+	/* Number of pages with newly frozen tuples */
+	int64		page_freezes;
+	/* number of freeze records emitted by vacuum containing FPIs */
+	int64		freeze_fpis;
+} PgStat_VMSet;
+
+
+
 typedef struct PgStat_StatTabEntry
 {
 	PgStat_Counter numscans;
@@ -536,6 +551,8 @@ typedef struct PgStat_StatTabEntry
 	XLogRecPtr	target_frz_dur_lsns;
 	/* updated upon VM unset */
 	PgStat_VMUnset vm_unset;
+	/* updated during vacuum and used in stats */
+	PgStat_VMSet vm_set;
 } PgStat_StatTabEntry;
 
 /*
@@ -742,6 +759,8 @@ extern void pgstat_report_analyze(Relation rel,
 
 extern XLogRecPtr pgstat_refresh_frz_stats(Oid tableoid, bool shared);
 
+extern void pgstat_report_heap_vacfrz(Oid tableoid, bool shared, PgStat_VMSet *vmsets);
+
 extern void pgstat_count_vm_unset(Relation relation, XLogRecPtr page_lsn,
 								  XLogRecPtr current_lsn, uint8 old_vmbits);
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 890a091d426..1f1e89a7f3e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2091,6 +2091,7 @@ PgStat_SubXactStatus
 PgStat_TableCounts
 PgStat_TableStatus
 PgStat_TableXactStatus
+PgStat_VMSet
 PgStat_VMUnset
 PgStat_WalStats
 PgXmlErrorContext
-- 
2.37.2

From 9c9492bb495d05ac4cb9d86b08b2644cc9f02832 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Tue, 5 Dec 2023 07:04:57 -0500
Subject: [PATCH v3 12/12] Add pg_visibility_map_summary_extended()

Add a new pg_visibility function, pg_visibility_map_summary_extended(),
which returns the total number of blocks in the relation in addition to
the number of all frozen and all visible pages returned by
pg_visibility_map_summary(). It is easy and cheap to get the number of
blocks at the same time as the number of all visible and all frozen
pages and doing so is useful for calculating percent visible/frozen. The
old pg_visibility_map_summary() function API is left as is but is
implemented by selecting a subset of the columns returned by
pg_visibility_map_summary_extended().
---
 contrib/pg_visibility/expected/pg_visibility.out |  5 +++++
 contrib/pg_visibility/pg_visibility--1.1.sql     | 11 +++++++++--
 contrib/pg_visibility/pg_visibility.c            | 13 +++++++------
 doc/src/sgml/pgvisibility.sgml                   | 12 ++++++++++++
 4 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/contrib/pg_visibility/expected/pg_visibility.out b/contrib/pg_visibility/expected/pg_visibility.out
index 9de54db2a29..db52d407393 100644
--- a/contrib/pg_visibility/expected/pg_visibility.out
+++ b/contrib/pg_visibility/expected/pg_visibility.out
@@ -49,6 +49,7 @@ DETAIL:  This operation is not supported for partitioned tables.
 select pg_visibility_map_summary('test_partitioned');
 ERROR:  relation "test_partitioned" is of wrong relation kind
 DETAIL:  This operation is not supported for partitioned tables.
+CONTEXT:  SQL function "pg_visibility_map_summary" statement 1
 select pg_check_frozen('test_partitioned');
 ERROR:  relation "test_partitioned" is of wrong relation kind
 DETAIL:  This operation is not supported for partitioned tables.
@@ -67,6 +68,7 @@ DETAIL:  This operation is not supported for indexes.
 select pg_visibility_map_summary('test_index');
 ERROR:  relation "test_index" is of wrong relation kind
 DETAIL:  This operation is not supported for indexes.
+CONTEXT:  SQL function "pg_visibility_map_summary" statement 1
 select pg_check_frozen('test_index');
 ERROR:  relation "test_index" is of wrong relation kind
 DETAIL:  This operation is not supported for indexes.
@@ -84,6 +86,7 @@ DETAIL:  This operation is not supported for views.
 select pg_visibility_map_summary('test_view');
 ERROR:  relation "test_view" is of wrong relation kind
 DETAIL:  This operation is not supported for views.
+CONTEXT:  SQL function "pg_visibility_map_summary" statement 1
 select pg_check_frozen('test_view');
 ERROR:  relation "test_view" is of wrong relation kind
 DETAIL:  This operation is not supported for views.
@@ -101,6 +104,7 @@ DETAIL:  This operation is not supported for sequences.
 select pg_visibility_map_summary('test_sequence');
 ERROR:  relation "test_sequence" is of wrong relation kind
 DETAIL:  This operation is not supported for sequences.
+CONTEXT:  SQL function "pg_visibility_map_summary" statement 1
 select pg_check_frozen('test_sequence');
 ERROR:  relation "test_sequence" is of wrong relation kind
 DETAIL:  This operation is not supported for sequences.
@@ -120,6 +124,7 @@ DETAIL:  This operation is not supported for foreign tables.
 select pg_visibility_map_summary('test_foreign_table');
 ERROR:  relation "test_foreign_table" is of wrong relation kind
 DETAIL:  This operation is not supported for foreign tables.
+CONTEXT:  SQL function "pg_visibility_map_summary" statement 1
 select pg_check_frozen('test_foreign_table');
 ERROR:  relation "test_foreign_table" is of wrong relation kind
 DETAIL:  This operation is not supported for foreign tables.
diff --git a/contrib/pg_visibility/pg_visibility--1.1.sql b/contrib/pg_visibility/pg_visibility--1.1.sql
index 0a29967ee6d..c2f81377363 100644
--- a/contrib/pg_visibility/pg_visibility--1.1.sql
+++ b/contrib/pg_visibility/pg_visibility--1.1.sql
@@ -37,12 +37,19 @@ RETURNS SETOF record
 AS 'MODULE_PATHNAME', 'pg_visibility_rel'
 LANGUAGE C STRICT;
 
+-- Show summary of visibility map bits for a relation and the number of blocks
+CREATE FUNCTION pg_visibility_map_summary_extended(regclass,
+    OUT all_visible bigint, OUT all_frozen bigint, OUT nblocks bigint)
+RETURNS record
+AS 'MODULE_PATHNAME', 'pg_visibility_map_summary_extended'
+LANGUAGE C STRICT;
+
 -- Show summary of visibility map bits for a relation.
 CREATE FUNCTION pg_visibility_map_summary(regclass,
     OUT all_visible bigint, OUT all_frozen bigint)
 RETURNS record
-AS 'MODULE_PATHNAME', 'pg_visibility_map_summary'
-LANGUAGE C STRICT;
+AS $$ SELECT all_visible, all_frozen FROM pg_visibility_map_summary_extended($1) $$
+LANGUAGE SQL;
 
 -- Show tupleids of non-frozen tuples if any in all_frozen pages
 -- for a relation.
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 2a4acfd1eee..48c30b222a6 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -44,7 +44,7 @@ PG_FUNCTION_INFO_V1(pg_visibility_map);
 PG_FUNCTION_INFO_V1(pg_visibility_map_rel);
 PG_FUNCTION_INFO_V1(pg_visibility);
 PG_FUNCTION_INFO_V1(pg_visibility_rel);
-PG_FUNCTION_INFO_V1(pg_visibility_map_summary);
+PG_FUNCTION_INFO_V1(pg_visibility_map_summary_extended);
 PG_FUNCTION_INFO_V1(pg_check_frozen);
 PG_FUNCTION_INFO_V1(pg_check_visible);
 PG_FUNCTION_INFO_V1(pg_truncate_visibility_map);
@@ -247,11 +247,11 @@ pg_visibility_rel(PG_FUNCTION_ARGS)
 }
 
 /*
- * Count the number of all-visible and all-frozen pages in the visibility
- * map for a particular relation.
+ * Count the number of all-visible and all-frozen pages in the visibility map
+ * as well as the total number of blocks of a particular relation.
  */
 Datum
-pg_visibility_map_summary(PG_FUNCTION_ARGS)
+pg_visibility_map_summary_extended(PG_FUNCTION_ARGS)
 {
 	Oid			relid = PG_GETARG_OID(0);
 	Relation	rel;
@@ -261,8 +261,8 @@ pg_visibility_map_summary(PG_FUNCTION_ARGS)
 	int64		all_visible = 0;
 	int64		all_frozen = 0;
 	TupleDesc	tupdesc;
-	Datum		values[2];
-	bool		nulls[2] = {0};
+	Datum		values[3];
+	bool		nulls[3] = {0};
 
 	rel = relation_open(relid, AccessShareLock);
 
@@ -296,6 +296,7 @@ pg_visibility_map_summary(PG_FUNCTION_ARGS)
 
 	values[0] = Int64GetDatum(all_visible);
 	values[1] = Int64GetDatum(all_frozen);
+	values[2] = Int64GetDatum(nblocks);
 
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
 }
diff --git a/doc/src/sgml/pgvisibility.sgml b/doc/src/sgml/pgvisibility.sgml
index 097f7e05661..4358e267b18 100644
--- a/doc/src/sgml/pgvisibility.sgml
+++ b/doc/src/sgml/pgvisibility.sgml
@@ -99,6 +99,18 @@
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><function>pg_visibility_map_summary_extended(relation regclass, all_visible OUT bigint, all_frozen OUT bigint, nblocks OUT bigint) returns record</function></term>
+
+    <listitem>
+     <para>
+      Returns the number of all-visible pages, the number of all-frozen pages,
+      and the total number of blocks in the relation according to the
+      visibility map.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><function>pg_check_frozen(relation regclass, t_ctid OUT tid) returns setof tid</function></term>
 
-- 
2.37.2

From 57e1dbaaf42dd4df0bd8a05e633e443e26f1c086 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Fri, 8 Dec 2023 15:05:59 -0500
Subject: [PATCH v3 11/12] Add VM set and unset stats to pg_stat_all_tables

Add columns to pg_stat_all_tables with the number of times pages have
been set and unset all visible and all frozen in the visibility map.
---
 doc/src/sgml/monitoring.sgml         | 71 ++++++++++++++++++++
 src/backend/catalog/system_views.sql |  8 +++
 src/backend/utils/adt/pgstatfuncs.c  | 97 ++++++++++++++++++++++++++++
 src/include/catalog/pg_proc.dat      | 42 ++++++++++++
 src/test/regress/expected/rules.out  | 24 +++++++
 5 files changed, 242 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b804eb8b5ef..72071541b36 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3891,6 +3891,77 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>vm_all_vis</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times pages of this table were set all visible in the
+       visibility map by vacuum.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>vm_un_all_vis</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times pages of this table marked all visible in the visibility
+       map were modified and the all visible bit unset.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>vm_freezes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times pages of this table were set all frozen in the
+       visibility map by vacuum.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>page_freezes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times pages of this table had tuples newly frozen by vacuum.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>vm_unfreezes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times pages of this table marked all frozen in the visibility
+       map were modified and the all frozen bit unset.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>missed_freezes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times pages of this table which had been set all visible in
+       the visibility map by vacuum remained unmodified for longer than target
+       page freeze duration.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>early_unfreezes</structfield> <type>bigint</type>
+      </para>
+      <para>
+       Number of times frozen pages of this table were modified and the frozen
+       bit unset in the visibility map before target page freeze duration had
+       elapsed.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>analyze_count</structfield> <type>bigint</type>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 058fc47c919..13776b52f4b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -696,6 +696,14 @@ CREATE VIEW pg_stat_all_tables AS
             pg_stat_get_last_analyze_time(C.oid) as last_analyze,
             pg_stat_get_last_autoanalyze_time(C.oid) as last_autoanalyze,
             pg_stat_get_vacuum_count(C.oid) AS vacuum_count,
+            pg_stat_get_vm_vis(C.oid) AS vm_all_vis,
+            pg_stat_get_vm_unvis(C.oid) AS vm_un_all_vis,
+            pg_stat_get_vm_freezes(C.oid) AS vm_freezes,
+            pg_stat_get_page_freezes(C.oid) AS page_freezes,
+            pg_stat_get_vm_unfreezes(C.oid) AS vm_unfreezes,
+            pg_stat_get_missed_freezes(C.oid) AS missed_freezes,
+            pg_stat_get_early_unfreezes(C.oid) AS early_unfreezes,
+            pg_stat_get_freeze_fpis(C.oid) AS freeze_fpis,
             pg_stat_get_autovacuum_count(C.oid) AS autovacuum_count,
             pg_stat_get_analyze_count(C.oid) AS analyze_count,
             pg_stat_get_autoanalyze_count(C.oid) AS autoanalyze_count
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e65cbf41e9f..9ca2ef49071 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -32,6 +32,7 @@
 #include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/inet.h"
+#include "utils/pg_lsn.h"
 #include "utils/timestamp.h"
 
 #define UINT32_ACCESS_ONCE(var)		 ((uint32)(*((volatile uint32 *)&(var))))
@@ -108,6 +109,102 @@ PG_STAT_GET_RELENTRY_INT64(tuples_updated)
 /* pg_stat_get_vacuum_count */
 PG_STAT_GET_RELENTRY_INT64(vacuum_count)
 
+Datum
+pg_stat_get_vm_vis(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(tabentry->vm_set.vis);
+}
+
+Datum
+pg_stat_get_vm_unvis(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(tabentry->vm_unset.unvis);
+}
+
+Datum
+pg_stat_get_vm_freezes(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(tabentry->vm_set.vm_freezes);
+}
+
+Datum
+pg_stat_get_page_freezes(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(tabentry->vm_set.page_freezes);
+}
+
+Datum
+pg_stat_get_vm_unfreezes(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(tabentry->vm_unset.vm_unfreezes);
+}
+
+Datum
+pg_stat_get_missed_freezes(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(tabentry->vm_unset.missed_freezes);
+}
+
+Datum
+pg_stat_get_early_unfreezes(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(tabentry->vm_unset.early_unfreezes);
+}
+
+Datum
+pg_stat_get_freeze_fpis(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry *tabentry;
+
+	if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_INT64(tabentry->vm_set.freeze_fpis);
+}
+
 #define PG_STAT_GET_RELENTRY_TIMESTAMPTZ(stat)					\
 Datum															\
 CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS)					\
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5b67784731a..32ea764cdc0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5419,6 +5419,48 @@
   proname => 'pg_stat_get_vacuum_count', provolatile => 's', proparallel => 'r',
   prorettype => 'int8', proargtypes => 'oid',
   prosrc => 'pg_stat_get_vacuum_count' },
+
+{ oid => '9990', descr => 'statistics: number of times pages of this relation were set all visible by vacuum',
+  proname => 'pg_stat_get_vm_vis', provolatile => 's', proparallel => 'r',
+  prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_vm_vis' },
+
+{ oid => '9991', descr => 'statistics: number of times all visible pages of this relation were modified',
+  proname => 'pg_stat_get_vm_unvis', provolatile => 's', proparallel => 'r',
+  prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_vm_unvis' },
+
+{ oid => '9992', descr => 'statistics: number of times pages of this relation were set all frozen in the VM',
+  proname => 'pg_stat_get_vm_freezes', provolatile => 's', proparallel => 'r',
+  prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_vm_freezes' },
+
+{ oid => '9993', descr => 'statistics: number of times pages of this relation had tuples newly frozen by a vacuum',
+  proname => 'pg_stat_get_page_freezes', provolatile => 's', proparallel => 'r',
+  prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_page_freezes' },
+
+{ oid => '9995', descr => 'statistics: number of times all frozen pages pages of this relation were modified',
+  proname => 'pg_stat_get_vm_unfreezes', provolatile => 's', proparallel => 'r',
+  prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_vm_unfreezes' },
+
+{ oid => '9996', descr => 'statistics: number of missed opportunities to freeze pages of this relation',
+  proname => 'pg_stat_get_missed_freezes', provolatile => 's', proparallel => 'r',
+  prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_missed_freezes' },
+
+{ oid => '9997', descr => 'statistics: number of pages of this relation were mistakenly frozen',
+  proname => 'pg_stat_get_early_unfreezes', provolatile => 's', proparallel => 'r',
+  prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_early_unfreezes' },
+
+{ oid => '9998', descr => 'statistics: number of freeze WAL records emitted containing FPIs',
+  proname => 'pg_stat_get_freeze_fpis', provolatile => 's', proparallel => 'r',
+  prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_freeze_fpis' },
+
+
 { oid => '3055', descr => 'statistics: number of auto vacuums for a table',
   proname => 'pg_stat_get_autovacuum_count', provolatile => 's',
   proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index f645e8486bf..70f42008ff4 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1800,6 +1800,14 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     pg_stat_get_last_analyze_time(c.oid) AS last_analyze,
     pg_stat_get_last_autoanalyze_time(c.oid) AS last_autoanalyze,
     pg_stat_get_vacuum_count(c.oid) AS vacuum_count,
+    pg_stat_get_vm_vis(c.oid) AS vm_all_vis,
+    pg_stat_get_vm_unvis(c.oid) AS vm_un_all_vis,
+    pg_stat_get_vm_freezes(c.oid) AS vm_freezes,
+    pg_stat_get_page_freezes(c.oid) AS page_freezes,
+    pg_stat_get_vm_unfreezes(c.oid) AS vm_unfreezes,
+    pg_stat_get_missed_freezes(c.oid) AS missed_freezes,
+    pg_stat_get_early_unfreezes(c.oid) AS early_unfreezes,
+    pg_stat_get_freeze_fpis(c.oid) AS freeze_fpis,
     pg_stat_get_autovacuum_count(c.oid) AS autovacuum_count,
     pg_stat_get_analyze_count(c.oid) AS analyze_count,
     pg_stat_get_autoanalyze_count(c.oid) AS autoanalyze_count
@@ -2172,6 +2180,14 @@ pg_stat_sys_tables| SELECT relid,
     last_analyze,
     last_autoanalyze,
     vacuum_count,
+    vm_all_vis,
+    vm_un_all_vis,
+    vm_freezes,
+    page_freezes,
+    vm_unfreezes,
+    missed_freezes,
+    early_unfreezes,
+    freeze_fpis,
     autovacuum_count,
     analyze_count,
     autoanalyze_count
@@ -2220,6 +2236,14 @@ pg_stat_user_tables| SELECT relid,
     last_analyze,
     last_autoanalyze,
     vacuum_count,
+    vm_all_vis,
+    vm_un_all_vis,
+    vm_freezes,
+    page_freezes,
+    vm_unfreezes,
+    missed_freezes,
+    early_unfreezes,
+    freeze_fpis,
     autovacuum_count,
     analyze_count,
     autoanalyze_count
-- 
2.37.2

Reply via email to