Thanks for the review! v6 attached.

On Sat, Jul 6, 2024 at 1:36 PM Andrey M. Borodin <x4...@yandex-team.ru> wrote:
>
> PgStartLSN = GetXLogInsertRecPtr();
> Should this be kind of RecoveryEndPtr? How is it expected to behave on 
> Standby in HA cluster, which was doing a crash recovery of 1y WALs in a day, 
> then is in startup for a year as a Hot Standby, and then is promoted?

So, I don't think we will allow use of the LSNTimeStream on a standby,
since it is unclear what it would mean on a standby. For example, do
you want to know the time the LSN was generated or the time it was
replayed? Note that bgwriter won't insert values to the time stream on
a standby (it explicitly checks).

But, you bring up an issue that I don't quite know what to do about.
If the standby doesn't have an LSNTimeStream, then when it is
promoted, LSN <-> time conversions of LSNs and times before the
promotion seem impossible. Maybe if the stats file is getting written
out at checkpoints, we could restore from that previous primary's file
after promotion?

This brings up a second issue, which is that, currently, bgwriter
won't insert into the time stream when wal_level is minimal. I'm not
sure exactly how to resolve it because I am relying on the "last
important rec pointer" and the LOG_SNAPSHOT_INTERVAL_MS to throttle
when the bgwriter actually inserts new records into the LSNTimeStream.
I could come up with a different timeout interval for updating the
time stream. Perhaps I should do that?

> lsn_ts_calculate_error_area() is prone to overflow. Even int64 does not seem 
> capable to accommodate LSN*time. And the function may return negative result, 
> despite claiming area as a result. It’s intended, but a little misleading.

Ah, great point. I've fixed this.

> i-- > 0
> Is there a point to do a backward count in the loop?
> Consider dropping not one by one, but half of a stream, LSNTimeStream is ~2Kb 
> of cache and it’s loaded as a whole to the cache..

Yes, the backwards looping was super confusing. It was a relic of my
old design. Even without your point about cache locality, the code is
much harder to understand with the backwards looping. I've changed the
array to fill forwards and be accessed with forward loops.

> > On 27 Jun 2024, at 07:18, Melanie Plageman <melanieplage...@gmail.com> 
> > wrote:
> >
> >> 2. Some benchmarks to proof the patch does not have CPU footprint.
> >
> > This is still a todo. Typically when designing a benchmark like this,
> > I would want to pick a worst-case workload to see how bad it could be.
> > I wonder if just a write heavy workload like pgbench builtin tpcb-like
> > would be sufficient?
>
> Increasing background writer activity to maximum and not seeing LSNTimeStream 
> function in `perf top` seems enough to me.

I've got this on my TODO.

> >> Tests fail on Windows.
> >
> > I think this was because of the compiler warnings, but I need to
> > double-check now.
> Nope, it really looks more serious.
> [12:31:25.701] ../src/backend/utils/activity/pgstat_wal.c(433): error C2375: 
> 'pg_estimate_lsn_at_time': redefinition; different linkage

Ah, yes. I mistakenly added the functions to pg_proc.dat and also
called PG_FUNCTION_INFO_V1 for the functions. I've fixed it.

> >> The patch lacks tests.
> >
> > I thought about this a bit. I wonder what kind of tests make sense.
> >
> > I could
> > 1) Add tests with the existing stats tests
> > (src/test/regress/sql/stats.sql) and just test that bgwriter is in
> > fact adding to the time stream.
> >
> > 2) Or should I add some infrastructure to be able to create an
> > LSNTimeStream and then insert values to it and do some validations of
> > what is added? I did a version of this but it is just much more
> > annoying with C & SQL than with python (where I tried out my
> > algorithms) [2].
>
> I think just a test which calls functions and discards the result would 
> greatly increase coverage.

I've added tests of the two main conversion functions. I didn't add a
test of the function which gets the whole stream (pg_lsntime_stream())
because I don't think I can guarantee it won't be empty -- so I'm not
sure what I could do with it in a test.

> > On 29 Jun 2024, at 03:09, Melanie Plageman <melanieplage...@gmail.com> 
> > wrote:
> > change the user-facing functions for estimating an
> > LSN/time conversion to instead return a floor and a ceiling -- instead
> > of linearly interpolating a guess. This would be a way to keep users
> > from misunderstanding the accuracy of the functions to translate LSN
> > <-> time.
>
> I think this is a good idea. And it covers well “server restart problem”. If 
> API just returns -inf as a boundary, caller can correctly interpret this 
> situation.

Providing "ceiling" and "floor" user functions is still a TODO for me,
however, I think that the patch mostly does handle server restarts.

In the event of a restart, the cumulative stats system will have
persisted our time stream, so the LSNTimeStream will just be read back
in with the rest of the stats. I've added logic to ensure that if the
PgStartLSN is newer than our oldest LSNTimeStream entry, we use the
oldest entry instead of PgStartLSN when doing conversions LSN <->
time.

As for a crash, stats do not persist crashes, but I think Michael's
patch will go in to write out the stats file at checkpoints, and then
this should be good enough.

Is there anything else you think that is an issue with restarts?

> Thanks! Looking forward to more timely freezing.

Thanks! I'll be posting a new version of the opportunistic freezing
patch that uses the time stream quite soon, so I hope you'll take a
look at that as well!

- Melanie
From 49f767bc859356df8b4a6ea03d490b6b1aa1d48d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 21 Feb 2024 20:06:29 -0500
Subject: [PATCH v6 5/6] Add time <-> LSN translation functions

Previous commits added a global LSNTimeStream, maintained by background
writer, that allows approximate translations between time and LSNs.

Add SQL-callable functions to convert from LSN to time and back and a
SQL-callable function returning the entire LSNTimeStream.

This could be useful in combination with SQL-callable functions
accessing a page LSN to approximate the time of last modification of a
page or estimating the LSN consumption rate to moderate maintenance
processes and balance system resource utilization.
---
 doc/src/sgml/monitoring.sgml            | 66 +++++++++++++++++++++++++
 src/backend/utils/activity/pgstat_wal.c | 52 +++++++++++++++++++
 src/include/catalog/pg_proc.dat         | 22 +++++++++
 src/test/regress/expected/stats.out     | 13 +++++
 src/test/regress/sql/stats.sql          |  5 ++
 5 files changed, 158 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 55417a6fa9d..f86e77955d6 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3195,6 +3195,72 @@ description | Waiting for a newly initialized WAL file to reach durable storage
    </tgroup>
   </table>
 
+  <para>
+  In addition to these WAL stats, a stream of LSN-time pairs is accessible
+  via the functions shown in <xref linkend="functions-lsn-time-stream"/>.
+  </para>
+
+  <table id="functions-lsn-time-stream">
+   <title>LSN Time Stream Information Functions</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+       Function
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+       <indexterm>
+        <primary>pg_estimate_lsn_at_time</primary>
+       </indexterm>
+       <function>pg_estimate_lsn_at_time</function> ( <type>timestamp with time zone</type> )
+       <returnvalue>pg_lsn</returnvalue>
+      </para>
+      <para>
+       Returns the estimated lsn at the provided time.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+       <indexterm>
+        <primary>pg_estimate_lsn_at_time</primary>
+       </indexterm>
+       <function>pg_estimate_lsn_at_time</function> ( <type>pg_lsn</type> )
+       <returnvalue>timestamp with time zone</returnvalue>
+      </para>
+      <para>
+        Returns the estimated time at the provided lsn.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="func_table_entry"><para role="func_signature">
+       <indexterm>
+        <primary>pg_lsntime_stream</primary>
+       </indexterm>
+       <function>pg_lsntime_stream</function> ()
+       <returnvalue>record</returnvalue>
+       ( <parameter>time</parameter> <type>timestamp with time zone</type>,
+       <parameter>lsn</parameter> <type>pg_lsnwith time zone</type>)
+      </para>
+      <para>
+       Returns all of the LSN-time pairs in the current LSN time stream.
+      </para></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+
+
+
 </sect2>
 
  <sect2 id="monitoring-pg-stat-database-view">
diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index c1c3da22b2f..7552a964b80 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -19,8 +19,10 @@
 
 #include "access/xlog.h"
 #include "executor/instrument.h"
+#include "funcapi.h"
 #include "math.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
 #include "utils/pgstat_internal.h"
 #include "utils/timestamp.h"
 
@@ -525,3 +527,53 @@ pgstat_wal_update_lsntime_stream(TimestampTz time, XLogRecPtr lsn)
 	lsntime_insert(&stats_shmem->stats.stream, time, lsn);
 	LWLockRelease(&stats_shmem->lock);
 }
+
+Datum
+pg_estimate_time_at_lsn(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr	lsn = PG_GETARG_LSN(0);
+	PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
+	TimestampTz result;
+
+	LWLockAcquire(&stats_shmem->lock, LW_SHARED);
+	result = estimate_time_at_lsn(&stats_shmem->stats.stream, lsn);
+	LWLockRelease(&stats_shmem->lock);
+
+	PG_RETURN_TIMESTAMPTZ(result);
+}
+
+Datum
+pg_estimate_lsn_at_time(PG_FUNCTION_ARGS)
+{
+	PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
+	TimestampTz time = PG_GETARG_TIMESTAMPTZ(0);
+	XLogRecPtr	result;
+
+	LWLockAcquire(&stats_shmem->lock, LW_SHARED);
+	result = estimate_lsn_at_time(&stats_shmem->stats.stream, time);
+	LWLockRelease(&stats_shmem->lock);
+
+	PG_RETURN_LSN(result);
+}
+
+Datum
+pg_lsntime_stream(PG_FUNCTION_ARGS)
+{
+	ReturnSetInfo *rsinfo;
+	PgStat_WalStats *stats = pgstat_fetch_stat_wal();
+	LSNTimeStream *stream = &stats->stream;
+
+	InitMaterializedSRF(fcinfo, 0);
+	rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+	for (int i = 0; i < stream->length; i++)
+	{
+		Datum		values[2] = {0};
+		bool		nulls[2] = {0};
+
+		values[0] = stream->data[i].time;
+		values[1] = stream->data[i].lsn;
+		tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc,
+							 values, nulls);
+	}
+	return (Datum) 0;
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 54b50ee5d61..b5d8d0d3673 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6375,6 +6375,28 @@
   prorettype => 'timestamptz', proargtypes => 'xid',
   prosrc => 'pg_xact_commit_timestamp' },
 
+{ oid => '9997',
+  descr => 'get approximate LSN at a particular point in time',
+  proname => 'pg_estimate_lsn_at_time', provolatile => 'v',
+  prorettype => 'pg_lsn', proargtypes => 'timestamptz',
+  prosrc => 'pg_estimate_lsn_at_time' },
+
+{ oid => '9996',
+  descr => 'get approximate time at a particular LSN',
+  proname => 'pg_estimate_time_at_lsn', provolatile => 'v',
+  prorettype => 'timestamptz', proargtypes => 'pg_lsn',
+  prosrc => 'pg_estimate_time_at_lsn' },
+
+{ oid => '9994',
+  descr => 'print the LSN Time Stream',
+  proname => 'pg_lsntime_stream', prorows => '64',
+  proretset => 't', provolatile => 'v', proparallel => 's',
+  prorettype => 'record', proargtypes => '',
+  proallargtypes => '{timestamptz,pg_lsn}',
+  proargmodes => '{o,o}',
+  proargnames => '{time, lsn}',
+  prosrc => 'pg_lsntime_stream' },
+
 { oid => '6168',
   descr => 'get commit timestamp and replication origin of a transaction',
   proname => 'pg_xact_commit_timestamp_origin', provolatile => 'v',
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 6e08898b183..b02b74e5872 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -813,6 +813,19 @@ SELECT (n_tup_ins + n_tup_upd) > 0 AS has_data FROM pg_stat_all_tables
 -----
 -- Test that various stats views are being properly populated
 -----
+SELECT pg_estimate_time_at_lsn(pg_current_wal_insert_lsn()) >
+                              now() - make_interval(years=> 100);
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT pg_estimate_lsn_at_time(now()) - '0/0' > 0;
+ ?column? 
+----------
+ t
+(1 row)
+
 -- Test that sessions is incremented when a new session is started in pg_stat_database
 SELECT sessions AS db_stat_sessions FROM pg_stat_database WHERE datname = (SELECT current_database()) \gset
 \c
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index d8ac0d06f48..8562bdb45e8 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -411,6 +411,11 @@ SELECT (n_tup_ins + n_tup_upd) > 0 AS has_data FROM pg_stat_all_tables
 -- Test that various stats views are being properly populated
 -----
 
+SELECT pg_estimate_time_at_lsn(pg_current_wal_insert_lsn()) >
+                              now() - make_interval(years=> 100);
+
+SELECT pg_estimate_lsn_at_time(now()) - '0/0' > 0;
+
 -- Test that sessions is incremented when a new session is started in pg_stat_database
 SELECT sessions AS db_stat_sessions FROM pg_stat_database WHERE datname = (SELECT current_database()) \gset
 \c
-- 
2.34.1

From 3df61a9f26f33a88409920587707d269f35eccdf Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Tue, 5 Dec 2023 07:29:39 -0500
Subject: [PATCH v6 1/6] Record LSN at postmaster startup

The insert_lsn at postmaster startup can be used along with PgStartTime
as seed values for a timeline mapping LSNs to time. Future commits will
add such a structure for LSN <-> time conversions. A start LSN allows
for such conversions before even inserting a value into the timeline.
The current time and current insert LSN can be used along with
PgStartTime and PgStartLSN.

This is WIP, as I'm not sure if I did this in the right place.
---
 src/backend/access/transam/xlog.c   | 2 ++
 src/backend/postmaster/postmaster.c | 2 ++
 src/include/utils/builtins.h        | 3 +++
 3 files changed, 7 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 4a8a2f6098f..fed41b3f992 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -140,6 +140,8 @@ bool		XLOG_DEBUG = false;
 
 int			wal_segment_size = DEFAULT_XLOG_SEG_SIZE;
 
+XLogRecPtr	PgStartLSN = InvalidXLogRecPtr;
+
 /*
  * Number of WAL insertion locks to use. A higher value allows more insertions
  * to happen concurrently, but adds some CPU overhead to flushing the WAL,
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 02442a4b85a..c637a4229fe 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -117,6 +117,7 @@
 #include "storage/proc.h"
 #include "tcop/backend_startup.h"
 #include "tcop/tcopprot.h"
+#include "utils/builtins.h"
 #include "utils/datetime.h"
 #include "utils/memutils.h"
 #include "utils/pidfile.h"
@@ -1333,6 +1334,7 @@ PostmasterMain(int argc, char *argv[])
 	 * Remember postmaster startup time
 	 */
 	PgStartTime = GetCurrentTimestamp();
+	PgStartLSN = GetXLogInsertRecPtr();
 
 	/*
 	 * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 359c570f23e..16a7a058bc7 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -17,6 +17,7 @@
 #include "fmgr.h"
 #include "nodes/nodes.h"
 #include "utils/fmgrprotos.h"
+#include "access/xlogdefs.h"
 
 /* Sign + the most decimal digits an 8-byte number could have */
 #define MAXINT8LEN 20
@@ -85,6 +86,8 @@ extern void generate_operator_clause(fmStringInfo buf,
 									 Oid opoid,
 									 const char *rightop, Oid rightoptype);
 
+extern PGDLLIMPORT XLogRecPtr PgStartLSN;
+
 /* varchar.c */
 extern int	bpchartruelen(char *s, int len);
 
-- 
2.34.1

From 1f55402be2f1e4bd015432a11640cfe72e44957c Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 21 Feb 2024 20:28:27 -0500
Subject: [PATCH v6 3/6] Add LSNTimeStream to PgStat_WalStats

Add a globally maintained instance of an LSNTimeStream to
PgStat_WalStats and a utility function to insert new values.
---
 src/backend/utils/activity/pgstat_wal.c | 10 ++++++++++
 src/include/pgstat.h                    |  4 ++++
 2 files changed, 14 insertions(+)

diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index affab8437c8..c1c3da22b2f 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -515,3 +515,13 @@ stop:
 	result = (double) (lsn - start.lsn) / lsns_elapsed * time_elapsed + start.time;
 	return Max(result, 0);
 }
+
+void
+pgstat_wal_update_lsntime_stream(TimestampTz time, XLogRecPtr lsn)
+{
+	PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal;
+
+	LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
+	lsntime_insert(&stats_shmem->stats.stream, time, lsn);
+	LWLockRelease(&stats_shmem->lock);
+}
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 825cdc8f73a..667f2b93cad 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -470,6 +470,7 @@ typedef struct PgStat_WalStats
 	PgStat_Counter wal_sync;
 	PgStat_Counter wal_write_time;
 	PgStat_Counter wal_sync_time;
+	LSNTimeStream stream;
 	TimestampTz stat_reset_timestamp;
 } PgStat_WalStats;
 
@@ -752,6 +753,9 @@ extern void pgstat_execute_transactional_drops(int ndrops, struct xl_xact_stats_
 extern void pgstat_report_wal(bool force);
 extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
 
+/* Helpers for maintaining the LSNTimeStream */
+extern void pgstat_wal_update_lsntime_stream(TimestampTz time, XLogRecPtr lsn);
+
 
 /*
  * Variables in pgstat.c
-- 
2.34.1

From d6dc1128f75d883332945ab27f98a8c70b83b607 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 27 Dec 2023 16:40:27 -0500
Subject: [PATCH v6 2/6] Add LSNTimeStream for converting LSN <-> time

Add a new structure, LSNTimeStream, consisting of LSNTimes -- each an
LSN, time pair. The LSNTimeStream is fixed size, so when a new LSNTime
is inserted to a full LSNTimeStream, an LSNTime is dropped and the new
LSNTime is inserted. We drop the LSNTime whose absence would cause the
least error when interpolating between its adjoining points.

LSN <-> time conversions can be done using linear interpolation with two
LSNTimes on the LSNTimeStream.

This commit does not add a global instance of LSNTimeStream. It adds the
structures and functions needed to maintain and access such a stream.
---
 src/backend/utils/activity/pgstat_wal.c | 323 ++++++++++++++++++++++++
 src/include/pgstat.h                    |  32 +++
 src/tools/pgindent/typedefs.list        |   2 +
 3 files changed, 357 insertions(+)

diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c
index e2a3f6b865c..affab8437c8 100644
--- a/src/backend/utils/activity/pgstat_wal.c
+++ b/src/backend/utils/activity/pgstat_wal.c
@@ -17,8 +17,12 @@
 
 #include "postgres.h"
 
+#include "access/xlog.h"
 #include "executor/instrument.h"
+#include "math.h"
+#include "utils/builtins.h"
 #include "utils/pgstat_internal.h"
+#include "utils/timestamp.h"
 
 
 PgStat_PendingWalStats PendingWalStats = {0};
@@ -32,6 +36,11 @@ PgStat_PendingWalStats PendingWalStats = {0};
 static WalUsage prevWalUsage;
 
 
+static void lsntime_insert(LSNTimeStream *stream, TimestampTz time, XLogRecPtr lsn);
+
+XLogRecPtr	estimate_lsn_at_time(const LSNTimeStream *stream, TimestampTz time);
+TimestampTz estimate_time_at_lsn(const LSNTimeStream *stream, XLogRecPtr lsn);
+
 /*
  * Calculate how much WAL usage counters have increased and update
  * shared WAL and IO statistics.
@@ -192,3 +201,317 @@ pgstat_wal_snapshot_cb(void)
 		   sizeof(pgStatLocal.snapshot.wal));
 	LWLockRelease(&stats_shmem->lock);
 }
+
+/*
+ * Given three LSNTimes, calculate the area of the triangle they form were they
+ * plotted with time on the X axis and LSN on the Y axis. An illustration:
+ *
+ *   LSN
+ *    |
+ *    |                                                         * right
+ *    |
+ *    |
+ *    |
+ *    |                                                * mid    * C
+ *    |
+ *    |
+ *    |
+ *    |  * left                                        * B      * A
+ *    |
+ *    +------------------------------------------------------------------
+ *
+ * The area of the triangle with vertices (left, mid, right) is the error
+ * incurred over the interval [left, right] were we to interpolate with just
+ * [left, right] rather than [left, mid) and [mid, right).
+ */
+static float
+lsn_ts_calculate_error_area(LSNTime *left, LSNTime *mid, LSNTime *right)
+{
+	float		left_time = left->time,
+				left_lsn = left->lsn;
+	float		mid_time = mid->time,
+				mid_lsn = mid->lsn;
+	float		right_time = right->time,
+				right_lsn = right->lsn;
+
+	/* Area of the rectangle with opposing corners left and right */
+	float		rectangle_all = (right_time - left_time) * (right_lsn - left_lsn);
+
+	/* Area of the right triangle with vertices left, right, and A */
+	float		triangle1 = rectangle_all / 2;
+
+	/* Area of the right triangle with vertices left, mid, and B */
+	float		triangle2 = (mid_lsn - left_lsn) * (mid_time - left_time) / 2;
+
+	/* Area of the right triangle with vertices mid, right, and C */
+	float		triangle3 = (right_lsn - mid_lsn) * (right_time - mid_time) / 2;
+
+	/* Area of the rectangle with vertices mid, A, B, and C */
+	float		rectangle_part = (right_lsn - mid_lsn) * (mid_time - left_time);
+
+	/* Area of the triangle with vertices left, mid, and right */
+	return triangle1 - triangle2 - triangle3 - rectangle_part;
+}
+
+/*
+ * Determine which LSNTime to drop from a full LSNTimeStream. Once the LSNTime
+ * is dropped, points between it and either of its adjacent LSNTimes will be
+ * interpolated between those two LSNTimes instead. To keep the LSNTimeStream
+ * as accurate as possible, drop the LSNTime whose absence would have the least
+ * impact on future interpolations.
+ *
+ * We determine the error that would be introduced by dropping a point on the
+ * stream by calculating the area of the triangle formed by the LSNTime and its
+ * adjacent LSNTimes. We do this for each LSNTime in the stream (except for the
+ * first and last LSNTimes) and choose the LSNTime with the smallest error
+ * (area). We avoid extrapolation by never dropping the first or last points.
+ */
+static unsigned int
+lsntime_to_drop(LSNTimeStream *stream)
+{
+	double		min_area;
+	unsigned int target_point;
+
+	/* Don't drop points if free space available */
+	Assert(stream->length == LSNTIMESTREAM_VOLUME);
+
+	min_area = lsn_ts_calculate_error_area(&stream->data[0],
+										   &stream->data[1],
+										   &stream->data[2]);
+
+	target_point = 1;
+
+	for (int i = 1; i < stream->length - 1; i++)
+	{
+		LSNTime    *left = &stream->data[i - 1];
+		LSNTime    *mid = &stream->data[i];
+		LSNTime    *right = &stream->data[i + 1];
+		float		area = lsn_ts_calculate_error_area(left, mid, right);
+
+		if (fabs(area) < fabs(min_area))
+		{
+			min_area = area;
+			target_point = i;
+		}
+	}
+
+	return target_point;
+}
+
+/*
+ * Insert a new LSNTime into the LSNTimeStream in the first available element,
+ * or, if there are no empty elements, drop an LSNTime from the stream, move
+ * all the subsequent LSNTimes down and insert the new LSNTime into the tail.
+ */
+void
+lsntime_insert(LSNTimeStream *stream, TimestampTz time,
+			   XLogRecPtr lsn)
+{
+	unsigned int drop;
+	LSNTime		entrant = {.lsn = lsn,.time = time};
+
+	if (stream->length < LSNTIMESTREAM_VOLUME)
+	{
+		/*
+		 * The new entry should exceed the most recent entry to ensure time
+		 * moves forward on the stream.
+		 */
+		Assert(stream->length == 0 ||
+			   (lsn >= stream->data[stream->length - 1].lsn &&
+				time >= stream->data[stream->length - 1].time));
+
+		/*
+		 * If there are unfilled elements in the stream, insert the passed-in
+		 * LSNTime into the current tail of the array.
+		 */
+		stream->data[stream->length++] = entrant;
+		return;
+	}
+
+	drop = lsntime_to_drop(stream);
+
+	/*
+	 * Drop the LSNTime at index drop by copying the array from drop - 1 to
+	 * drop
+	 */
+	memmove(&stream->data[drop],
+			&stream->data[drop + 1],
+			sizeof(LSNTime) * (stream->length - 1 - drop));
+
+	stream->data[stream->length - 1] = entrant;
+}
+
+/*
+ * Translate time to a LSN using the provided stream. The stream will not
+ * be modified.
+ */
+XLogRecPtr
+estimate_lsn_at_time(const LSNTimeStream *stream, TimestampTz time)
+{
+	XLogRecPtr	result;
+	int64		time_elapsed,
+				lsns_elapsed;
+	LSNTime		start = {.time = PgStartTime,.lsn = PgStartLSN};
+	LSNTime		end = {.time = GetCurrentTimestamp(),.lsn = GetXLogInsertRecPtr()};
+
+	/*
+	 * If the database has been restarted, PgStartLSN may be after our oldest
+	 * value. In that case, use the oldest value in the time stream as the
+	 * start.
+	 */
+	if (stream->length > 0 && start.time > stream->data[0].time)
+		start = stream->data[0];
+
+	/*
+	 * If the LSN is before our oldest known LSN, the best we can do is return
+	 * our oldest known time.
+	 */
+	if (time < start.time)
+		return start.lsn;
+
+	/*
+	 * If the provided time is after now, the current LSN is our best
+	 * estimate.
+	 */
+	if (time >= end.time)
+		return end.lsn;
+
+	/*
+	 * Loop through the stream. Stop at the first LSNTime later than our
+	 * target time. This LSNTime will be our interpolation end point. If
+	 * there's an LSNTime earlier than that, that will be our interpolation
+	 * start point.
+	 */
+	for (int i = 0; i < stream->length; i++)
+	{
+		if (stream->data[i].time < time)
+			continue;
+
+		end = stream->data[i];
+		if (i > 0)
+			start = stream->data[i - 1];
+		goto stop;
+	}
+
+	/*
+	 * If we exhausted the stream, then use its latest LSNTime as our
+	 * interpolation start point.
+	 */
+	if (stream->length > 0)
+		start = stream->data[stream->length - 1];
+
+stop:
+
+	/*
+	 * In rare cases, the start and end LSN could be the same. If, for
+	 * example, no new records have been inserted since the last one recorded
+	 * in the LSNTimeStream and we are looking for the LSN corresponding to
+	 * the current time.
+	 */
+	if (end.lsn == start.lsn)
+		return end.lsn;
+
+	Assert(end.lsn > start.lsn);
+
+	/*
+	 * It should be extremely rare (if not impossible) for the start time and
+	 * end time to be the same. In this case, just return an LSN halfway
+	 * between the two.
+	 */
+	if (end.time == start.time)
+		return start.lsn + ((end.lsn - start.lsn) / 2);
+
+	Assert(end.time > start.time);
+
+	time_elapsed = end.time - start.time;
+	lsns_elapsed = end.lsn - start.lsn;
+
+	result = (double) (time - start.time) / time_elapsed * lsns_elapsed + start.lsn;
+	return Max(result, 0);
+}
+
+/*
+ * Translate lsn to a time using the provided stream. The stream will not
+ * be modified.
+ */
+TimestampTz
+estimate_time_at_lsn(const LSNTimeStream *stream, XLogRecPtr lsn)
+{
+	int64		time_elapsed,
+				lsns_elapsed;
+	TimestampTz result;
+	LSNTime		start = {.time = PgStartTime,.lsn = PgStartLSN};
+	LSNTime		end = {.time = GetCurrentTimestamp(),.lsn = GetXLogInsertRecPtr()};
+
+	/*
+	 * If the database has been restarted, PgStartLSN may be after our oldest
+	 * value. In that case, use the oldest value in the time stream as the
+	 * start.
+	 */
+	if (stream->length > 0 && start.time > stream->data[0].time)
+		start = stream->data[0];
+
+	/*
+	 * If the LSN is before our oldest known LSN, the best we can do is return
+	 * our oldest known time.
+	 */
+	if (lsn < start.lsn)
+		return start.time;
+
+	/*
+	 * If the target LSN is after the current insert LSN, the current time is
+	 * our best estimate.
+	 */
+	if (lsn >= end.lsn)
+		return end.time;
+
+	/*
+	 * Loop through the stream. Stop at the first LSNTime later than our
+	 * target time. This LSNTime will be our interpolation end point. If
+	 * there's an LSNTime earlier than that, that will be our interpolation
+	 * start point.
+	 */
+	for (int i = 0; i < stream->length; i++)
+	{
+		if (stream->data[i].lsn < lsn)
+			continue;
+
+		end = stream->data[i];
+		if (i > 0)
+			start = stream->data[i - 1];
+		goto stop;
+	}
+
+	/*
+	 * If we exhausted the stream, then use its earliest LSNTime as our
+	 * interpolation end point.
+	 */
+	if (stream->length > 0)
+		start = stream->data[stream->length - 1];
+
+stop:
+
+	/* It should be nearly impossible to have the same start and end time. */
+	if (end.time == start.time)
+		return end.time;
+
+	Assert(end.time > start.time);
+
+	/*
+	 * In rare cases, the start and end LSN could be the same. If, for
+	 * example, no new records have been inserted since the last one recorded
+	 * in the LSNTimeStream and we are looking for the LSN corresponding to
+	 * the current time. In this case, just return a time halfway between
+	 * start and end.
+	 */
+	if (end.lsn == start.lsn)
+		return start.time + ((end.time - start.time) / 2);
+
+	Assert(end.lsn > start.lsn);
+
+	time_elapsed = end.time - start.time;
+	lsns_elapsed = end.lsn - start.lsn;
+
+	result = (double) (lsn - start.lsn) / lsns_elapsed * time_elapsed + start.time;
+	return Max(result, 0);
+}
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 6b99bb8aadf..825cdc8f73a 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -11,6 +11,7 @@
 #ifndef PGSTAT_H
 #define PGSTAT_H
 
+#include "access/xlogdefs.h"
 #include "datatype/timestamp.h"
 #include "portability/instr_time.h"
 #include "postmaster/pgarch.h"	/* for MAX_XFN_CHARS */
@@ -428,6 +429,37 @@ typedef struct PgStat_StatTabEntry
 	PgStat_Counter autoanalyze_count;
 } PgStat_StatTabEntry;
 
+/*
+ * The elements of an LSNTimeStream. Each LSNTime represents one or more time,
+ * LSN pairs. The LSN is typically the insert LSN recorded at the time.
+ */
+typedef struct LSNTime
+{
+	TimestampTz time;
+	XLogRecPtr	lsn;
+} LSNTime;
+
+#define LSNTIMESTREAM_VOLUME 64
+
+/*
+ * An LSN time stream is an array consisting of LSNTimes from least to most
+ * recent. The array is filled before any element is dropped. Once the
+ * LSNTimeStream length == volume (the array is full), an LSNTime is dropped,
+ * the subsequent LSNTimes are moved down by 1, and the new LSNTime is inserted
+ * at the tail.
+ *
+ * When dropping an LSNTime, we attempt to pick the member which would
+ * introduce the least error into the stream. See lsntime_to_drop() for more
+ * details.
+ *
+ * Use the stream for LSN <-> time conversion using linear interpolation.
+ */
+typedef struct LSNTimeStream
+{
+	int			length;
+	LSNTime		data[LSNTIMESTREAM_VOLUME];
+} LSNTimeStream;
+
 typedef struct PgStat_WalStats
 {
 	PgStat_Counter wal_records;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 8de9978ad8d..d924855069c 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1587,6 +1587,8 @@ LogicalTapeSet
 LsnReadQueue
 LsnReadQueueNextFun
 LsnReadQueueNextStatus
+LSNTime
+LSNTimeStream
 LtreeGistOptions
 LtreeSignature
 MAGIC
-- 
2.34.1

From 0ab41bd5030caf33c82692c7a3a6618a3771166f Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Wed, 27 Dec 2023 16:32:40 -0500
Subject: [PATCH v6 4/6] Bgwriter maintains global LSNTimeStream

Insert new LSN, time pairs to the global LSNTimeStream stored in
PgStat_WalStats in the background writer's main loop. This ensures that
new values are added to the stream in a regular manner.
---
 src/backend/postmaster/bgwriter.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 0f75548759a..99c2e6eecc3 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -273,6 +273,7 @@ BackgroundWriterMain(char *startup_data, size_t startup_data_len)
 		{
 			TimestampTz timeout = 0;
 			TimestampTz now = GetCurrentTimestamp();
+			XLogRecPtr	current_lsn;
 
 			timeout = TimestampTzPlusMilliseconds(last_snapshot_ts,
 												  LOG_SNAPSHOT_INTERVAL_MS);
@@ -284,11 +285,23 @@ BackgroundWriterMain(char *startup_data, size_t startup_data_len)
 			 * start of a record, whereas last_snapshot_lsn points just past
 			 * the end of the record.
 			 */
-			if (now >= timeout &&
-				last_snapshot_lsn <= GetLastImportantRecPtr())
+			if (now >= timeout)
 			{
-				last_snapshot_lsn = LogStandbySnapshot();
-				last_snapshot_ts = now;
+				current_lsn = GetLastImportantRecPtr();
+				if (last_snapshot_lsn <= current_lsn)
+				{
+					last_snapshot_lsn = LogStandbySnapshot();
+					last_snapshot_ts = now;
+
+					/*
+					 * After a restart GetXLogInsertRecPtr() may return 0. We
+					 * don't want the timeline to move backwards, though, so
+					 * get the insert LSN instead.
+					 */
+					if (current_lsn == 0)
+						current_lsn = GetXLogInsertRecPtr();
+					pgstat_wal_update_lsntime_stream(now, current_lsn);
+				}
 			}
 		}
 
-- 
2.34.1

Reply via email to