Thanks for the review! Attached v4 implements the new algorithm/compression described in [1].
We had discussed off-list possibly using error in some way. So, I'm interested to know what you think about this method I suggested which calculates error. It doesn't save the error so that we could use it when interpolating for reasons I describe in that mail. If you have any ideas on how to use the calculated error or just how to combine error when dropping a point, that would be super helpful. Note that in this version, I've changed the name from LSNTimeline to LSNTimeStream to address some feedback from another reviewer about Timeline being already in use in Postgres as a concept. On Mon, Mar 18, 2024 at 10:02 AM Daniel Gustafsson <dan...@yesql.se> wrote: > > > On 22 Feb 2024, at 03:45, Melanie Plageman <melanieplage...@gmail.com> > > wrote: > > On Fri, Feb 16, 2024 at 3:41 PM Tomas Vondra > > <tomas.von...@enterprisedb.com> wrote: > >> - I wonder what happens if we lose the data - we know that if people > >> reset statistics for whatever reason (or just lose them because of a > >> crash, or because they're on a replica), bad things happen to > >> autovacuum. What's the (expected) impact on pruning? > > > > This is an important question. Because stats aren't crashsafe, we > > could return very inaccurate numbers for some time/LSN values if we > > crash. I don't actually know what we could do about that. When I use > > the LSNTimeline for the freeze heuristic it is less of an issue > > because the freeze heuristic has a fallback strategy when there aren't > > enough stats to do its calculations. But other users of this > > LSNTimeline will simply get highly inaccurate results (I think?). Is > > there anything we could do about this? It seems bad. > > A complication with this over stats is that we can't recompute this in case of > a crash/corruption issue. The simple solution is to consider this unlogged > data and start fresh at every unclean shutdown, but I have a feeling that > won't > be good enough for basing heuristics on? Yes, I still haven't dealt with this yet. Tomas had a list of suggestions in an upthread email, so I will spend some time thinking about those next. It seems like we might be able to come up with some way of calculating a valid "default" value or "best guess" which could be used whenever there isn't enough data. Though, if we crash and lose some time stream data, we won't know that that data was lost due to a crash so we wouldn't know to use our "best guess" to make up for it. So, maybe I should try and rebuild the stream using some combination of WAL, clog, and commit timestamps? Or perhaps I should do some basic WAL logging just for this data structure. > > Andres had brought up something at some point about, what if the > > database is simply turned off for awhile and then turned back on. Even > > if you cleanly shut down, will there be "gaps" in the timeline? I > > think that could be okay, but it is something to think about. > > The gaps would represent reality, so there is nothing wrong per se with gaps, > but if they inflate the interval of a bucket which in turns impact the > precision of the results then that can be a problem. Yes, actually I added some hacky code to the quick and dirty python simulator I wrote [2] to test out having a big gap with no updates (if there is no db activity so nothing for bgwriter to do or the db is off for a while). And it seemed to basically work fine. > While the bucketing algorithm is a clever algorithm for degrading precision > for > older entries without discarding them, I do fear that we'll risk ending up > with > answers like "somewhere between in the past and even further in the past". > I've been playing around with various compression algorithms for packing the > buckets such that we can retain precision for longer. Since you were aiming > to > work on other patches leading up to the freeze, let's pick this up again once > things calm down. Let me know what you think about the new algorithm. I wonder if for points older than the second to oldest point, we have the function return something like "older than a year ago" instead of guessing. The new method doesn't focus on compressing old data though. > When compiling I got this warning for lsntime_merge_target: > > pgstat_wal.c:242:1: warning: non-void function does not return a value in all > control paths [-Wreturn-type] > } > ^ > 1 warning generated. > > The issue seems to be that the can't-really-happen path is protected with an > Assertion, which is a no-op for production builds. I think we should handle > the error rather than rely on testing catching it (since if it does happen > even > though it can't, it's going to be when it's for sure not tested). Returning > an > invalid array subscript like -1 and testing for it in lsntime_insert, with an > elog(WARNING..), seems enough. > > > - last_snapshot_lsn <= GetLastImportantRecPtr()) > + last_snapshot_lsn <= current_lsn) > I think we should delay extracting the LSN with GetLastImportantRecPtr until > we > know that we need it, to avoid taking locks in this codepath unless needed. > > I've attached a diff with the above suggestions which applies on op of your > patchset. I've implemented these review points in the attached v4. - Melanie [1] https://www.postgresql.org/message-id/CAAKRu_YbbZGz-X_pm2zXJA%2B6A22YYpaWhOjmytqFL1yF_FCv6w%40mail.gmail.com [2] https://gist.github.com/melanieplageman/7400e81bbbd518fe08b4af55a9b632d4
From 1b86460bb25aef99a39034e5bf6be581cdccfb88 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Tue, 5 Dec 2023 07:29:39 -0500 Subject: [PATCH v4 1/6] Record LSN at postmaster startup The insert_lsn at postmaster startup can be used along with PgStartTime as seed values for a timeline mapping LSNs to time. Future commits will add such a structure for LSN <-> time conversions. A start LSN allows for such conversions before even inserting a value into the timeline. The current time and current insert LSN can be used along with PgStartTime and PgStartLSN. This is WIP, as I'm not sure if I did this in the right place. --- src/backend/access/transam/xlog.c | 2 ++ src/backend/postmaster/postmaster.c | 2 ++ src/include/utils/builtins.h | 3 +++ 3 files changed, 7 insertions(+) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 330e058c5f2..6fff1f52084 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -142,6 +142,8 @@ bool XLOG_DEBUG = false; int wal_segment_size = DEFAULT_XLOG_SEG_SIZE; +XLogRecPtr PgStartLSN = InvalidXLogRecPtr; + /* * Number of WAL insertion locks to use. A higher value allows more insertions * to happen concurrently, but adds some CPU overhead to flushing the WAL, diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index bf0241aed0c..f1b60fe6cee 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -117,6 +117,7 @@ #include "storage/proc.h" #include "tcop/backend_startup.h" #include "tcop/tcopprot.h" +#include "utils/builtins.h" #include "utils/datetime.h" #include "utils/memutils.h" #include "utils/pidfile.h" @@ -1345,6 +1346,7 @@ PostmasterMain(int argc, char *argv[]) * Remember postmaster startup time */ PgStartTime = GetCurrentTimestamp(); + PgStartLSN = GetXLogInsertRecPtr(); /* * Report postmaster status in the postmaster.pid file, to allow pg_ctl to diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 359c570f23e..16a7a058bc7 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -17,6 +17,7 @@ #include "fmgr.h" #include "nodes/nodes.h" #include "utils/fmgrprotos.h" +#include "access/xlogdefs.h" /* Sign + the most decimal digits an 8-byte number could have */ #define MAXINT8LEN 20 @@ -85,6 +86,8 @@ extern void generate_operator_clause(fmStringInfo buf, Oid opoid, const char *rightop, Oid rightoptype); +extern PGDLLIMPORT XLogRecPtr PgStartLSN; + /* varchar.c */ extern int bpchartruelen(char *s, int len); -- 2.34.1
From 661d8f2db88a510efbbd7c19f0af13ee75416967 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Wed, 27 Dec 2023 16:32:40 -0500 Subject: [PATCH v4 4/6] Bgwriter maintains global LSNTimeStream Insert new LSN, time pairs to the global LSNTimeStream stored in PgStat_WalStats in the background writer's main loop. This ensures that new values are added to the stream in a regular manner. --- src/backend/postmaster/bgwriter.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c index 0f75548759a..02b039cfacf 100644 --- a/src/backend/postmaster/bgwriter.c +++ b/src/backend/postmaster/bgwriter.c @@ -273,6 +273,7 @@ BackgroundWriterMain(char *startup_data, size_t startup_data_len) { TimestampTz timeout = 0; TimestampTz now = GetCurrentTimestamp(); + XLogRecPtr current_lsn; timeout = TimestampTzPlusMilliseconds(last_snapshot_ts, LOG_SNAPSHOT_INTERVAL_MS); @@ -284,11 +285,15 @@ BackgroundWriterMain(char *startup_data, size_t startup_data_len) * start of a record, whereas last_snapshot_lsn points just past * the end of the record. */ - if (now >= timeout && - last_snapshot_lsn <= GetLastImportantRecPtr()) + if (now >= timeout) { - last_snapshot_lsn = LogStandbySnapshot(); - last_snapshot_ts = now; + current_lsn = GetLastImportantRecPtr(); + if (last_snapshot_lsn <= current_lsn) + { + last_snapshot_lsn = LogStandbySnapshot(); + last_snapshot_ts = now; + pgstat_wal_update_lsntime_stream(now, current_lsn); + } } } -- 2.34.1
From 23db440712a45f7c58eb57933df61a9b2e40c6a0 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Wed, 21 Feb 2024 20:06:29 -0500 Subject: [PATCH v4 5/6] Add time <-> LSN translation functions Previous commits added a global LSNTimeStream, maintained by background writer, that allows approximate translations between time and LSNs. Add SQL-callable functions to convert from LSN to time and back and a SQL-callable function returning the entire LSNTimeStream. This could be useful in combination with SQL-callable functions accessing a page LSN to approximate the time of last modification of a page or estimating the LSN consumption rate to moderate maintenance processes and balance system resource utilization. --- doc/src/sgml/monitoring.sgml | 66 +++++++++++++++++++++++++ src/backend/utils/activity/pgstat_wal.c | 56 +++++++++++++++++++++ src/include/catalog/pg_proc.dat | 22 +++++++++ 3 files changed, 144 insertions(+) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index b2ad9b446f3..1f7cd2f2f3b 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -3187,6 +3187,72 @@ description | Waiting for a newly initialized WAL file to reach durable storage </tgroup> </table> + <para> + In addition to these WAL stats, a stream of LSN <-> time pairs is accessible + via the functions shown in <xref linkend="functions-lsn-time-stream"/>. + </para> + + <table id="functions-lsn-time-stream"> + <title>LSN Time Stream Information Functions</title> + <tgroup cols="1"> + <thead> + <row> + <entry role="func_table_entry"><para role="func_signature"> + Function + </para> + <para> + Description + </para></entry> + </row> + </thead> + + <tbody> + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>pg_estimate_lsn_at_time</primary> + </indexterm> + <function>pg_estimate_lsn_at_time</function> ( <type>timestamp with time zone</type> ) + <returnvalue>pg_lsn</returnvalue> + </para> + <para> + Returns the estimated lsn at the provided time. + </para></entry> + </row> + + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>pg_estimate_lsn_at_time</primary> + </indexterm> + <function>pg_estimate_lsn_at_time</function> ( <type>pg_lsn</type> ) + <returnvalue>timestamp with time zone</returnvalue> + </para> + <para> + Returns the estimated time at the provided lsn. + </para></entry> + </row> + + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>pg_lsntime_stream</primary> + </indexterm> + <function>pg_lsntime_stream</function> () + <returnvalue>record</returnvalue> + ( <parameter>time</parameter> <type>timestamp with time zone</type>, + <parameter>lsn</parameter> <type>pg_lsnwith time zone</type>) + </para> + <para> + Returns all of the LSN <-> time pairs in the current LSN time stream. + </para></entry> + </row> + </tbody> + </tgroup> + </table> + + + </sect2> <sect2 id="monitoring-pg-stat-database-view"> diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c index eddd2ec03cb..5d5ab62d4ba 100644 --- a/src/backend/utils/activity/pgstat_wal.c +++ b/src/backend/utils/activity/pgstat_wal.c @@ -19,7 +19,9 @@ #include "access/xlog.h" #include "executor/instrument.h" +#include "funcapi.h" #include "utils/builtins.h" +#include "utils/pg_lsn.h" #include "utils/pgstat_internal.h" #include "utils/timestamp.h" @@ -427,3 +429,57 @@ pgstat_wal_update_lsntime_stream(TimestampTz time, XLogRecPtr lsn) lsntime_insert(&stats_shmem->stats.stream, time, lsn); LWLockRelease(&stats_shmem->lock); } + +PG_FUNCTION_INFO_V1(pg_estimate_lsn_at_time); +PG_FUNCTION_INFO_V1(pg_estimate_time_at_lsn); +PG_FUNCTION_INFO_V1(pg_lsntime_stream); + +Datum +pg_estimate_time_at_lsn(PG_FUNCTION_ARGS) +{ + XLogRecPtr lsn = PG_GETARG_LSN(0); + PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal; + TimestampTz result; + + LWLockAcquire(&stats_shmem->lock, LW_SHARED); + result = estimate_time_at_lsn(&stats_shmem->stats.stream, lsn); + LWLockRelease(&stats_shmem->lock); + + PG_RETURN_TIMESTAMPTZ(result); +} + +Datum +pg_estimate_lsn_at_time(PG_FUNCTION_ARGS) +{ + PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal; + TimestampTz time = PG_GETARG_TIMESTAMPTZ(0); + XLogRecPtr result; + + LWLockAcquire(&stats_shmem->lock, LW_SHARED); + result = estimate_lsn_at_time(&stats_shmem->stats.stream, time); + LWLockRelease(&stats_shmem->lock); + + PG_RETURN_LSN(result); +} + +Datum +pg_lsntime_stream(PG_FUNCTION_ARGS) +{ + ReturnSetInfo *rsinfo; + PgStat_WalStats *stats = pgstat_fetch_stat_wal(); + LSNTimeStream *stream = &stats->stream; + + InitMaterializedSRF(fcinfo, 0); + rsinfo = (ReturnSetInfo *) fcinfo->resultinfo; + for (int i = LSNTIMESTREAM_VOLUME - stream->length; i < LSNTIMESTREAM_VOLUME; i++) + { + Datum values[2] = {0}; + bool nulls[2] = {0}; + + values[0] = stream->data[i].time; + values[1] = stream->data[i].lsn; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, + values, nulls); + } + return (Datum) 0; +} diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 6a5476d3c4c..8ab14b49b2a 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -6342,6 +6342,28 @@ prorettype => 'timestamptz', proargtypes => 'xid', prosrc => 'pg_xact_commit_timestamp' }, +{ oid => '9997', + descr => 'get approximate LSN at a particular point in time', + proname => 'pg_estimate_lsn_at_time', provolatile => 'v', + prorettype => 'pg_lsn', proargtypes => 'timestamptz', + prosrc => 'pg_estimate_lsn_at_time' }, + +{ oid => '9996', + descr => 'get approximate time at a particular LSN', + proname => 'pg_estimate_time_at_lsn', provolatile => 'v', + prorettype => 'timestamptz', proargtypes => 'pg_lsn', + prosrc => 'pg_estimate_time_at_lsn' }, + +{ oid => '9994', + descr => 'print the LSN Time Stream', + proname => 'pg_lsntime_stream', prorows => '64', + proretset => 't', provolatile => 'v', proparallel => 's', + prorettype => 'record', proargtypes => '', + proallargtypes => '{timestamptz,pg_lsn}', + proargmodes => '{o,o}', + proargnames => '{time, lsn}', + prosrc => 'pg_lsntime_stream' }, + { oid => '6168', descr => 'get commit timestamp and replication origin of a transaction', proname => 'pg_xact_commit_timestamp_origin', provolatile => 'v', -- 2.34.1
From 9b8091be4a9e327e39bc4b9a5f4e5a438897c0e1 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Wed, 21 Feb 2024 20:28:27 -0500 Subject: [PATCH v4 3/6] Add LSNTimeStream to PgStat_WalStats Add a globally maintained instance of an LSNTimeStream to PgStat_WalStats and a utility function to insert new values. --- src/backend/utils/activity/pgstat_wal.c | 10 ++++++++++ src/include/pgstat.h | 4 ++++ 2 files changed, 14 insertions(+) diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c index d76ace5cbfc..eddd2ec03cb 100644 --- a/src/backend/utils/activity/pgstat_wal.c +++ b/src/backend/utils/activity/pgstat_wal.c @@ -417,3 +417,13 @@ stop: result = (double) (lsn - start.lsn) / lsns_elapsed * time_elapsed + start.time; return Max(result, 0); } + +void +pgstat_wal_update_lsntime_stream(TimestampTz time, XLogRecPtr lsn) +{ + PgStatShared_Wal *stats_shmem = &pgStatLocal.shmem->wal; + + LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE); + lsntime_insert(&stats_shmem->stats.stream, time, lsn); + LWLockRelease(&stats_shmem->lock); +} diff --git a/src/include/pgstat.h b/src/include/pgstat.h index af348be839c..773e3cd5003 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -470,6 +470,7 @@ typedef struct PgStat_WalStats PgStat_Counter wal_sync; PgStat_Counter wal_write_time; PgStat_Counter wal_sync_time; + LSNTimeStream stream; TimestampTz stat_reset_timestamp; } PgStat_WalStats; @@ -752,6 +753,9 @@ extern void pgstat_execute_transactional_drops(int ndrops, struct xl_xact_stats_ extern void pgstat_report_wal(bool force); extern PgStat_WalStats *pgstat_fetch_stat_wal(void); +/* Helpers for maintaining the LSNTimeStream */ +extern void pgstat_wal_update_lsntime_stream(TimestampTz time, XLogRecPtr lsn); + /* * Variables in pgstat.c -- 2.34.1
From 0b2ab6bb6507cec869f6f55a78016d8c446a7b2f Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Wed, 27 Dec 2023 16:40:27 -0500 Subject: [PATCH v4 2/6] Add LSNTimeStream for converting LSN <-> time Add a new structure, LSNTimeStream, consisting of LSNTimes -- each an LSN, time pair. The LSNTimeStream is fixed size, so when a new LSNTime is inserted to a full LSNTimeStream, an LSNTime is dropped and the new LSNTime is inserted. We drop the LSNTime whose absence would cause the least error when interpolating between its adjoining points. LSN <-> time conversions can be done using linear interpolation with two LSNTimes on the LSNTimeStream. This commit does not add a global instance of LSNTimeStream. It adds the structures and functions needed to maintain and access such a stream. --- src/backend/utils/activity/pgstat_wal.c | 233 ++++++++++++++++++++++++ src/include/pgstat.h | 32 ++++ src/tools/pgindent/typedefs.list | 2 + 3 files changed, 267 insertions(+) diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c index 0e374f133a9..d76ace5cbfc 100644 --- a/src/backend/utils/activity/pgstat_wal.c +++ b/src/backend/utils/activity/pgstat_wal.c @@ -17,8 +17,11 @@ #include "postgres.h" +#include "access/xlog.h" #include "executor/instrument.h" +#include "utils/builtins.h" #include "utils/pgstat_internal.h" +#include "utils/timestamp.h" PgStat_PendingWalStats PendingWalStats = {0}; @@ -32,6 +35,11 @@ PgStat_PendingWalStats PendingWalStats = {0}; static WalUsage prevWalUsage; +static void lsntime_insert(LSNTimeStream *stream, TimestampTz time, XLogRecPtr lsn); + +XLogRecPtr estimate_lsn_at_time(const LSNTimeStream *stream, TimestampTz time); +TimestampTz estimate_time_at_lsn(const LSNTimeStream *stream, XLogRecPtr lsn); + /* * Calculate how much WAL usage counters have increased and update * shared WAL and IO statistics. @@ -184,3 +192,228 @@ pgstat_wal_snapshot_cb(void) sizeof(pgStatLocal.snapshot.wal)); LWLockRelease(&stats_shmem->lock); } + +/* + * Given three LSNTimes, calculate the area of the triangle they form were they + * plotted with time on the X axis and LSN on the Y axis. + */ +static int +lsn_ts_calculate_error_area(LSNTime *left, LSNTime *mid, LSNTime *right) +{ + int rectangle_all = (right->time - left->time) * (right->lsn - left->lsn); + int triangle1 = rectangle_all / 2; + int triangle2 = (mid->lsn - left->lsn) * (mid->time - left->time) / 2; + int triangle3 = (right->lsn - mid->lsn) * (right->time - mid->time) / 2; + int rectangle_part = (right->lsn - mid->lsn) * (mid->time - left->time); + + return rectangle_all - triangle1 - triangle2 - triangle3 - rectangle_part; +} + +/* + * Determine which LSNTime to drop from a full LSNTimeStream. Once the LSNTime + * is dropped, points between it and either of its adjacent LSNTimes will be + * interpolated between those two LSNTimes instead. To keep the LSNTimeStream + * as accurate as possible, drop the LSNTime whose absence would have the least + * impact on future interpolations. + * + * We determine the error that would be introduced by dropping a point on the + * stream by calculating the area of the triangle formed by the LSNTime and its + * adjacent LSNTimes. We do this for each LSNTime in the stream (except for the + * first and last LSNTimes) and choose the LSNTime with the smallest error + * (area). We avoid extrapolation by never dropping the first or last points. + */ +static int +lsntime_to_drop(LSNTimeStream *stream) +{ + int min_area = INT_MAX; + int target_point = stream->length - 1; + + /* Don't drop points if free space available */ + Assert(stream->length == LSNTIMESTREAM_VOLUME); + + for (int i = stream->length - 1; i-- > 0;) + { + LSNTime *left = &stream->data[i - 1]; + LSNTime *mid = &stream->data[i]; + LSNTime *right = &stream->data[i + 1]; + int area = lsn_ts_calculate_error_area(left, mid, right); + + if (abs(area) < abs(min_area)) + { + min_area = area; + target_point = i; + } + } + + return target_point; +} + +/* + * Insert a new LSNTime into the LSNTimeStream in the first available element, + * or, if there are no empty elements, drop an LSNTime from the stream, move + * all LSNTimes down and insert the new LSNTime into the element at index 0. + */ +void +lsntime_insert(LSNTimeStream *stream, TimestampTz time, + XLogRecPtr lsn) +{ + int drop; + LSNTime entrant = {.lsn = lsn,.time = time}; + + if (stream->length < LSNTIMESTREAM_VOLUME) + { + /* + * The new entry should exceed the most recent entry to ensure time + * moves forward on the stream. + */ + Assert(stream->length == 0 || + (lsn >= stream->data[LSNTIMESTREAM_VOLUME - stream->length].lsn && + time >= stream->data[LSNTIMESTREAM_VOLUME - stream->length].time)); + + /* + * If there are unfilled elements in the stream, insert the passed-in + * LSNTime into the tail of the array. + */ + stream->length++; + stream->data[LSNTIMESTREAM_VOLUME - stream->length] = entrant; + return; + } + + drop = lsntime_to_drop(stream); + if (drop < 0 || drop >= stream->length) + { + elog(WARNING, "Unable to insert LSNTime to LSNTimeStream. Drop failed."); + return; + } + + /* + * Drop the LSNTime at index drop by copying the array from drop - 1 to + * drop + */ + memmove(&stream->data[1], &stream->data[0], sizeof(LSNTime) * drop); + stream->data[0] = entrant; +} + +/* + * Translate time to a LSN using the provided stream. The stream will not + * be modified. + */ +XLogRecPtr +estimate_lsn_at_time(const LSNTimeStream *stream, TimestampTz time) +{ + XLogRecPtr result; + int64 time_elapsed, + lsns_elapsed; + LSNTime start = {.time = PgStartTime,.lsn = PgStartLSN}; + LSNTime end = {.time = GetCurrentTimestamp(),.lsn = GetXLogInsertRecPtr()}; + + /* + * If the provided time is before DB startup, the best we can do is return + * the start LSN. + */ + if (time < start.time) + return start.lsn; + + /* + * If the provided time is after now, the current LSN is our best + * estimate. + */ + if (time >= end.time) + return end.lsn; + + /* + * Loop through the stream. Stop at the first LSNTime earlier than our + * target time. This LSNTime will be our interpolation start point. If + * there's an LSNTime later than that, then that will be our interpolation + * end point. + */ + for (int i = LSNTIMESTREAM_VOLUME - stream->length; i < LSNTIMESTREAM_VOLUME; i++) + { + if (stream->data[i].time > time) + continue; + + start = stream->data[i]; + if (i > 0) + end = stream->data[i - 1]; + goto stop; + } + + /* + * If we exhausted the stream, then use its earliest LSNTime as our + * interpolation end point. + */ + if (stream->length > 0) + end = stream->data[stream->length - 1]; + +stop: + Assert(end.time > start.time); + Assert(end.lsn > start.lsn); + time_elapsed = end.time - start.time; + Assert(time_elapsed != 0); + lsns_elapsed = end.lsn - start.lsn; + Assert(lsns_elapsed != 0); + result = (double) (time - start.time) / time_elapsed * lsns_elapsed + start.lsn; + return Max(result, 0); +} + +/* + * Translate lsn to a time using the provided stream. The stream will not + * be modified. + */ +TimestampTz +estimate_time_at_lsn(const LSNTimeStream *stream, XLogRecPtr lsn) +{ + int64 time_elapsed, + lsns_elapsed; + TimestampTz result; + LSNTime start = {.time = PgStartTime,.lsn = PgStartLSN}; + LSNTime end = {.time = GetCurrentTimestamp(),.lsn = GetXLogInsertRecPtr()}; + + /* + * If the LSN is before DB startup, the best we can do is return that + * time. + */ + if (lsn <= start.lsn) + return start.time; + + /* + * If the target LSN is after the current insert LSN, the current time is + * our best estimate. + */ + if (lsn >= end.lsn) + return end.time; + + /* + * Loop through the stream. Stop at the first LSNTime earlier than our + * target LSN. This LSNTime will be our interpolation start point. If + * there's an LSNTime later than that, then that will be our interpolation + * end point. + */ + for (int i = LSNTIMESTREAM_VOLUME - stream->length; i < LSNTIMESTREAM_VOLUME; i++) + { + if (stream->data[i].lsn > lsn) + continue; + + start = stream->data[i]; + if (i > 0) + end = stream->data[i - 1]; + goto stop; + } + + /* + * If we exhausted the stream, then use its earliest LSNTime as our + * interpolation end point. + */ + if (stream->length > 0) + end = stream->data[stream->length - 1]; + +stop: + Assert(end.time > start.time); + Assert(end.lsn > start.lsn); + time_elapsed = end.time - start.time; + Assert(time_elapsed != 0); + lsns_elapsed = end.lsn - start.lsn; + Assert(lsns_elapsed != 0); + result = (double) (lsn - start.lsn) / lsns_elapsed * time_elapsed + start.time; + return Max(result, 0); +} diff --git a/src/include/pgstat.h b/src/include/pgstat.h index 2136239710e..af348be839c 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -11,6 +11,7 @@ #ifndef PGSTAT_H #define PGSTAT_H +#include "access/xlogdefs.h" #include "datatype/timestamp.h" #include "portability/instr_time.h" #include "postmaster/pgarch.h" /* for MAX_XFN_CHARS */ @@ -428,6 +429,37 @@ typedef struct PgStat_StatTabEntry PgStat_Counter autoanalyze_count; } PgStat_StatTabEntry; +/* + * The elements of an LSNTimeStream. Each LSNTime represents one or more time, + * LSN pairs. The LSN is typically the insert LSN recorded at the time. + */ +typedef struct LSNTime +{ + TimestampTz time; + XLogRecPtr lsn; +} LSNTime; + +#define LSNTIMESTREAM_VOLUME 64 + +/* + * An LSN time stream is an array consisting of LSNTimes from most to least + * recent. The array is filled from end to start before the contents of any + * elements are merged. Once the LSNTimeStream length == volume (the array is + * full), an LSNTime is dropped, the new LSNTime is added at index 0, and the + * intervening LSNTimes are moved down by one. + * + * When dropping an LSNTime, we attempt to pick the member which would + * introduce the least error into the stream. See lsntime_to_drop() for more + * details. + * + * Use the stream for LSN <-> time conversion using linear interpolation. + */ +typedef struct LSNTimeStream +{ + int length; + LSNTime data[LSNTIMESTREAM_VOLUME]; +} LSNTimeStream; + typedef struct PgStat_WalStats { PgStat_Counter wal_records; diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index 61ad417cde6..4c065e24ba7 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -1584,6 +1584,8 @@ LogicalTapeSet LsnReadQueue LsnReadQueueNextFun LsnReadQueueNextStatus +LSNTime +LSNTimeStream LtreeGistOptions LtreeSignature MAGIC -- 2.34.1