On Wed, Apr 15, 2020 at 2:21 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Mon, Apr 13, 2020 at 2:58 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> > On Fri, Apr 3, 2020 at 2:22 PM Peter Geoghegan <p...@bowt.ie> wrote:
> > > I'm thinking of a version of "snapshot too old" that amounts to a
> > > statement timeout that gets applied for xmin horizon type purposes in
> > > the conventional way, while only showing an error to the client if and
> > > when they access literally any buffer (though not when the relation is
> > > a system catalog). Is it possible that something along those lines is
> > > appreciably better than nothing to users? If it is, and if we can find
> > > a way to manage the transition, then maybe we could tolerate
> > > supporting this greatly simplified implementation of "snapshot too
> > > old".

I rebased that patch and fleshed it out just a bit more.  Warning:
experiment grade, incomplet, inkorrect, but I think it demonstrates
the main elements of Peter's idea from last year.

For READ COMMITTED, the user experience is much like a statement
timeout, except that it can keep doing stuff that doesn't read
non-catalog data.  Trivial example: pg_sleep(10) happily completes
with old_snapshot_threshold=5s, as do queries that materialise all
their input data in time, and yet your xmin is zapped.  For REPEATABLE
READ it's obviously tied to your first query, and produces "snapshot
too old" if you repeatedly SELECT from a little table and your time
runs out.

In this version I put the check into the heapam visibility + vismap
checks, instead of in the buffer access code.  The reason is that the
lower level buffer access routines don't have a snapshot, but if you
push the check down to buffer access and just check the "oldest"
snapshot (definition in this patch, not in master), then you lose some
potential granularity with different cursors.  If you try to put it at
a higher level in places that have a snapshot and access a buffer, you
run into the problem of being uncertain that you've covered all the
bases.  But I may be underthinking that.

Quite a few unresolved questions about catalog and toast snapshots and
other special stuff, as well as the question of whether it's actually
useful or the overheads can be made cheap enough.

> Hmm.  I suppose it must be possible to put the LSN check back: if
> (snapshot->too_old && PageGetLSN(page) > snapshot->lsn) ereport(...).
> Then the granularity would be the same as today -- block level -- but
> the complexity is transferred from the pruning side (has to deal with
> xid time map) to the snapshot-owning side (has to deal with timers,
> CFI() and make sure all snapshots are registered).  Maybe not a great
> deal, and maybe not easier than fixing the existing bugs.

It is a shame to lose the current LSN-based logic; it's really rather
clever (except for being broken).

> One problem is all the new setitimer() syscalls.  I feel like that
> could be improved, as could statement_timeout, by letting existing
> timers run rather than repeatedly rescheduling eagerly, so that eg a 1
> minute timeout never gets rescheduled more than once per minute.  I
> haven't looked into that, but I guess it's no worse than the existing
> implement's overheads anyway.

At least that problem was fixed, by commit 09cf1d52.  (Not entirely
sure why I got away with not reenabling the timer between queries, but
I didn't look very hard).
From 656af1863e841664d4c342bb7c783834912d3906 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Wed, 15 Apr 2020 12:35:51 +1200
Subject: [PATCH v2] Implement snapshot_too_old using a timer.

Remove the previous implementation of "snapshot too old", and replace it
with a very conservative implementation based only on time passing.

Use a timer to mark active and registered snapshots as too old and
potentially advance the backend's xmin.  Snapshots that reach this state
can no longer be used to read buffers, because data they need to see may
be vacuumed away.

XXX Incomplete experimental code!
---
 contrib/Makefile                              |   1 -
 contrib/bloom/blscan.c                        |   1 -
 contrib/old_snapshot/Makefile                 |  21 -
 contrib/old_snapshot/old_snapshot--1.0.sql    |  14 -
 contrib/old_snapshot/old_snapshot.control     |   5 -
 contrib/old_snapshot/time_mapping.c           | 160 ----
 src/backend/access/brin/brin_revmap.c         |   7 -
 src/backend/access/common/toast_internals.c   |   5 +-
 src/backend/access/gin/ginbtree.c             |   2 -
 src/backend/access/gin/ginget.c               |   4 -
 src/backend/access/gist/gistget.c             |   1 -
 src/backend/access/hash/hashsearch.c          |   6 -
 src/backend/access/heap/heapam.c              |  11 -
 src/backend/access/heap/heapam_visibility.c   |   2 +
 src/backend/access/heap/pruneheap.c           |  96 +--
 src/backend/access/heap/vacuumlazy.c          |   4 +-
 src/backend/access/nbtree/nbtsearch.c         |   9 -
 src/backend/access/spgist/spgscan.c           |   1 -
 src/backend/catalog/index.c                   |  15 +-
 src/backend/commands/vacuum.c                 |  19 -
 src/backend/executor/nodeBitmapHeapscan.c     |   6 +
 src/backend/executor/nodeIndexonlyscan.c      |   1 +
 src/backend/storage/buffer/bufmgr.c           |  17 -
 src/backend/storage/ipc/ipci.c                |   2 -
 src/backend/storage/ipc/procarray.c           |  36 +-
 src/backend/tcop/postgres.c                   |   6 +
 src/backend/utils/init/globals.c              |   1 +
 src/backend/utils/init/postinit.c             |  10 +
 src/backend/utils/misc/guc.c                  |   4 +-
 src/backend/utils/time/snapmgr.c              | 715 ++++++------------
 src/include/access/heapam.h                   |   2 -
 src/include/miscadmin.h                       |   1 +
 src/include/storage/bufmgr.h                  |  50 --
 src/include/utils/snapmgr.h                   |  54 +-
 src/include/utils/snapshot.h                  |  43 +-
 src/include/utils/timeout.h                   |   1 +
 src/test/modules/Makefile                     |   1 -
 src/test/modules/snapshot_too_old/.gitignore  |   3 -
 src/test/modules/snapshot_too_old/Makefile    |  28 -
 .../expected/sto_using_cursor.out             |  73 --
 .../expected/sto_using_hash_index.out         |  15 -
 .../expected/sto_using_select.out             |  55 --
 .../specs/sto_using_cursor.spec               |  37 -
 .../specs/sto_using_hash_index.spec           |  31 -
 .../specs/sto_using_select.spec               |  36 -
 src/test/modules/snapshot_too_old/sto.conf    |   2 -
 46 files changed, 323 insertions(+), 1291 deletions(-)
 delete mode 100644 contrib/old_snapshot/Makefile
 delete mode 100644 contrib/old_snapshot/old_snapshot--1.0.sql
 delete mode 100644 contrib/old_snapshot/old_snapshot.control
 delete mode 100644 contrib/old_snapshot/time_mapping.c
 delete mode 100644 src/test/modules/snapshot_too_old/.gitignore
 delete mode 100644 src/test/modules/snapshot_too_old/Makefile
 delete mode 100644 src/test/modules/snapshot_too_old/expected/sto_using_cursor.out
 delete mode 100644 src/test/modules/snapshot_too_old/expected/sto_using_hash_index.out
 delete mode 100644 src/test/modules/snapshot_too_old/expected/sto_using_select.out
 delete mode 100644 src/test/modules/snapshot_too_old/specs/sto_using_cursor.spec
 delete mode 100644 src/test/modules/snapshot_too_old/specs/sto_using_hash_index.spec
 delete mode 100644 src/test/modules/snapshot_too_old/specs/sto_using_select.spec
 delete mode 100644 src/test/modules/snapshot_too_old/sto.conf

diff --git a/contrib/Makefile b/contrib/Makefile
index f27e458482..1b0506601e 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -27,7 +27,6 @@ SUBDIRS = \
 		lo		\
 		ltree		\
 		oid2name	\
-		old_snapshot	\
 		pageinspect	\
 		passwordcheck	\
 		pg_buffercache	\
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index 6ae3710d9f..092015181b 100644
--- a/contrib/bloom/blscan.c
+++ b/contrib/bloom/blscan.c
@@ -132,7 +132,6 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
 
 		LockBuffer(buffer, BUFFER_LOCK_SHARE);
 		page = BufferGetPage(buffer);
-		TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
 
 		if (!PageIsNew(page) && !BloomPageIsDeleted(page))
 		{
diff --git a/contrib/old_snapshot/Makefile b/contrib/old_snapshot/Makefile
deleted file mode 100644
index adb557532f..0000000000
--- a/contrib/old_snapshot/Makefile
+++ /dev/null
@@ -1,21 +0,0 @@
-# contrib/old_snapshot/Makefile
-
-MODULE_big = old_snapshot
-OBJS = \
-	$(WIN32RES) \
-	time_mapping.o
-
-EXTENSION = old_snapshot
-DATA = old_snapshot--1.0.sql
-PGFILEDESC = "old_snapshot - utilities in support of old_snapshot_threshold"
-
-ifdef USE_PGXS
-PG_CONFIG = pg_config
-PGXS := $(shell $(PG_CONFIG) --pgxs)
-include $(PGXS)
-else
-subdir = contrib/old_snapshot
-top_builddir = ../..
-include $(top_builddir)/src/Makefile.global
-include $(top_srcdir)/contrib/contrib-global.mk
-endif
diff --git a/contrib/old_snapshot/old_snapshot--1.0.sql b/contrib/old_snapshot/old_snapshot--1.0.sql
deleted file mode 100644
index 9ebb8829e3..0000000000
--- a/contrib/old_snapshot/old_snapshot--1.0.sql
+++ /dev/null
@@ -1,14 +0,0 @@
-/* contrib/old_snapshot/old_snapshot--1.0.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION old_snapshot" to load this file. \quit
-
--- Show visibility map and page-level visibility information for each block.
-CREATE FUNCTION pg_old_snapshot_time_mapping(array_offset OUT int4,
-											 end_timestamp OUT timestamptz,
-											 newest_xmin OUT xid)
-RETURNS SETOF record
-AS 'MODULE_PATHNAME', 'pg_old_snapshot_time_mapping'
-LANGUAGE C STRICT;
-
--- XXX. Do we want REVOKE commands here?
diff --git a/contrib/old_snapshot/old_snapshot.control b/contrib/old_snapshot/old_snapshot.control
deleted file mode 100644
index 491eec536c..0000000000
--- a/contrib/old_snapshot/old_snapshot.control
+++ /dev/null
@@ -1,5 +0,0 @@
-# old_snapshot extension
-comment = 'utilities in support of old_snapshot_threshold'
-default_version = '1.0'
-module_pathname = '$libdir/old_snapshot'
-relocatable = true
diff --git a/contrib/old_snapshot/time_mapping.c b/contrib/old_snapshot/time_mapping.c
deleted file mode 100644
index 02acf77b1a..0000000000
--- a/contrib/old_snapshot/time_mapping.c
+++ /dev/null
@@ -1,160 +0,0 @@
-/*-------------------------------------------------------------------------
- *
- * time_mapping.c
- *	  time to XID mapping information
- *
- * Copyright (c) 2020-2021, PostgreSQL Global Development Group
- *
- *	  contrib/old_snapshot/time_mapping.c
- *-------------------------------------------------------------------------
- */
-#include "postgres.h"
-
-#include "funcapi.h"
-#include "storage/lwlock.h"
-#include "utils/old_snapshot.h"
-#include "utils/snapmgr.h"
-#include "utils/timestamp.h"
-
-/*
- * Backend-private copy of the information from oldSnapshotControl which relates
- * to the time to XID mapping, plus an index so that we can iterate.
- *
- * Note that the length of the xid_by_minute array is given by
- * OLD_SNAPSHOT_TIME_MAP_ENTRIES (which is not a compile-time constant).
- */
-typedef struct
-{
-	int			current_index;
-	int			head_offset;
-	TimestampTz head_timestamp;
-	int			count_used;
-	TransactionId xid_by_minute[FLEXIBLE_ARRAY_MEMBER];
-} OldSnapshotTimeMapping;
-
-#define NUM_TIME_MAPPING_COLUMNS 3
-
-PG_MODULE_MAGIC;
-PG_FUNCTION_INFO_V1(pg_old_snapshot_time_mapping);
-
-static OldSnapshotTimeMapping *GetOldSnapshotTimeMapping(void);
-static TupleDesc MakeOldSnapshotTimeMappingTupleDesc(void);
-static HeapTuple MakeOldSnapshotTimeMappingTuple(TupleDesc tupdesc,
-												 OldSnapshotTimeMapping *mapping);
-
-/*
- * SQL-callable set-returning function.
- */
-Datum
-pg_old_snapshot_time_mapping(PG_FUNCTION_ARGS)
-{
-	FuncCallContext *funcctx;
-	OldSnapshotTimeMapping *mapping;
-
-	if (SRF_IS_FIRSTCALL())
-	{
-		MemoryContext oldcontext;
-
-		funcctx = SRF_FIRSTCALL_INIT();
-		oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
-		mapping = GetOldSnapshotTimeMapping();
-		funcctx->user_fctx = mapping;
-		funcctx->tuple_desc = MakeOldSnapshotTimeMappingTupleDesc();
-		MemoryContextSwitchTo(oldcontext);
-	}
-
-	funcctx = SRF_PERCALL_SETUP();
-	mapping = (OldSnapshotTimeMapping *) funcctx->user_fctx;
-
-	while (mapping->current_index < mapping->count_used)
-	{
-		HeapTuple	tuple;
-
-		tuple = MakeOldSnapshotTimeMappingTuple(funcctx->tuple_desc, mapping);
-		++mapping->current_index;
-		SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuple));
-	}
-
-	SRF_RETURN_DONE(funcctx);
-}
-
-/*
- * Get the old snapshot time mapping data from shared memory.
- */
-static OldSnapshotTimeMapping *
-GetOldSnapshotTimeMapping(void)
-{
-	OldSnapshotTimeMapping *mapping;
-
-	mapping = palloc(offsetof(OldSnapshotTimeMapping, xid_by_minute)
-					 + sizeof(TransactionId) * OLD_SNAPSHOT_TIME_MAP_ENTRIES);
-	mapping->current_index = 0;
-
-	LWLockAcquire(OldSnapshotTimeMapLock, LW_SHARED);
-	mapping->head_offset = oldSnapshotControl->head_offset;
-	mapping->head_timestamp = oldSnapshotControl->head_timestamp;
-	mapping->count_used = oldSnapshotControl->count_used;
-	for (int i = 0; i < OLD_SNAPSHOT_TIME_MAP_ENTRIES; ++i)
-		mapping->xid_by_minute[i] = oldSnapshotControl->xid_by_minute[i];
-	LWLockRelease(OldSnapshotTimeMapLock);
-
-	return mapping;
-}
-
-/*
- * Build a tuple descriptor for the pg_old_snapshot_time_mapping() SRF.
- */
-static TupleDesc
-MakeOldSnapshotTimeMappingTupleDesc(void)
-{
-	TupleDesc	tupdesc;
-
-	tupdesc = CreateTemplateTupleDesc(NUM_TIME_MAPPING_COLUMNS);
-
-	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "array_offset",
-					   INT4OID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "end_timestamp",
-					   TIMESTAMPTZOID, -1, 0);
-	TupleDescInitEntry(tupdesc, (AttrNumber) 3, "newest_xmin",
-					   XIDOID, -1, 0);
-
-	return BlessTupleDesc(tupdesc);
-}
-
-/*
- * Convert one entry from the old snapshot time mapping to a HeapTuple.
- */
-static HeapTuple
-MakeOldSnapshotTimeMappingTuple(TupleDesc tupdesc, OldSnapshotTimeMapping *mapping)
-{
-	Datum		values[NUM_TIME_MAPPING_COLUMNS];
-	bool		nulls[NUM_TIME_MAPPING_COLUMNS];
-	int			array_position;
-	TimestampTz timestamp;
-
-	/*
-	 * Figure out the array position corresponding to the current index.
-	 *
-	 * Index 0 means the oldest entry in the mapping, which is stored at
-	 * mapping->head_offset. Index 1 means the next-oldest entry, which is a
-	 * the following index, and so on. We wrap around when we reach the end of
-	 * the array.
-	 */
-	array_position = (mapping->head_offset + mapping->current_index)
-		% OLD_SNAPSHOT_TIME_MAP_ENTRIES;
-
-	/*
-	 * No explicit timestamp is stored for any entry other than the oldest
-	 * one, but each entry corresponds to 1-minute period, so we can just add.
-	 */
-	timestamp = TimestampTzPlusMilliseconds(mapping->head_timestamp,
-											mapping->current_index * 60000);
-
-	/* Initialize nulls and values arrays. */
-	memset(nulls, 0, sizeof(nulls));
-	values[0] = Int32GetDatum(array_position);
-	values[1] = TimestampTzGetDatum(timestamp);
-	values[2] = TransactionIdGetDatum(mapping->xid_by_minute[array_position]);
-
-	return heap_form_tuple(tupdesc, values, nulls);
-}
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index c574c8a06e..99ee9e6b7d 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -79,7 +79,6 @@ brinRevmapInitialize(Relation idxrel, BlockNumber *pagesPerRange,
 	meta = ReadBuffer(idxrel, BRIN_METAPAGE_BLKNO);
 	LockBuffer(meta, BUFFER_LOCK_SHARE);
 	page = BufferGetPage(meta);
-	TestForOldSnapshot(snapshot, idxrel, page);
 	metadata = (BrinMetaPageData *) PageGetContents(page);
 
 	revmap = palloc(sizeof(BrinRevmap));
@@ -277,7 +276,6 @@ brinGetTupleForHeapBlock(BrinRevmap *revmap, BlockNumber heapBlk,
 		}
 		LockBuffer(*buf, mode);
 		page = BufferGetPage(*buf);
-		TestForOldSnapshot(snapshot, idxRel, page);
 
 		/* If we land on a revmap page, start over */
 		if (BRIN_IS_REGULAR_PAGE(page))
@@ -372,11 +370,6 @@ brinRevmapDesummarizeRange(Relation idxrel, BlockNumber heapBlk)
 	LockBuffer(regBuf, BUFFER_LOCK_EXCLUSIVE);
 	regPg = BufferGetPage(regBuf);
 
-	/*
-	 * We're only removing data, not reading it, so there's no need to
-	 * TestForOldSnapshot here.
-	 */
-
 	/* if this is no longer a regular page, tell caller to start over */
 	if (!BRIN_IS_REGULAR_PAGE(regPg))
 	{
diff --git a/src/backend/access/common/toast_internals.c b/src/backend/access/common/toast_internals.c
index c7b9ade574..e04a03bc04 100644
--- a/src/backend/access/common/toast_internals.c
+++ b/src/backend/access/common/toast_internals.c
@@ -632,8 +632,7 @@ toast_close_indexes(Relation *toastidxs, int num_indexes, LOCKMODE lock)
  *
  *	Initialize an appropriate TOAST snapshot.  We must use an MVCC snapshot
  *	to initialize the TOAST snapshot; since we don't know which one to use,
- *	just use the oldest one.  This is safe: at worst, we will get a "snapshot
- *	too old" error that might have been avoided otherwise.
+ *	just use the oldest one.
  */
 void
 init_toast_snapshot(Snapshot toast_snapshot)
@@ -656,5 +655,5 @@ init_toast_snapshot(Snapshot toast_snapshot)
 	if (snapshot == NULL)
 		elog(ERROR, "cannot fetch toast data without an active snapshot");
 
-	InitToastSnapshot(*toast_snapshot, snapshot->lsn, snapshot->whenTaken);
+	InitToastSnapshot(*toast_snapshot);
 }
diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 482cf10877..4fcafd6e04 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -100,7 +100,6 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 		stack->off = InvalidOffsetNumber;
 
 		page = BufferGetPage(stack->buffer);
-		TestForOldSnapshot(snapshot, btree->index, page);
 
 		access = ginTraverseLock(stack->buffer, searchMode);
 
@@ -127,7 +126,6 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 			stack->buffer = ginStepRight(stack->buffer, btree->index, access);
 			stack->blkno = rightlink;
 			page = BufferGetPage(stack->buffer);
-			TestForOldSnapshot(snapshot, btree->index, page);
 
 			if (!searchMode && GinPageIsIncompleteSplit(page))
 				ginFinishSplit(btree, stack, false, NULL);
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index 03191e016c..bca5c5eda6 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -155,7 +155,6 @@ collectMatchBitmap(GinBtreeData *btree, GinBtreeStack *stack,
 			return true;
 
 		page = BufferGetPage(stack->buffer);
-		TestForOldSnapshot(snapshot, btree->index, page);
 		itup = (IndexTuple) PageGetItem(page, PageGetItemId(page, stack->off));
 
 		/*
@@ -1457,7 +1456,6 @@ scanGetCandidate(IndexScanDesc scan, pendingPosition *pos)
 	for (;;)
 	{
 		page = BufferGetPage(pos->pendingBuffer);
-		TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
 
 		maxoff = PageGetMaxOffsetNumber(page);
 		if (pos->firstOffset > maxoff)
@@ -1638,7 +1636,6 @@ collectMatchesForHeapRow(IndexScanDesc scan, pendingPosition *pos)
 			   sizeof(bool) * (pos->lastOffset - pos->firstOffset));
 
 		page = BufferGetPage(pos->pendingBuffer);
-		TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
 
 		for (i = 0; i < so->nkeys; i++)
 		{
@@ -1841,7 +1838,6 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
 
 	LockBuffer(metabuffer, GIN_SHARE);
 	page = BufferGetPage(metabuffer);
-	TestForOldSnapshot(scan->xs_snapshot, scan->indexRelation, page);
 	blkno = GinPageGetMeta(page)->head;
 
 	/*
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index c8f7e781c6..9575288df5 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -346,7 +346,6 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem,
 	PredicateLockPage(r, BufferGetBlockNumber(buffer), scan->xs_snapshot);
 	gistcheckpage(scan->indexRelation, buffer);
 	page = BufferGetPage(buffer);
-	TestForOldSnapshot(scan->xs_snapshot, r, page);
 	opaque = GistPageGetOpaque(page);
 
 	/*
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index 2ffa28e8f7..79bfe73f38 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -71,7 +71,6 @@ _hash_next(IndexScanDesc scan, ScanDirection dir)
 			if (BlockNumberIsValid(blkno))
 			{
 				buf = _hash_getbuf(rel, blkno, HASH_READ, LH_OVERFLOW_PAGE);
-				TestForOldSnapshot(scan->xs_snapshot, rel, BufferGetPage(buf));
 				if (!_hash_readpage(scan, &buf, dir))
 					end_of_scan = true;
 			}
@@ -91,7 +90,6 @@ _hash_next(IndexScanDesc scan, ScanDirection dir)
 			{
 				buf = _hash_getbuf(rel, blkno, HASH_READ,
 								   LH_BUCKET_PAGE | LH_OVERFLOW_PAGE);
-				TestForOldSnapshot(scan->xs_snapshot, rel, BufferGetPage(buf));
 
 				/*
 				 * We always maintain the pin on bucket page for whole scan
@@ -186,7 +184,6 @@ _hash_readnext(IndexScanDesc scan,
 	if (block_found)
 	{
 		*pagep = BufferGetPage(*bufp);
-		TestForOldSnapshot(scan->xs_snapshot, rel, *pagep);
 		*opaquep = (HashPageOpaque) PageGetSpecialPointer(*pagep);
 	}
 }
@@ -232,7 +229,6 @@ _hash_readprev(IndexScanDesc scan,
 		*bufp = _hash_getbuf(rel, blkno, HASH_READ,
 							 LH_BUCKET_PAGE | LH_OVERFLOW_PAGE);
 		*pagep = BufferGetPage(*bufp);
-		TestForOldSnapshot(scan->xs_snapshot, rel, *pagep);
 		*opaquep = (HashPageOpaque) PageGetSpecialPointer(*pagep);
 
 		/*
@@ -351,7 +347,6 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
 	buf = _hash_getbucketbuf_from_hashkey(rel, hashkey, HASH_READ, NULL);
 	PredicateLockPage(rel, BufferGetBlockNumber(buf), scan->xs_snapshot);
 	page = BufferGetPage(buf);
-	TestForOldSnapshot(scan->xs_snapshot, rel, page);
 	opaque = (HashPageOpaque) PageGetSpecialPointer(page);
 	bucket = opaque->hasho_bucket;
 
@@ -387,7 +382,6 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
 		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
 
 		old_buf = _hash_getbuf(rel, old_blkno, HASH_READ, LH_BUCKET_PAGE);
-		TestForOldSnapshot(scan->xs_snapshot, rel, BufferGetPage(old_buf));
 
 		/*
 		 * remember the split bucket buffer so as to use it later for
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2433998f39..992c339938 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -420,7 +420,6 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
 	LockBuffer(buffer, BUFFER_LOCK_SHARE);
 
 	dp = BufferGetPage(buffer);
-	TestForOldSnapshot(snapshot, scan->rs_base.rs_rd, dp);
 	lines = PageGetMaxOffsetNumber(dp);
 	ntup = 0;
 
@@ -573,7 +572,6 @@ heapgettup(HeapScanDesc scan,
 		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
 
 		dp = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(snapshot, scan->rs_base.rs_rd, dp);
 		lines = PageGetMaxOffsetNumber(dp);
 		/* page and lineoff now reference the physically next tid */
 
@@ -625,7 +623,6 @@ heapgettup(HeapScanDesc scan,
 		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
 
 		dp = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(snapshot, scan->rs_base.rs_rd, dp);
 		lines = PageGetMaxOffsetNumber(dp);
 
 		if (!scan->rs_inited)
@@ -667,7 +664,6 @@ heapgettup(HeapScanDesc scan,
 
 		/* Since the tuple was previously fetched, needn't lock page here */
 		dp = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(snapshot, scan->rs_base.rs_rd, dp);
 		lineoff = ItemPointerGetOffsetNumber(&(tuple->t_self));
 		lpp = PageGetItemId(dp, lineoff);
 		Assert(ItemIdIsNormal(lpp));
@@ -811,7 +807,6 @@ heapgettup(HeapScanDesc scan,
 		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
 
 		dp = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(snapshot, scan->rs_base.rs_rd, dp);
 		lines = PageGetMaxOffsetNumber((Page) dp);
 		linesleft = lines;
 		if (backward)
@@ -908,7 +903,6 @@ heapgettup_pagemode(HeapScanDesc scan,
 		}
 
 		dp = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, dp);
 		lines = scan->rs_ntuples;
 		/* page and lineindex now reference the next visible tid */
 
@@ -958,7 +952,6 @@ heapgettup_pagemode(HeapScanDesc scan,
 		}
 
 		dp = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, dp);
 		lines = scan->rs_ntuples;
 
 		if (!scan->rs_inited)
@@ -992,7 +985,6 @@ heapgettup_pagemode(HeapScanDesc scan,
 
 		/* Since the tuple was previously fetched, needn't lock page here */
 		dp = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, dp);
 		lineoff = ItemPointerGetOffsetNumber(&(tuple->t_self));
 		lpp = PageGetItemId(dp, lineoff);
 		Assert(ItemIdIsNormal(lpp));
@@ -1118,7 +1110,6 @@ heapgettup_pagemode(HeapScanDesc scan,
 		heapgetpage((TableScanDesc) scan, page);
 
 		dp = BufferGetPage(scan->rs_cbuf);
-		TestForOldSnapshot(scan->rs_base.rs_snapshot, scan->rs_base.rs_rd, dp);
 		lines = scan->rs_ntuples;
 		linesleft = lines;
 		if (backward)
@@ -1614,7 +1605,6 @@ heap_fetch(Relation relation,
 	 */
 	LockBuffer(buffer, BUFFER_LOCK_SHARE);
 	page = BufferGetPage(buffer);
-	TestForOldSnapshot(snapshot, relation, page);
 
 	/*
 	 * We'd better check for out-of-range offnum in case of VACUUM since the
@@ -1900,7 +1890,6 @@ heap_get_latest_tid(TableScanDesc sscan,
 		buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(&ctid));
 		LockBuffer(buffer, BUFFER_LOCK_SHARE);
 		page = BufferGetPage(buffer);
-		TestForOldSnapshot(snapshot, relation, page);
 
 		/*
 		 * Check for bogus item number.  This is not treated as an error
diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index d3c57cd16a..f854db935a 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -1765,6 +1765,8 @@ HeapTupleSatisfiesHistoricMVCC(HeapTuple htup, Snapshot snapshot,
 bool
 HeapTupleSatisfiesVisibility(HeapTuple tup, Snapshot snapshot, Buffer buffer)
 {
+	CheckForOldSnapshot(snapshot);
+
 	switch (snapshot->snapshot_type)
 	{
 		case SNAPSHOT_MVCC:
diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
index 15ca1b304a..10f6b53706 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -35,18 +35,6 @@ typedef struct
 	/* tuple visibility test, initialized for the relation */
 	GlobalVisState *vistest;
 
-	/*
-	 * Thresholds set by TransactionIdLimitedForOldSnapshots() if they have
-	 * been computed (done on demand, and only if
-	 * OldSnapshotThresholdActive()). The first time a tuple is about to be
-	 * removed based on the limited horizon, old_snap_used is set to true, and
-	 * SetOldSnapshotThresholdTimestamp() is called. See
-	 * heap_prune_satisfies_vacuum().
-	 */
-	TimestampTz old_snap_ts;
-	TransactionId old_snap_xmin;
-	bool		old_snap_used;
-
 	TransactionId new_prune_xid;	/* new prune hint value for page */
 	TransactionId latestRemovedXid; /* latest xid to be removed by this prune */
 	int			nredirected;	/* numbers of entries in arrays below */
@@ -90,7 +78,6 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 	TransactionId prune_xid;
 	GlobalVisState *vistest;
 	TransactionId limited_xmin = InvalidTransactionId;
-	TimestampTz limited_ts = 0;
 	Size		minfree;
 
 	/*
@@ -101,15 +88,6 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 	if (RecoveryInProgress())
 		return;
 
-	/*
-	 * XXX: Magic to keep old_snapshot_threshold tests appear "working". They
-	 * currently are broken, and discussion of what to do about them is
-	 * ongoing. See
-	 * https://www.postgresql.org/message-id/20200403001235.e6jfdll3gh2ygbuc%40alap3.anarazel.de
-	 */
-	if (old_snapshot_threshold == 0)
-		SnapshotTooOldMagicForTest();
-
 	/*
 	 * First check whether there's any chance there's something to prune,
 	 * determining the appropriate horizon is a waste if there's no prune_xid
@@ -129,12 +107,6 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 	 * consumed between this point and acquiring the lock).  This allows us to
 	 * save significant overhead in the case where the page is found not to be
 	 * prunable.
-	 *
-	 * Even if old_snapshot_threshold is set, we first check whether the page
-	 * can be pruned without. Both because
-	 * TransactionIdLimitedForOldSnapshots() is not cheap, and because not
-	 * unnecessarily relying on old_snapshot_threshold avoids causing
-	 * conflicts.
 	 */
 	vistest = GlobalVisTestFor(relation);
 
@@ -143,11 +115,6 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 		if (!OldSnapshotThresholdActive())
 			return;
 
-		if (!TransactionIdLimitedForOldSnapshots(GlobalVisTestNonRemovableHorizon(vistest),
-												 relation,
-												 &limited_xmin, &limited_ts))
-			return;
-
 		if (!TransactionIdPrecedes(prune_xid, limited_xmin))
 			return;
 	}
@@ -184,7 +151,6 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 		{
 			/* OK to prune */
 			(void) heap_page_prune(relation, buffer, vistest,
-								   limited_xmin, limited_ts,
 								   true, NULL);
 		}
 
@@ -201,9 +167,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
  *
  * vistest is used to distinguish whether tuples are DEAD or RECENTLY_DEAD
  * (see heap_prune_satisfies_vacuum and
- * HeapTupleSatisfiesVacuum). old_snap_xmin / old_snap_ts need to
- * either have been set by TransactionIdLimitedForOldSnapshots, or
- * InvalidTransactionId/0 respectively.
+ * HeapTupleSatisfiesVacuum).
  *
  * If report_stats is true then we send the number of reclaimed heap-only
  * tuples to pgstats.  (This must be false during vacuum, since vacuum will
@@ -218,8 +182,6 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
 int
 heap_page_prune(Relation relation, Buffer buffer,
 				GlobalVisState *vistest,
-				TransactionId old_snap_xmin,
-				TimestampTz old_snap_ts,
 				bool report_stats,
 				OffsetNumber *off_loc)
 {
@@ -243,9 +205,6 @@ heap_page_prune(Relation relation, Buffer buffer,
 	prstate.new_prune_xid = InvalidTransactionId;
 	prstate.rel = relation;
 	prstate.vistest = vistest;
-	prstate.old_snap_xmin = old_snap_xmin;
-	prstate.old_snap_ts = old_snap_ts;
-	prstate.old_snap_used = false;
 	prstate.latestRemovedXid = InvalidTransactionId;
 	prstate.nredirected = prstate.ndead = prstate.nunused = 0;
 	memset(prstate.marked, 0, sizeof(prstate.marked));
@@ -409,14 +368,6 @@ heap_page_prune(Relation relation, Buffer buffer,
  * because of old_snapshot_threshold. We only want to increase the threshold
  * that triggers errors for old snapshots when we actually decide to remove a
  * row based on the limited horizon.
- *
- * Due to its cost we also only want to call
- * TransactionIdLimitedForOldSnapshots() if necessary, i.e. we might not have
- * done so in heap_hot_prune_opt() if pd_prune_xid was old enough. But we
- * still want to be able to remove rows that are too new to be removed
- * according to prstate->vistest, but that can be removed based on
- * old_snapshot_threshold. So we call TransactionIdLimitedForOldSnapshots() on
- * demand in here, if appropriate.
  */
 static HTSV_Result
 heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup, Buffer buffer)
@@ -430,52 +381,11 @@ heap_prune_satisfies_vacuum(PruneState *prstate, HeapTuple tup, Buffer buffer)
 		return res;
 
 	/*
-	 * If we are already relying on the limited xmin, there is no need to
-	 * delay doing so anymore.
-	 */
-	if (prstate->old_snap_used)
-	{
-		Assert(TransactionIdIsValid(prstate->old_snap_xmin));
-
-		if (TransactionIdPrecedes(dead_after, prstate->old_snap_xmin))
-			res = HEAPTUPLE_DEAD;
-		return res;
-	}
-
-	/*
-	 * First check if GlobalVisTestIsRemovableXid() is sufficient to find the
-	 * row dead. If not, and old_snapshot_threshold is enabled, try to use the
-	 * lowered horizon.
+	 * Check if GlobalVisTestIsRemovableXid() is sufficient to find the row
+	 * dead.
 	 */
 	if (GlobalVisTestIsRemovableXid(prstate->vistest, dead_after))
 		res = HEAPTUPLE_DEAD;
-	else if (OldSnapshotThresholdActive())
-	{
-		/* haven't determined limited horizon yet, requests */
-		if (!TransactionIdIsValid(prstate->old_snap_xmin))
-		{
-			TransactionId horizon =
-			GlobalVisTestNonRemovableHorizon(prstate->vistest);
-
-			TransactionIdLimitedForOldSnapshots(horizon, prstate->rel,
-												&prstate->old_snap_xmin,
-												&prstate->old_snap_ts);
-		}
-
-		if (TransactionIdIsValid(prstate->old_snap_xmin) &&
-			TransactionIdPrecedes(dead_after, prstate->old_snap_xmin))
-		{
-			/*
-			 * About to remove row based on snapshot_too_old. Need to raise
-			 * the threshold so problematic accesses would error.
-			 */
-			Assert(!prstate->old_snap_used);
-			SetOldSnapshotThresholdTimestamp(prstate->old_snap_ts,
-											 prstate->old_snap_xmin);
-			prstate->old_snap_used = true;
-			res = HEAPTUPLE_DEAD;
-		}
-	}
 
 	return res;
 }
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 88db2e2cfc..52f49f6a58 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1718,9 +1718,7 @@ retry:
 	 * lpdead_items's final value can be thought of as the number of tuples
 	 * that were deleted from indexes.
 	 */
-	tuples_deleted = heap_page_prune(rel, buf, vistest,
-									 InvalidTransactionId, 0, false,
-									 &vacrel->offnum);
+	tuples_deleted = heap_page_prune(rel, buf, vistest, false, &vacrel->offnum);
 
 	/*
 	 * Now scan the page to collect LP_DEAD items and check for tuples
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index d1177d8772..8438931b26 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -272,7 +272,6 @@ _bt_moveright(Relation rel,
 	for (;;)
 	{
 		page = BufferGetPage(buf);
-		TestForOldSnapshot(snapshot, rel, page);
 		opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 
 		if (P_RIGHTMOST(opaque))
@@ -1994,7 +1993,6 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir)
 			/* step right one page */
 			so->currPos.buf = _bt_getbuf(rel, blkno, BT_READ);
 			page = BufferGetPage(so->currPos.buf);
-			TestForOldSnapshot(scan->xs_snapshot, rel, page);
 			opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 			/* check for deleted page */
 			if (!P_IGNORE(opaque))
@@ -2097,7 +2095,6 @@ _bt_readnextpage(IndexScanDesc scan, BlockNumber blkno, ScanDirection dir)
 			 * and do it all again.
 			 */
 			page = BufferGetPage(so->currPos.buf);
-			TestForOldSnapshot(scan->xs_snapshot, rel, page);
 			opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 			if (!P_IGNORE(opaque))
 			{
@@ -2203,7 +2200,6 @@ _bt_walk_left(Relation rel, Buffer buf, Snapshot snapshot)
 		CHECK_FOR_INTERRUPTS();
 		buf = _bt_getbuf(rel, blkno, BT_READ);
 		page = BufferGetPage(buf);
-		TestForOldSnapshot(snapshot, rel, page);
 		opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 
 		/*
@@ -2230,14 +2226,12 @@ _bt_walk_left(Relation rel, Buffer buf, Snapshot snapshot)
 			blkno = opaque->btpo_next;
 			buf = _bt_relandgetbuf(rel, buf, blkno, BT_READ);
 			page = BufferGetPage(buf);
-			TestForOldSnapshot(snapshot, rel, page);
 			opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 		}
 
 		/* Return to the original page to see what's up */
 		buf = _bt_relandgetbuf(rel, buf, obknum, BT_READ);
 		page = BufferGetPage(buf);
-		TestForOldSnapshot(snapshot, rel, page);
 		opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 		if (P_ISDELETED(opaque))
 		{
@@ -2255,7 +2249,6 @@ _bt_walk_left(Relation rel, Buffer buf, Snapshot snapshot)
 				blkno = opaque->btpo_next;
 				buf = _bt_relandgetbuf(rel, buf, blkno, BT_READ);
 				page = BufferGetPage(buf);
-				TestForOldSnapshot(snapshot, rel, page);
 				opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 				if (!P_ISDELETED(opaque))
 					break;
@@ -2316,7 +2309,6 @@ _bt_get_endpoint(Relation rel, uint32 level, bool rightmost,
 		return InvalidBuffer;
 
 	page = BufferGetPage(buf);
-	TestForOldSnapshot(snapshot, rel, page);
 	opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 
 	for (;;)
@@ -2336,7 +2328,6 @@ _bt_get_endpoint(Relation rel, uint32 level, bool rightmost,
 					 RelationGetRelationName(rel));
 			buf = _bt_relandgetbuf(rel, buf, blkno, BT_READ);
 			page = BufferGetPage(buf);
-			TestForOldSnapshot(snapshot, rel, page);
 			opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 		}
 
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index e14b9fa573..4fa7da22f3 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -858,7 +858,6 @@ redirect:
 			/* else new pointer points to the same page, no work needed */
 
 			page = BufferGetPage(buffer);
-			TestForOldSnapshot(snapshot, index, page);
 
 			isnull = SpGistPageStoresNulls(page) ? true : false;
 
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 50b7a16bce..52345c0cb7 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2992,8 +2992,7 @@ index_build(Relation heapRelation,
 	/*
 	 * If we found any potentially broken HOT chains, mark the index as not
 	 * being usable until the current transaction is below the event horizon.
-	 * See src/backend/access/heap/README.HOT for discussion.  Also set this
-	 * if early pruning/vacuuming is enabled for the heap relation.  While it
+	 * See src/backend/access/heap/README.HOT for discussion.  While it
 	 * might become safe to use the index earlier based on actual cleanup
 	 * activity and other active transactions, the test for that would be much
 	 * more complex and would require some form of blocking, so keep it simple
@@ -3009,7 +3008,7 @@ index_build(Relation heapRelation,
 	 *
 	 * We also need not set indcheckxmin during a concurrent index build,
 	 * because we won't set indisvalid true until all transactions that care
-	 * about the broken HOT chains or early pruning/vacuuming are gone.
+	 * about the broken HOT chains are gone.
 	 *
 	 * Therefore, this code path can only be taken during non-concurrent
 	 * CREATE INDEX.  Thus the fact that heap_update will set the pg_index
@@ -3018,7 +3017,7 @@ index_build(Relation heapRelation,
 	 * about any concurrent readers of the tuple; no other transaction can see
 	 * it yet.
 	 */
-	if ((indexInfo->ii_BrokenHotChain || EarlyPruningEnabled(heapRelation)) &&
+	if ((indexInfo->ii_BrokenHotChain) &&
 		!isreindex &&
 		!indexInfo->ii_Concurrent)
 	{
@@ -3702,7 +3701,6 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 		HeapTuple	indexTuple;
 		Form_pg_index indexForm;
 		bool		index_bad;
-		bool		early_pruning_enabled = EarlyPruningEnabled(heapRelation);
 
 		pg_index = table_open(IndexRelationId, RowExclusiveLock);
 
@@ -3716,12 +3714,11 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 					 !indexForm->indisready ||
 					 !indexForm->indislive);
 		if (index_bad ||
-			(indexForm->indcheckxmin && !indexInfo->ii_BrokenHotChain) ||
-			early_pruning_enabled)
+			(indexForm->indcheckxmin && !indexInfo->ii_BrokenHotChain))
 		{
-			if (!indexInfo->ii_BrokenHotChain && !early_pruning_enabled)
+			if (!indexInfo->ii_BrokenHotChain)
 				indexForm->indcheckxmin = false;
-			else if (index_bad || early_pruning_enabled)
+			else if (index_bad)
 				indexForm->indcheckxmin = true;
 			indexForm->indisvalid = true;
 			indexForm->indisready = true;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7421d7cfbd..4e8088ae45 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -971,25 +971,6 @@ vacuum_set_xid_limits(Relation rel,
 	 */
 	*oldestXmin = GetOldestNonRemovableTransactionId(rel);
 
-	if (OldSnapshotThresholdActive())
-	{
-		TransactionId limit_xmin;
-		TimestampTz limit_ts;
-
-		if (TransactionIdLimitedForOldSnapshots(*oldestXmin, rel,
-												&limit_xmin, &limit_ts))
-		{
-			/*
-			 * TODO: We should only set the threshold if we are pruning on the
-			 * basis of the increased limits.  Not as crucial here as it is
-			 * for opportunistic pruning (which often happens at a much higher
-			 * frequency), but would still be a significant improvement.
-			 */
-			SetOldSnapshotThresholdTimestamp(limit_ts, limit_xmin);
-			*oldestXmin = limit_xmin;
-		}
-	}
-
 	Assert(TransactionIdIsNormal(*oldestXmin));
 
 	/*
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index 2db1914aff..83324a30d7 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -223,6 +223,12 @@ BitmapHeapNext(BitmapHeapScanState *node)
 
 			if (skip_fetch)
 			{
+				/*
+				 * if we relied on VM_ALL_VISIBLE, our snapshot had better
+				 * not have timed out
+				 */
+				CheckForOldSnapshot(node->ss.ps.state->es_snapshot);
+
 				/* can't be lossy in the skip_fetch case */
 				Assert(tbmres->ntuples >= 0);
 
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 0754e28a9a..d6e8fa610b 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -158,6 +158,7 @@ IndexOnlyNext(IndexOnlyScanState *node)
 		 * It's worth going through this complexity to avoid needing to lock
 		 * the VM buffer, which could cause significant contention.
 		 */
+		CheckForOldSnapshot(scandesc->xs_snapshot);
 		if (!VM_ALL_VISIBLE(scandesc->heapRelation,
 							ItemPointerGetBlockNumber(tid),
 							&node->ioss_VMBuffer))
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4b296a22c4..f61aa0c2fd 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4852,20 +4852,3 @@ IssuePendingWritebacks(WritebackContext *context)
 
 	context->nr_pending = 0;
 }
-
-
-/*
- * Implement slower/larger portions of TestForOldSnapshot
- *
- * Smaller/faster portions are put inline, but the entire set of logic is too
- * big for that.
- */
-void
-TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
-{
-	if (RelationAllowsEarlyPruning(relation)
-		&& (snapshot)->whenTaken < GetOldSnapshotThresholdTimestamp())
-		ereport(ERROR,
-				(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
-				 errmsg("snapshot too old")));
-}
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 3e4ec53a97..81c631a603 100644
--- a/src/backend/storage/ipc/ipci.c
+++ b/src/backend/storage/ipc/ipci.c
@@ -146,7 +146,6 @@ CreateSharedMemoryAndSemaphores(void)
 		size = add_size(size, WalRcvShmemSize());
 		size = add_size(size, PgArchShmemSize());
 		size = add_size(size, ApplyLauncherShmemSize());
-		size = add_size(size, SnapMgrShmemSize());
 		size = add_size(size, BTreeShmemSize());
 		size = add_size(size, SyncScanShmemSize());
 		size = add_size(size, AsyncShmemSize());
@@ -265,7 +264,6 @@ CreateSharedMemoryAndSemaphores(void)
 	/*
 	 * Set up other modules that need some shared memory space
 	 */
-	SnapMgrInit();
 	BTreeShmemInit();
 	SyncScanShmemInit();
 	AsyncShmemInit();
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 085bd1e407..f5f0741312 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2039,34 +2039,6 @@ GetMaxSnapshotSubxidCount(void)
 	return TOTAL_MAX_CACHED_SUBXIDS;
 }
 
-/*
- * Initialize old_snapshot_threshold specific parts of a newly build snapshot.
- */
-static void
-GetSnapshotDataInitOldSnapshot(Snapshot snapshot)
-{
-	if (!OldSnapshotThresholdActive())
-	{
-		/*
-		 * If not using "snapshot too old" feature, fill related fields with
-		 * dummy values that don't require any locking.
-		 */
-		snapshot->lsn = InvalidXLogRecPtr;
-		snapshot->whenTaken = 0;
-	}
-	else
-	{
-		/*
-		 * Capture the current time and WAL stream location in case this
-		 * snapshot becomes old enough to need to fall back on the special
-		 * "old snapshot" logic.
-		 */
-		snapshot->lsn = GetXLogInsertRecPtr();
-		snapshot->whenTaken = GetSnapshotCurrentTimestamp();
-		MaintainOldSnapshotTimeMapping(snapshot->whenTaken, snapshot->xmin);
-	}
-}
-
 /*
  * Helper function for GetSnapshotData() that checks if the bulk of the
  * visibility information in the snapshot is still valid. If so, it updates
@@ -2120,8 +2092,8 @@ GetSnapshotDataReuse(Snapshot snapshot)
 	snapshot->active_count = 0;
 	snapshot->regd_count = 0;
 	snapshot->copied = false;
-
-	GetSnapshotDataInitOldSnapshot(snapshot);
+	snapshot->whenTaken = 0;		/* set by EnableSnapshotTimeout() */
+	snapshot->timeout_state = SNAPSHOT_TIMEOUT_NONE;
 
 	return true;
 }
@@ -2506,8 +2478,8 @@ GetSnapshotData(Snapshot snapshot)
 	snapshot->active_count = 0;
 	snapshot->regd_count = 0;
 	snapshot->copied = false;
-
-	GetSnapshotDataInitOldSnapshot(snapshot);
+	snapshot->whenTaken = 0;		/* set by EnableSnapshotTimeout() */
+	snapshot->timeout_state = SNAPSHOT_TIMEOUT_NONE;
 
 	return snapshot;
 }
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8cea10c901..c40a293ccd 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3358,6 +3358,12 @@ ProcessInterrupts(void)
 			IdleSessionTimeoutPending = false;
 	}
 
+	if (SnapshotTimeoutPending)
+	{
+		HandleSnapshotTimeout();
+		SnapshotTimeoutPending = false;
+	}
+
 	if (ProcSignalBarrierPending)
 		ProcessProcSignalBarrier();
 
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 381d9e548d..7f687504af 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -34,6 +34,7 @@ volatile sig_atomic_t CheckClientConnectionPending = false;
 volatile sig_atomic_t ClientConnectionLost = false;
 volatile sig_atomic_t IdleInTransactionSessionTimeoutPending = false;
 volatile sig_atomic_t IdleSessionTimeoutPending = false;
+volatile sig_atomic_t SnapshotTimeoutPending = false;
 volatile sig_atomic_t ProcSignalBarrierPending = false;
 volatile sig_atomic_t LogMemoryContextPending = false;
 volatile uint32 InterruptHoldoffCount = 0;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 51d1bbef30..6093599b04 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -74,6 +74,7 @@ static void LockTimeoutHandler(void);
 static void IdleInTransactionSessionTimeoutHandler(void);
 static void IdleSessionTimeoutHandler(void);
 static void ClientCheckTimeoutHandler(void);
+static void SnapshotTimeoutHandler(void);
 static bool ThereIsAtLeastOneRole(void);
 static void process_startup_options(Port *port, bool am_superuser);
 static void process_settings(Oid databaseid, Oid roleid);
@@ -622,6 +623,7 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 						IdleInTransactionSessionTimeoutHandler);
 		RegisterTimeout(IDLE_SESSION_TIMEOUT, IdleSessionTimeoutHandler);
 		RegisterTimeout(CLIENT_CONNECTION_CHECK_TIMEOUT, ClientCheckTimeoutHandler);
+		RegisterTimeout(SNAPSHOT_TIMEOUT, SnapshotTimeoutHandler);
 	}
 
 	/*
@@ -1256,6 +1258,14 @@ ClientCheckTimeoutHandler(void)
 	SetLatch(MyLatch);
 }
 
+static void
+SnapshotTimeoutHandler(void)
+{
+	SnapshotTimeoutPending = true;
+	InterruptPending = true;
+	SetLatch(MyLatch);
+}
+
 /*
  * Returns true if at least one role is defined in this database cluster.
  */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 68b62d523d..f87c53462a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3341,10 +3341,10 @@ static struct config_int ConfigureNamesInt[] =
 		{"old_snapshot_threshold", PGC_POSTMASTER, RESOURCES_ASYNCHRONOUS,
 			gettext_noop("Time before a snapshot is too old to read pages changed after the snapshot was taken."),
 			gettext_noop("A value of -1 disables this feature."),
-			GUC_UNIT_MIN
+			GUC_UNIT_MS
 		},
 		&old_snapshot_threshold,
-		-1, -1, MINS_PER_HOUR * HOURS_PER_DAY * 60,
+		-1, -1, 1000 * SECS_PER_MINUTE * MINS_PER_HOUR * HOURS_PER_DAY * 7,
 		NULL, NULL, NULL
 	},
 
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 2968c7f7b7..4c0a41e79e 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -54,6 +54,7 @@
 #include "access/xlog.h"
 #include "catalog/catalog.h"
 #include "datatype/timestamp.h"
+#include "lib/ilist.h"
 #include "lib/pairingheap.h"
 #include "miscadmin.h"
 #include "storage/predicate.h"
@@ -70,15 +71,13 @@
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 #include "utils/timestamp.h"
+#include "utils/timeout.h"
 
 
 /*
  * GUC parameters
  */
-int			old_snapshot_threshold; /* number of minutes, -1 disables */
-
-volatile OldSnapshotControlData *oldSnapshotControl;
-
+int			old_snapshot_threshold; /* number of ms, -1 disables */
 
 /*
  * CurrentSnapshot points to the only snapshot taken in transaction-snapshot
@@ -145,6 +144,15 @@ static int	xmin_cmp(const pairingheap_node *a, const pairingheap_node *b,
 
 static pairingheap RegisteredSnapshots = {&xmin_cmp, NULL, NULL};
 
+/*
+ * Snapshots ordered by xmin, when using the snapshot too old feature.  We
+ * could use a pairing heap on (xid, whenTaken), but it seems more efficient
+ * to just use a dlist and adjust for rare cases when xids arrive out of
+ * order.
+ */
+static dlist_head TimedSnapshots = DLIST_STATIC_INIT(TimedSnapshots);
+static inline bool SnapshotInTimedSnapshots(Snapshot snapshot);
+
 /* first GetTransactionSnapshot call in a transaction? */
 bool		FirstSnapshotSet = false;
 
@@ -169,10 +177,12 @@ typedef struct ExportedSnapshot
 static List *exportedSnapshots = NIL;
 
 /* Prototypes for local functions */
-static TimestampTz AlignTimestampToMinuteBoundary(TimestampTz ts);
 static Snapshot CopySnapshot(Snapshot snapshot);
 static void FreeSnapshot(Snapshot snapshot);
 static void SnapshotResetXmin(void);
+static void EnableSnapshotTimeout(Snapshot snapshot, bool block);
+static bool ForgetSnapshotTimeout(Snapshot snapshot);
+static void InsertIntoTimedSnapshots(Snapshot snapshot);
 
 /*
  * Snapshot fields to be serialized.
@@ -189,54 +199,9 @@ typedef struct SerializedSnapshotData
 	bool		suboverflowed;
 	bool		takenDuringRecovery;
 	CommandId	curcid;
-	TimestampTz whenTaken;
-	XLogRecPtr	lsn;
+	TimestampTz	whenTaken;
 } SerializedSnapshotData;
 
-Size
-SnapMgrShmemSize(void)
-{
-	Size		size;
-
-	size = offsetof(OldSnapshotControlData, xid_by_minute);
-	if (old_snapshot_threshold > 0)
-		size = add_size(size, mul_size(sizeof(TransactionId),
-									   OLD_SNAPSHOT_TIME_MAP_ENTRIES));
-
-	return size;
-}
-
-/*
- * Initialize for managing old snapshot detection.
- */
-void
-SnapMgrInit(void)
-{
-	bool		found;
-
-	/*
-	 * Create or attach to the OldSnapshotControlData structure.
-	 */
-	oldSnapshotControl = (volatile OldSnapshotControlData *)
-		ShmemInitStruct("OldSnapshotControlData",
-						SnapMgrShmemSize(), &found);
-
-	if (!found)
-	{
-		SpinLockInit(&oldSnapshotControl->mutex_current);
-		oldSnapshotControl->current_timestamp = 0;
-		SpinLockInit(&oldSnapshotControl->mutex_latest_xmin);
-		oldSnapshotControl->latest_xmin = InvalidTransactionId;
-		oldSnapshotControl->next_map_update = 0;
-		SpinLockInit(&oldSnapshotControl->mutex_threshold);
-		oldSnapshotControl->threshold_timestamp = 0;
-		oldSnapshotControl->threshold_xid = InvalidTransactionId;
-		oldSnapshotControl->head_offset = 0;
-		oldSnapshotControl->head_timestamp = 0;
-		oldSnapshotControl->count_used = 0;
-	}
-}
-
 /*
  * GetTransactionSnapshot
  *		Get the appropriate snapshot for a new query in a transaction.
@@ -271,12 +236,15 @@ GetTransactionSnapshot(void)
 		InvalidateCatalogSnapshot();
 
 		Assert(pairingheap_is_empty(&RegisteredSnapshots));
+		Assert(dlist_is_empty(&TimedSnapshots));
 		Assert(FirstXactSnapshot == NULL);
 
 		if (IsInParallelMode())
 			elog(ERROR,
 				 "cannot take query snapshot during a parallel operation");
 
+		ForgetSnapshotTimeout(&CurrentSnapshotData);
+
 		/*
 		 * In transaction-snapshot mode, the first snapshot must live until
 		 * end of xact regardless of what the caller does with it, so we must
@@ -301,6 +269,7 @@ GetTransactionSnapshot(void)
 		else
 			CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
 
+		EnableSnapshotTimeout(CurrentSnapshot, false);
 		FirstSnapshotSet = true;
 		return CurrentSnapshot;
 	}
@@ -311,7 +280,9 @@ GetTransactionSnapshot(void)
 	/* Don't allow catalog snapshot to be older than xact snapshot. */
 	InvalidateCatalogSnapshot();
 
+	ForgetSnapshotTimeout(&CurrentSnapshotData);
 	CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
+	EnableSnapshotTimeout(CurrentSnapshot, false);
 
 	return CurrentSnapshot;
 }
@@ -343,6 +314,7 @@ GetLatestSnapshot(void)
 		return GetTransactionSnapshot();
 
 	SecondarySnapshot = GetSnapshotData(&SecondarySnapshotData);
+	EnableSnapshotTimeout(SecondarySnapshot, false);
 
 	return SecondarySnapshot;
 }
@@ -350,29 +322,20 @@ GetLatestSnapshot(void)
 /*
  * GetOldestSnapshot
  *
- *		Get the transaction's oldest known snapshot, as judged by the LSN.
+ *		Get the transaction's oldest known snapshot.
  *		Will return NULL if there are no active or registered snapshots.
  */
 Snapshot
 GetOldestSnapshot(void)
 {
 	Snapshot	OldestRegisteredSnapshot = NULL;
-	XLogRecPtr	RegisteredLSN = InvalidXLogRecPtr;
+
+	if (OldestActiveSnapshot != NULL)
+			return OldestActiveSnapshot->as_snap;
 
 	if (!pairingheap_is_empty(&RegisteredSnapshots))
-	{
 		OldestRegisteredSnapshot = pairingheap_container(SnapshotData, ph_node,
 														 pairingheap_first(&RegisteredSnapshots));
-		RegisteredLSN = OldestRegisteredSnapshot->lsn;
-	}
-
-	if (OldestActiveSnapshot != NULL)
-	{
-		XLogRecPtr	ActiveLSN = OldestActiveSnapshot->as_snap->lsn;
-
-		if (XLogRecPtrIsInvalid(RegisteredLSN) || RegisteredLSN > ActiveLSN)
-			return OldestActiveSnapshot->as_snap;
-	}
 
 	return OldestRegisteredSnapshot;
 }
@@ -437,6 +400,17 @@ GetNonHistoricCatalogSnapshot(Oid relid)
 		 * CatalogSnapshot pointer is already valid.
 		 */
 		pairingheap_add(&RegisteredSnapshots, &CatalogSnapshot->ph_node);
+
+#if 0
+		/*
+		 * We don't allow catalog snapshots to time out, but if snapshot
+		 * timeouts are enabled they still need to appear in TimedSnapshots,
+		 * to block xmin from advancing.
+		 *
+		 * XXX But they're always >= the transaction snapshot, right?
+		 */
+		EnableSnapshotTimeout(CatalogSnapshot, true);
+#endif
 	}
 
 	return CatalogSnapshot;
@@ -457,6 +431,7 @@ InvalidateCatalogSnapshot(void)
 {
 	if (CatalogSnapshot)
 	{
+		ForgetSnapshotTimeout(CatalogSnapshot);
 		pairingheap_remove(&RegisteredSnapshots, &CatalogSnapshot->ph_node);
 		CatalogSnapshot = NULL;
 		SnapshotResetXmin();
@@ -625,6 +600,13 @@ CopySnapshot(Snapshot snapshot)
 	newsnap->copied = true;
 	newsnap->snapXactCompletionCount = 0;
 
+	/* Inherit timeout. */
+	newsnap->whenTaken = snapshot->whenTaken;
+	newsnap->timeout_state = SNAPSHOT_TIMEOUT_NONE;
+	if (SnapshotInTimedSnapshots(snapshot))
+		InsertIntoTimedSnapshots(newsnap);
+	newsnap->timeout_state = snapshot->timeout_state;
+
 	/* setup XID array */
 	if (snapshot->xcnt > 0)
 	{
@@ -661,6 +643,8 @@ CopySnapshot(Snapshot snapshot)
 static void
 FreeSnapshot(Snapshot snapshot)
 {
+	ForgetSnapshotTimeout(snapshot);
+
 	Assert(snapshot->regd_count == 0);
 	Assert(snapshot->active_count == 0);
 	Assert(snapshot->copied);
@@ -913,19 +897,36 @@ xmin_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
  * dropped.  For efficiency, we only consider recomputing PGPROC->xmin when
  * the active snapshot stack is empty; this allows us not to need to track
  * which active snapshot is oldest.
- *
- * Note: it's tempting to use GetOldestSnapshot() here so that we can include
- * active snapshots in the calculation.  However, that compares by LSN not
- * xmin so it's not entirely clear that it's the same thing.  Also, we'd be
- * critically dependent on the assumption that the bottommost active snapshot
- * stack entry has the oldest xmin.  (Current uses of GetOldestSnapshot() are
- * not actually critical, but this would be.)
  */
 static void
 SnapshotResetXmin(void)
 {
 	Snapshot	minSnapshot;
 
+	/*
+	 * If the snapshot timeout feature is enabled, then we take the xmin of
+	 * the oldest existing snapshot that hasn't timed out yet.
+	 *
+	 * XXX How is this scheme supposed to cope with toast snapshots?  Should
+	 * register them but that looks expensive, so maybe we should tell the
+	 * oldest snapshot about its weird sibling, so that when it times out, the
+	 * toast snapshot does too?
+	 */
+	if (OldSnapshotThresholdActive())
+	{
+		if (!dlist_is_empty(&TimedSnapshots))
+		{
+			minSnapshot = dlist_container(SnapshotData, timeout_node,
+										  dlist_head_node(&TimedSnapshots));
+			Assert(SnapshotInTimedSnapshots(minSnapshot));
+			if (TransactionIdPrecedes(MyProc->xmin, minSnapshot->xmin))
+				MyProc->xmin = minSnapshot->xmin;
+		}
+		else
+			MyProc->xmin = InvalidTransactionId;
+		return;
+	}
+
 	if (ActiveSnapshot != NULL)
 		return;
 
@@ -1019,6 +1020,7 @@ AtEOXact_Snapshot(bool isCommit, bool resetXmin)
 		Assert(FirstXactSnapshot->regd_count > 0);
 		Assert(!pairingheap_is_empty(&RegisteredSnapshots));
 		pairingheap_remove(&RegisteredSnapshots, &FirstXactSnapshot->ph_node);
+		ForgetSnapshotTimeout(FirstXactSnapshot);
 	}
 	FirstXactSnapshot = NULL;
 
@@ -1050,6 +1052,8 @@ AtEOXact_Snapshot(bool isCommit, bool resetXmin)
 
 			pairingheap_remove(&RegisteredSnapshots,
 							   &esnap->snapshot->ph_node);
+
+			ForgetSnapshotTimeout(esnap->snapshot);
 		}
 
 		exportedSnapshots = NIL;
@@ -1079,6 +1083,20 @@ AtEOXact_Snapshot(bool isCommit, bool resetXmin)
 	OldestActiveSnapshot = NULL;
 	pairingheap_reset(&RegisteredSnapshots);
 
+	/*
+	 * Make sure that the xmin is not being held back by our our static
+	 * snapshots, and that they don't think they're in TimedSnapshots.  After
+	 * that, it is safe to blow away the TimedSnapshots queue (anything left
+	 * that it points to will go away with TopTransactionContext).
+	 */
+	resetXmin |= CurrentSnapshot && ForgetSnapshotTimeout(CurrentSnapshot);
+	resetXmin |= SecondarySnapshot && ForgetSnapshotTimeout(SecondarySnapshot);
+	resetXmin |= ForgetSnapshotTimeout(&CurrentSnapshotData);
+	resetXmin |= ForgetSnapshotTimeout(&SecondarySnapshotData);
+	resetXmin |= ForgetSnapshotTimeout(&SnapshotSelfData);
+	resetXmin |= ForgetSnapshotTimeout(&SnapshotAnyData);
+	dlist_init(&TimedSnapshots);
+
 	CurrentSnapshot = NULL;
 	SecondarySnapshot = NULL;
 
@@ -1611,420 +1629,6 @@ ThereAreNoPriorRegisteredSnapshots(void)
 }
 
 
-/*
- * Return a timestamp that is exactly on a minute boundary.
- *
- * If the argument is already aligned, return that value, otherwise move to
- * the next minute boundary following the given time.
- */
-static TimestampTz
-AlignTimestampToMinuteBoundary(TimestampTz ts)
-{
-	TimestampTz retval = ts + (USECS_PER_MINUTE - 1);
-
-	return retval - (retval % USECS_PER_MINUTE);
-}
-
-/*
- * Get current timestamp for snapshots
- *
- * This is basically GetCurrentTimestamp(), but with a guarantee that
- * the result never moves backward.
- */
-TimestampTz
-GetSnapshotCurrentTimestamp(void)
-{
-	TimestampTz now = GetCurrentTimestamp();
-
-	/*
-	 * Don't let time move backward; if it hasn't advanced, use the old value.
-	 */
-	SpinLockAcquire(&oldSnapshotControl->mutex_current);
-	if (now <= oldSnapshotControl->current_timestamp)
-		now = oldSnapshotControl->current_timestamp;
-	else
-		oldSnapshotControl->current_timestamp = now;
-	SpinLockRelease(&oldSnapshotControl->mutex_current);
-
-	return now;
-}
-
-/*
- * Get timestamp through which vacuum may have processed based on last stored
- * value for threshold_timestamp.
- *
- * XXX: So far, we never trust that a 64-bit value can be read atomically; if
- * that ever changes, we could get rid of the spinlock here.
- */
-TimestampTz
-GetOldSnapshotThresholdTimestamp(void)
-{
-	TimestampTz threshold_timestamp;
-
-	SpinLockAcquire(&oldSnapshotControl->mutex_threshold);
-	threshold_timestamp = oldSnapshotControl->threshold_timestamp;
-	SpinLockRelease(&oldSnapshotControl->mutex_threshold);
-
-	return threshold_timestamp;
-}
-
-void
-SetOldSnapshotThresholdTimestamp(TimestampTz ts, TransactionId xlimit)
-{
-	SpinLockAcquire(&oldSnapshotControl->mutex_threshold);
-	Assert(oldSnapshotControl->threshold_timestamp <= ts);
-	Assert(TransactionIdPrecedesOrEquals(oldSnapshotControl->threshold_xid, xlimit));
-	oldSnapshotControl->threshold_timestamp = ts;
-	oldSnapshotControl->threshold_xid = xlimit;
-	SpinLockRelease(&oldSnapshotControl->mutex_threshold);
-}
-
-/*
- * XXX: Magic to keep old_snapshot_threshold tests appear "working". They
- * currently are broken, and discussion of what to do about them is
- * ongoing. See
- * https://www.postgresql.org/message-id/20200403001235.e6jfdll3gh2ygbuc%40alap3.anarazel.de
- */
-void
-SnapshotTooOldMagicForTest(void)
-{
-	TimestampTz ts = GetSnapshotCurrentTimestamp();
-
-	Assert(old_snapshot_threshold == 0);
-
-	ts -= 5 * USECS_PER_SEC;
-
-	SpinLockAcquire(&oldSnapshotControl->mutex_threshold);
-	oldSnapshotControl->threshold_timestamp = ts;
-	SpinLockRelease(&oldSnapshotControl->mutex_threshold);
-}
-
-/*
- * If there is a valid mapping for the timestamp, set *xlimitp to
- * that. Returns whether there is such a mapping.
- */
-static bool
-GetOldSnapshotFromTimeMapping(TimestampTz ts, TransactionId *xlimitp)
-{
-	bool		in_mapping = false;
-
-	Assert(ts == AlignTimestampToMinuteBoundary(ts));
-
-	LWLockAcquire(OldSnapshotTimeMapLock, LW_SHARED);
-
-	if (oldSnapshotControl->count_used > 0
-		&& ts >= oldSnapshotControl->head_timestamp)
-	{
-		int			offset;
-
-		offset = ((ts - oldSnapshotControl->head_timestamp)
-				  / USECS_PER_MINUTE);
-		if (offset > oldSnapshotControl->count_used - 1)
-			offset = oldSnapshotControl->count_used - 1;
-		offset = (oldSnapshotControl->head_offset + offset)
-			% OLD_SNAPSHOT_TIME_MAP_ENTRIES;
-
-		*xlimitp = oldSnapshotControl->xid_by_minute[offset];
-
-		in_mapping = true;
-	}
-
-	LWLockRelease(OldSnapshotTimeMapLock);
-
-	return in_mapping;
-}
-
-/*
- * TransactionIdLimitedForOldSnapshots
- *
- * Apply old snapshot limit.  This is intended to be called for page pruning
- * and table vacuuming, to allow old_snapshot_threshold to override the normal
- * global xmin value.  Actual testing for snapshot too old will be based on
- * whether a snapshot timestamp is prior to the threshold timestamp set in
- * this function.
- *
- * If the limited horizon allows a cleanup action that otherwise would not be
- * possible, SetOldSnapshotThresholdTimestamp(*limit_ts, *limit_xid) needs to
- * be called before that cleanup action.
- */
-bool
-TransactionIdLimitedForOldSnapshots(TransactionId recentXmin,
-									Relation relation,
-									TransactionId *limit_xid,
-									TimestampTz *limit_ts)
-{
-	TimestampTz ts;
-	TransactionId xlimit = recentXmin;
-	TransactionId latest_xmin;
-	TimestampTz next_map_update_ts;
-	TransactionId threshold_timestamp;
-	TransactionId threshold_xid;
-
-	Assert(TransactionIdIsNormal(recentXmin));
-	Assert(OldSnapshotThresholdActive());
-	Assert(limit_ts != NULL && limit_xid != NULL);
-
-	/*
-	 * TestForOldSnapshot() assumes early pruning advances the page LSN, so we
-	 * can't prune early when skipping WAL.
-	 */
-	if (!RelationAllowsEarlyPruning(relation) || !RelationNeedsWAL(relation))
-		return false;
-
-	ts = GetSnapshotCurrentTimestamp();
-
-	SpinLockAcquire(&oldSnapshotControl->mutex_latest_xmin);
-	latest_xmin = oldSnapshotControl->latest_xmin;
-	next_map_update_ts = oldSnapshotControl->next_map_update;
-	SpinLockRelease(&oldSnapshotControl->mutex_latest_xmin);
-
-	/*
-	 * Zero threshold always overrides to latest xmin, if valid.  Without some
-	 * heuristic it will find its own snapshot too old on, for example, a
-	 * simple UPDATE -- which would make it useless for most testing, but
-	 * there is no principled way to ensure that it doesn't fail in this way.
-	 * Use a five-second delay to try to get useful testing behavior, but this
-	 * may need adjustment.
-	 */
-	if (old_snapshot_threshold == 0)
-	{
-		if (TransactionIdPrecedes(latest_xmin, MyProc->xmin)
-			&& TransactionIdFollows(latest_xmin, xlimit))
-			xlimit = latest_xmin;
-
-		ts -= 5 * USECS_PER_SEC;
-	}
-	else
-	{
-		ts = AlignTimestampToMinuteBoundary(ts)
-			- (old_snapshot_threshold * USECS_PER_MINUTE);
-
-		/* Check for fast exit without LW locking. */
-		SpinLockAcquire(&oldSnapshotControl->mutex_threshold);
-		threshold_timestamp = oldSnapshotControl->threshold_timestamp;
-		threshold_xid = oldSnapshotControl->threshold_xid;
-		SpinLockRelease(&oldSnapshotControl->mutex_threshold);
-
-		if (ts == threshold_timestamp)
-		{
-			/*
-			 * Current timestamp is in same bucket as the last limit that was
-			 * applied. Reuse.
-			 */
-			xlimit = threshold_xid;
-		}
-		else if (ts == next_map_update_ts)
-		{
-			/*
-			 * FIXME: This branch is super iffy - but that should probably
-			 * fixed separately.
-			 */
-			xlimit = latest_xmin;
-		}
-		else if (GetOldSnapshotFromTimeMapping(ts, &xlimit))
-		{
-		}
-
-		/*
-		 * Failsafe protection against vacuuming work of active transaction.
-		 *
-		 * This is not an assertion because we avoid the spinlock for
-		 * performance, leaving open the possibility that xlimit could advance
-		 * and be more current; but it seems prudent to apply this limit.  It
-		 * might make pruning a tiny bit less aggressive than it could be, but
-		 * protects against data loss bugs.
-		 */
-		if (TransactionIdIsNormal(latest_xmin)
-			&& TransactionIdPrecedes(latest_xmin, xlimit))
-			xlimit = latest_xmin;
-	}
-
-	if (TransactionIdIsValid(xlimit) &&
-		TransactionIdFollowsOrEquals(xlimit, recentXmin))
-	{
-		*limit_ts = ts;
-		*limit_xid = xlimit;
-
-		return true;
-	}
-
-	return false;
-}
-
-/*
- * Take care of the circular buffer that maps time to xid.
- */
-void
-MaintainOldSnapshotTimeMapping(TimestampTz whenTaken, TransactionId xmin)
-{
-	TimestampTz ts;
-	TransactionId latest_xmin;
-	TimestampTz update_ts;
-	bool		map_update_required = false;
-
-	/* Never call this function when old snapshot checking is disabled. */
-	Assert(old_snapshot_threshold >= 0);
-
-	ts = AlignTimestampToMinuteBoundary(whenTaken);
-
-	/*
-	 * Keep track of the latest xmin seen by any process. Update mapping with
-	 * a new value when we have crossed a bucket boundary.
-	 */
-	SpinLockAcquire(&oldSnapshotControl->mutex_latest_xmin);
-	latest_xmin = oldSnapshotControl->latest_xmin;
-	update_ts = oldSnapshotControl->next_map_update;
-	if (ts > update_ts)
-	{
-		oldSnapshotControl->next_map_update = ts;
-		map_update_required = true;
-	}
-	if (TransactionIdFollows(xmin, latest_xmin))
-		oldSnapshotControl->latest_xmin = xmin;
-	SpinLockRelease(&oldSnapshotControl->mutex_latest_xmin);
-
-	/* We only needed to update the most recent xmin value. */
-	if (!map_update_required)
-		return;
-
-	/* No further tracking needed for 0 (used for testing). */
-	if (old_snapshot_threshold == 0)
-		return;
-
-	/*
-	 * We don't want to do something stupid with unusual values, but we don't
-	 * want to litter the log with warnings or break otherwise normal
-	 * processing for this feature; so if something seems unreasonable, just
-	 * log at DEBUG level and return without doing anything.
-	 */
-	if (whenTaken < 0)
-	{
-		elog(DEBUG1,
-			 "MaintainOldSnapshotTimeMapping called with negative whenTaken = %ld",
-			 (long) whenTaken);
-		return;
-	}
-	if (!TransactionIdIsNormal(xmin))
-	{
-		elog(DEBUG1,
-			 "MaintainOldSnapshotTimeMapping called with xmin = %lu",
-			 (unsigned long) xmin);
-		return;
-	}
-
-	LWLockAcquire(OldSnapshotTimeMapLock, LW_EXCLUSIVE);
-
-	Assert(oldSnapshotControl->head_offset >= 0);
-	Assert(oldSnapshotControl->head_offset < OLD_SNAPSHOT_TIME_MAP_ENTRIES);
-	Assert((oldSnapshotControl->head_timestamp % USECS_PER_MINUTE) == 0);
-	Assert(oldSnapshotControl->count_used >= 0);
-	Assert(oldSnapshotControl->count_used <= OLD_SNAPSHOT_TIME_MAP_ENTRIES);
-
-	if (oldSnapshotControl->count_used == 0)
-	{
-		/* set up first entry for empty mapping */
-		oldSnapshotControl->head_offset = 0;
-		oldSnapshotControl->head_timestamp = ts;
-		oldSnapshotControl->count_used = 1;
-		oldSnapshotControl->xid_by_minute[0] = xmin;
-	}
-	else if (ts < oldSnapshotControl->head_timestamp)
-	{
-		/* old ts; log it at DEBUG */
-		LWLockRelease(OldSnapshotTimeMapLock);
-		elog(DEBUG1,
-			 "MaintainOldSnapshotTimeMapping called with old whenTaken = %ld",
-			 (long) whenTaken);
-		return;
-	}
-	else if (ts <= (oldSnapshotControl->head_timestamp +
-					((oldSnapshotControl->count_used - 1)
-					 * USECS_PER_MINUTE)))
-	{
-		/* existing mapping; advance xid if possible */
-		int			bucket = (oldSnapshotControl->head_offset
-							  + ((ts - oldSnapshotControl->head_timestamp)
-								 / USECS_PER_MINUTE))
-		% OLD_SNAPSHOT_TIME_MAP_ENTRIES;
-
-		if (TransactionIdPrecedes(oldSnapshotControl->xid_by_minute[bucket], xmin))
-			oldSnapshotControl->xid_by_minute[bucket] = xmin;
-	}
-	else
-	{
-		/* We need a new bucket, but it might not be the very next one. */
-		int			distance_to_new_tail;
-		int			distance_to_current_tail;
-		int			advance;
-
-		/*
-		 * Our goal is for the new "tail" of the mapping, that is, the entry
-		 * which is newest and thus furthest from the "head" entry, to
-		 * correspond to "ts". Since there's one entry per minute, the
-		 * distance between the current head and the new tail is just the
-		 * number of minutes of difference between ts and the current
-		 * head_timestamp.
-		 *
-		 * The distance from the current head to the current tail is one less
-		 * than the number of entries in the mapping, because the entry at the
-		 * head_offset is for 0 minutes after head_timestamp.
-		 *
-		 * The difference between these two values is the number of minutes by
-		 * which we need to advance the mapping, either adding new entries or
-		 * rotating old ones out.
-		 */
-		distance_to_new_tail =
-			(ts - oldSnapshotControl->head_timestamp) / USECS_PER_MINUTE;
-		distance_to_current_tail =
-			oldSnapshotControl->count_used - 1;
-		advance = distance_to_new_tail - distance_to_current_tail;
-		Assert(advance > 0);
-
-		if (advance >= OLD_SNAPSHOT_TIME_MAP_ENTRIES)
-		{
-			/* Advance is so far that all old data is junk; start over. */
-			oldSnapshotControl->head_offset = 0;
-			oldSnapshotControl->count_used = 1;
-			oldSnapshotControl->xid_by_minute[0] = xmin;
-			oldSnapshotControl->head_timestamp = ts;
-		}
-		else
-		{
-			/* Store the new value in one or more buckets. */
-			int			i;
-
-			for (i = 0; i < advance; i++)
-			{
-				if (oldSnapshotControl->count_used == OLD_SNAPSHOT_TIME_MAP_ENTRIES)
-				{
-					/* Map full and new value replaces old head. */
-					int			old_head = oldSnapshotControl->head_offset;
-
-					if (old_head == (OLD_SNAPSHOT_TIME_MAP_ENTRIES - 1))
-						oldSnapshotControl->head_offset = 0;
-					else
-						oldSnapshotControl->head_offset = old_head + 1;
-					oldSnapshotControl->xid_by_minute[old_head] = xmin;
-					oldSnapshotControl->head_timestamp += USECS_PER_MINUTE;
-				}
-				else
-				{
-					/* Extend map to unused entry. */
-					int			new_tail = (oldSnapshotControl->head_offset
-											+ oldSnapshotControl->count_used)
-					% OLD_SNAPSHOT_TIME_MAP_ENTRIES;
-
-					oldSnapshotControl->count_used++;
-					oldSnapshotControl->xid_by_minute[new_tail] = xmin;
-				}
-			}
-		}
-	}
-
-	LWLockRelease(OldSnapshotTimeMapLock);
-}
-
-
 /*
  * Setup a snapshot that replaces normal catalog snapshots that allows catalog
  * access to behave just like it did at a certain point in the past.
@@ -2113,8 +1717,6 @@ SerializeSnapshot(Snapshot snapshot, char *start_address)
 	serialized_snapshot.suboverflowed = snapshot->suboverflowed;
 	serialized_snapshot.takenDuringRecovery = snapshot->takenDuringRecovery;
 	serialized_snapshot.curcid = snapshot->curcid;
-	serialized_snapshot.whenTaken = snapshot->whenTaken;
-	serialized_snapshot.lsn = snapshot->lsn;
 
 	/*
 	 * Ignore the SubXID array if it has overflowed, unless the snapshot was
@@ -2187,8 +1789,6 @@ RestoreSnapshot(char *start_address)
 	snapshot->suboverflowed = serialized_snapshot.suboverflowed;
 	snapshot->takenDuringRecovery = serialized_snapshot.takenDuringRecovery;
 	snapshot->curcid = serialized_snapshot.curcid;
-	snapshot->whenTaken = serialized_snapshot.whenTaken;
-	snapshot->lsn = serialized_snapshot.lsn;
 	snapshot->snapXactCompletionCount = 0;
 
 	/* Copy XIDs, if present. */
@@ -2213,6 +1813,10 @@ RestoreSnapshot(char *start_address)
 	snapshot->active_count = 0;
 	snapshot->copied = true;
 
+	/* Inherit timeout, but make sure we enable timer in this backend. */
+	snapshot->whenTaken = serialized_snapshot.whenTaken;
+	EnableSnapshotTimeout(snapshot, false);
+
 	return snapshot;
 }
 
@@ -2349,3 +1953,136 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 
 	return false;
 }
+
+static inline bool
+SnapshotInTimedSnapshots(Snapshot snapshot)
+{
+	return
+		snapshot->timeout_state == SNAPSHOT_TIMEOUT_BLOCKING ||
+		snapshot->timeout_state == SNAPSHOT_TIMEOUT_WAITING;
+}
+
+static void
+InsertIntoTimedSnapshots(Snapshot snapshot)
+{
+	Assert(!SnapshotInTimedSnapshots(snapshot));
+
+	/*
+	 * Since it's very unlikely for snapshots to arrive with out-of-order
+	 * xmins, we'll start off by pushing at the tail...
+	 */
+	dlist_push_tail(&TimedSnapshots, &snapshot->timeout_node);
+
+	/* ... but adjust if not in order. */
+	while (dlist_has_prev(&TimedSnapshots, &snapshot->timeout_node))
+	{
+		Snapshot previous = dlist_container(SnapshotData,
+											timeout_node,
+											dlist_prev_node(&TimedSnapshots,
+															&snapshot->timeout_node));
+
+		Assert(previous != snapshot);
+		if (likely(TransactionIdPrecedesOrEquals(previous->xmin, snapshot->xmin)))
+			break;
+		dlist_delete(&snapshot->timeout_node);
+		dlist_insert_before(&previous->timeout_node, &snapshot->timeout_node);
+	}
+}
+
+/*
+ * See if the oldest timed snapshot is too old yet, and if so, raise an error.
+ */
+void HandleSnapshotTimeout(void)
+{
+	TimestampTz			now = GetCurrentTimestamp();
+	TimestampTz			too_old_cutoff;
+	bool				reset = false;
+
+	/* Time out all snapshots that are were taken before this point in time. */
+	too_old_cutoff = now - old_snapshot_threshold * 1000;
+
+	while (!dlist_is_empty(&TimedSnapshots))
+	{
+		Snapshot snapshot;
+
+		snapshot = dlist_container(SnapshotData, timeout_node,
+								   dlist_head_node(&TimedSnapshots));
+		Assert(SnapshotInTimedSnapshots(snapshot));
+
+		if (snapshot->timeout_state == SNAPSHOT_TIMEOUT_BLOCKING)
+		{
+			/* A special snapshot is at the front of the queue. */
+			break;
+		}
+		else if (snapshot->whenTaken <= too_old_cutoff)
+		{
+			/* Timed out.  Remove from the queue. */
+			dlist_delete(&snapshot->timeout_node);
+			snapshot->timeout_state = SNAPSHOT_TIMEOUT_REACHED;
+			reset = true;
+		}
+		else
+		{
+			TimestampTz expiry_time;
+
+			/* Set up the timer for the new oldest snapshot, and exit loop. */
+			expiry_time = snapshot->whenTaken + old_snapshot_threshold * 1000;
+			enable_timeout_after(SNAPSHOT_TIMEOUT, (expiry_time - now) / 1000);
+			break;
+		}
+	}
+
+	if (reset)
+		SnapshotResetXmin();
+}
+
+/*
+ * EnableSnapshotTimeout
+ *		Configure the time at which a snapshot becomes too old to use.
+ *
+ * This should be called whenever a snapshot is created.  If "block" is passed
+ * in, the snapshot is not one that is allowed to time out, but should still
+ * hold back xmin advancing (eg catalog snapshots).
+ */
+static void
+EnableSnapshotTimeout(Snapshot snapshot, bool block)
+{
+	/* Skip if not configured. */
+	if (old_snapshot_threshold < 0)
+		return;
+
+	/*
+	 * Capture the time.  This could be done inside GetSnapshotData[Reuse],
+	 * but it seems better to keep the logic for whether we bother to do this
+	 * work in one place, which runs pretty soon after snapshots are taken.
+	 */
+	Assert(snapshot->whenTaken == 0);
+	snapshot->whenTaken = GetCurrentTimestamp();
+
+	/* Shouldn't be already in the heap yet. */
+	Assert(!SnapshotInTimedSnapshots(snapshot));
+
+	/* Put it in the heap and start the timer if appropriate. */
+	InsertIntoTimedSnapshots(snapshot);
+	snapshot->timeout_state = block ? SNAPSHOT_TIMEOUT_BLOCKING :
+		SNAPSHOT_TIMEOUT_WAITING;
+	if (!block)
+		enable_timeout_after(SNAPSHOT_TIMEOUT, old_snapshot_threshold);
+}
+
+/*
+ * Called automatically by FreeSnapshot(), but can be called explicitly when
+ * abandoning snapshots for cleanup by memory contexts.
+ */
+static bool
+ForgetSnapshotTimeout(Snapshot snapshot)
+{
+	/* If it's in the timeout heap, remove it. */
+	if (SnapshotInTimedSnapshots(snapshot))
+	{
+		dlist_delete(&snapshot->timeout_node);
+		snapshot->timeout_state = SNAPSHOT_TIMEOUT_NONE;
+		return true;
+	}
+	return false;
+}
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index e63b49fc38..64754fef9e 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -184,8 +184,6 @@ struct GlobalVisState;
 extern void heap_page_prune_opt(Relation relation, Buffer buffer);
 extern int	heap_page_prune(Relation relation, Buffer buffer,
 							struct GlobalVisState *vistest,
-							TransactionId old_snap_xmin,
-							TimestampTz old_snap_ts_ts,
 							bool report_stats,
 							OffsetNumber *off_loc);
 extern void heap_page_prune_execute(Buffer buffer,
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 4dc343cbc5..7fda8d38a2 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -92,6 +92,7 @@ extern PGDLLIMPORT volatile sig_atomic_t QueryCancelPending;
 extern PGDLLIMPORT volatile sig_atomic_t ProcDiePending;
 extern PGDLLIMPORT volatile sig_atomic_t IdleInTransactionSessionTimeoutPending;
 extern PGDLLIMPORT volatile sig_atomic_t IdleSessionTimeoutPending;
+extern PGDLLIMPORT volatile sig_atomic_t SnapshotTimeoutPending;
 extern PGDLLIMPORT volatile sig_atomic_t ProcSignalBarrierPending;
 extern PGDLLIMPORT volatile sig_atomic_t LogMemoryContextPending;
 
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index aa64fb42ec..41ec1bf2c6 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -162,9 +162,6 @@ extern PGDLLIMPORT int32 *LocalRefCount;
 /*
  * BufferGetPage
  *		Returns the page associated with a buffer.
- *
- * When this is called as part of a scan, there may be a need for a nearby
- * call to TestForOldSnapshot().  See the definition of that for details.
  */
 #define BufferGetPage(buffer) ((Page)BufferGetBlock(buffer))
 
@@ -240,55 +237,8 @@ extern bool BgBufferSync(struct WritebackContext *wb_context);
 
 extern void AtProcExit_LocalBuffers(void);
 
-extern void TestForOldSnapshot_impl(Snapshot snapshot, Relation relation);
-
 /* in freelist.c */
 extern BufferAccessStrategy GetAccessStrategy(BufferAccessStrategyType btype);
 extern void FreeAccessStrategy(BufferAccessStrategy strategy);
 
-
-/* inline functions */
-
-/*
- * Although this header file is nominally backend-only, certain frontend
- * programs like pg_waldump include it.  For compilers that emit static
- * inline functions even when they're unused, that leads to unsatisfied
- * external references; hence hide these with #ifndef FRONTEND.
- */
-
-#ifndef FRONTEND
-
-/*
- * Check whether the given snapshot is too old to have safely read the given
- * page from the given table.  If so, throw a "snapshot too old" error.
- *
- * This test generally needs to be performed after every BufferGetPage() call
- * that is executed as part of a scan.  It is not needed for calls made for
- * modifying the page (for example, to position to the right place to insert a
- * new index tuple or for vacuuming).  It may also be omitted where calls to
- * lower-level functions will have already performed the test.
- *
- * Note that a NULL snapshot argument is allowed and causes a fast return
- * without error; this is to support call sites which can be called from
- * either scans or index modification areas.
- *
- * For best performance, keep the tests that are fastest and/or most likely to
- * exclude a page from old snapshot testing near the front.
- */
-static inline void
-TestForOldSnapshot(Snapshot snapshot, Relation relation, Page page)
-{
-	Assert(relation != NULL);
-
-	if (old_snapshot_threshold >= 0
-		&& (snapshot) != NULL
-		&& ((snapshot)->snapshot_type == SNAPSHOT_MVCC
-			|| (snapshot)->snapshot_type == SNAPSHOT_TOAST)
-		&& !XLogRecPtrIsInvalid((snapshot)->lsn)
-		&& PageGetLSN(page) > (snapshot)->lsn)
-		TestForOldSnapshot_impl(snapshot, relation);
-}
-
-#endif							/* FRONTEND */
-
 #endif							/* BUFMGR_H */
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index 44539fe15a..82adda6137 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -19,40 +19,10 @@
 #include "utils/snapshot.h"
 
 
-/*
- * The structure used to map times to TransactionId values for the "snapshot
- * too old" feature must have a few entries at the tail to hold old values;
- * otherwise the lookup will often fail and the expected early pruning or
- * vacuum will not usually occur.  It is best if this padding is for a number
- * of minutes greater than a thread would normally be stalled, but it's OK if
- * early vacuum opportunities are occasionally missed, so there's no need to
- * use an extreme value or get too fancy.  10 minutes seems plenty.
- */
-#define OLD_SNAPSHOT_PADDING_ENTRIES 10
-#define OLD_SNAPSHOT_TIME_MAP_ENTRIES (old_snapshot_threshold + OLD_SNAPSHOT_PADDING_ENTRIES)
-
-/*
- * Common definition of relation properties that allow early pruning/vacuuming
- * when old_snapshot_threshold >= 0.
- */
-#define RelationAllowsEarlyPruning(rel) \
-( \
-	 RelationIsPermanent(rel) && !IsCatalogRelation(rel) \
-  && !RelationIsAccessibleInLogicalDecoding(rel) \
-)
-
-#define EarlyPruningEnabled(rel) (old_snapshot_threshold >= 0 && RelationAllowsEarlyPruning(rel))
-
 /* GUC variables */
 extern PGDLLIMPORT int old_snapshot_threshold;
 
 
-extern Size SnapMgrShmemSize(void);
-extern void SnapMgrInit(void);
-extern TimestampTz GetSnapshotCurrentTimestamp(void);
-extern TimestampTz GetOldSnapshotThresholdTimestamp(void);
-extern void SnapshotTooOldMagicForTest(void);
-
 extern bool FirstSnapshotSet;
 
 extern PGDLLIMPORT TransactionId TransactionXmin;
@@ -72,7 +42,8 @@ extern PGDLLIMPORT SnapshotData CatalogSnapshotData;
  * local variable of type SnapshotData, and initialize it with this macro.
  */
 #define InitDirtySnapshot(snapshotdata)  \
-	((snapshotdata).snapshot_type = SNAPSHOT_DIRTY)
+	((snapshotdata).snapshot_type = SNAPSHOT_DIRTY,	\
+	 (snapshotdata).timeout_state = SNAPSHOT_TIMEOUT_NONE)
 
 /*
  * Similarly, some initialization is required for a NonVacuumable snapshot.
@@ -81,16 +52,15 @@ extern PGDLLIMPORT SnapshotData CatalogSnapshotData;
  */
 #define InitNonVacuumableSnapshot(snapshotdata, vistestp)  \
 	((snapshotdata).snapshot_type = SNAPSHOT_NON_VACUUMABLE, \
-	 (snapshotdata).vistest = (vistestp))
+	 (snapshotdata).vistest = (vistestp), \
+	 (snapshotdata).timeout_state = SNAPSHOT_TIMEOUT_NONE)
 
 /*
- * Similarly, some initialization is required for SnapshotToast.  We need
- * to set lsn and whenTaken correctly to support snapshot_too_old.
+ * Similarly, some initialization is required for SnapshotToast.
  */
-#define InitToastSnapshot(snapshotdata, l, w)  \
+#define InitToastSnapshot(snapshotdata)  \
 	((snapshotdata).snapshot_type = SNAPSHOT_TOAST, \
-	 (snapshotdata).lsn = (l),					\
-	 (snapshotdata).whenTaken = (w))
+	 (snapshotdata).timeout_state = SNAPSHOT_TIMEOUT_NONE)
 
 /* This macro encodes the knowledge of which snapshots are MVCC-safe */
 #define IsMVCCSnapshot(snapshot)  \
@@ -105,8 +75,10 @@ OldSnapshotThresholdActive(void)
 
 extern Snapshot GetTransactionSnapshot(void);
 extern Snapshot GetLatestSnapshot(void);
-extern void SnapshotSetCommandId(CommandId curcid);
 extern Snapshot GetOldestSnapshot(void);
+extern void SnapshotSetCommandId(CommandId curcid);
+
+extern void HandleSnapshotTimeout(void);
 
 extern Snapshot GetCatalogSnapshot(Oid relid);
 extern Snapshot GetNonHistoricCatalogSnapshot(Oid relid);
@@ -134,13 +106,7 @@ extern bool XactHasExportedSnapshots(void);
 extern void DeleteAllExportedSnapshotFiles(void);
 extern void WaitForOlderSnapshots(TransactionId limitXmin, bool progress);
 extern bool ThereAreNoPriorRegisteredSnapshots(void);
-extern bool TransactionIdLimitedForOldSnapshots(TransactionId recentXmin,
-												Relation relation,
-												TransactionId *limit_xid,
-												TimestampTz *limit_ts);
 extern void SetOldSnapshotThresholdTimestamp(TimestampTz ts, TransactionId xlimit);
-extern void MaintainOldSnapshotTimeMapping(TimestampTz whenTaken,
-										   TransactionId xmin);
 
 extern char *ExportSnapshot(Snapshot snapshot);
 
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index 6b60755c53..f2eace4085 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -16,6 +16,7 @@
 #include "access/htup.h"
 #include "access/xlogdefs.h"
 #include "datatype/timestamp.h"
+#include "lib/ilist.h"
 #include "lib/pairingheap.h"
 #include "storage/buf.h"
 
@@ -118,6 +119,34 @@ typedef enum SnapshotType
 	SNAPSHOT_NON_VACUUMABLE
 } SnapshotType;
 
+typedef enum SnapshotTimeoutState {
+	/*
+	 * No timeout, and it is safe to use this snapshot.
+	 */
+	SNAPSHOT_TIMEOUT_NONE,
+
+	/*
+	 * The snapshot is in TimedSnapshots, holding back xmin, but is not a type
+	 * of snapshot that we allow to time out.  Safe to use.  It has this name
+	 * because it also blocks everything after it from timing out.
+	 */
+	SNAPSHOT_TIMEOUT_BLOCKING,
+
+	/*
+	 * The snapshot is in TimedSnapshots, but the timeout has not yet been
+	 * reached.  Safe to use.
+	 */
+	SNAPSHOT_TIMEOUT_WAITING,
+
+	/*
+	 * The snapshot is no longer in TimedSnaphots, and must not be used for
+	 * any visibility purposes (including heap access and visibility-map based
+	 * optimizations), because it no longer prevents xmin from advancing.  Data it
+	 * needs to see may be vacuumed away at any time.
+	 */
+	SNAPSHOT_TIMEOUT_REACHED
+} SnapshotTimeoutState;
+
 typedef struct SnapshotData *Snapshot;
 
 #define InvalidSnapshot		((Snapshot) NULL)
@@ -205,8 +234,9 @@ typedef struct SnapshotData
 	uint32		regd_count;		/* refcount on RegisteredSnapshots */
 	pairingheap_node ph_node;	/* link in the RegisteredSnapshots heap */
 
-	TimestampTz whenTaken;		/* timestamp when snapshot was taken */
-	XLogRecPtr	lsn;			/* position in the WAL stream when taken */
+	TimestampTz	whenTaken;		/* timestamp when snapshot was taken */
+	dlist_node timeout_node;	/* link in TimedSnapshots */
+	SnapshotTimeoutState timeout_state;
 
 	/*
 	 * The transaction completion count at the time GetSnapshotData() built
@@ -216,4 +246,13 @@ typedef struct SnapshotData
 	uint64		snapXactCompletionCount;
 } SnapshotData;
 
+static inline void
+CheckForOldSnapshot(Snapshot snapshot)
+{
+	if (unlikely(snapshot->timeout_state == SNAPSHOT_TIMEOUT_REACHED))
+		ereport(ERROR,
+				(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
+				 errmsg("snapshot too old")));
+}
+
 #endif							/* SNAPSHOT_H */
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 93e6a691b3..ea3e5a373a 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -33,6 +33,7 @@ typedef enum TimeoutId
 	IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
 	IDLE_SESSION_TIMEOUT,
 	CLIENT_CONNECTION_CHECK_TIMEOUT,
+	SNAPSHOT_TIMEOUT,
 	/* First user-definable timeout reason */
 	USER_TIMEOUT,
 	/* Maximum number of timeout reasons */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index dffc79b2d9..f559feb6bc 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -12,7 +12,6 @@ SUBDIRS = \
 		  dummy_seclabel \
 		  libpq_pipeline \
 		  plsample \
-		  snapshot_too_old \
 		  spgist_name_ops \
 		  test_bloomfilter \
 		  test_ddl_deparse \
diff --git a/src/test/modules/snapshot_too_old/.gitignore b/src/test/modules/snapshot_too_old/.gitignore
deleted file mode 100644
index ba2160b66c..0000000000
--- a/src/test/modules/snapshot_too_old/.gitignore
+++ /dev/null
@@ -1,3 +0,0 @@
-# Generated subdirectories
-/output_iso/
-/tmp_check_iso/
diff --git a/src/test/modules/snapshot_too_old/Makefile b/src/test/modules/snapshot_too_old/Makefile
deleted file mode 100644
index dfb4537f63..0000000000
--- a/src/test/modules/snapshot_too_old/Makefile
+++ /dev/null
@@ -1,28 +0,0 @@
-# src/test/modules/snapshot_too_old/Makefile
-
-# Note: because we don't tell the Makefile there are any regression tests,
-# we have to clean those result files explicitly
-EXTRA_CLEAN = $(pg_regress_clean_files)
-
-ISOLATION = sto_using_cursor sto_using_select sto_using_hash_index
-ISOLATION_OPTS = --temp-config $(top_srcdir)/src/test/modules/snapshot_too_old/sto.conf
-
-# Disabled because these tests require "old_snapshot_threshold" >= 0, which
-# typical installcheck users do not have (e.g. buildfarm clients).
-NO_INSTALLCHECK = 1
-
-ifdef USE_PGXS
-PG_CONFIG = pg_config
-PGXS := $(shell $(PG_CONFIG) --pgxs)
-include $(PGXS)
-else
-subdir = src/test/modules/snapshot_too_old
-top_builddir = ../../../..
-include $(top_builddir)/src/Makefile.global
-include $(top_srcdir)/contrib/contrib-global.mk
-endif
-
-# But it can nonetheless be very helpful to run tests on preexisting
-# installation, allow to do so, but only if requested explicitly.
-installcheck-force:
-	$(pg_isolation_regress_installcheck) $(ISOLATION)
diff --git a/src/test/modules/snapshot_too_old/expected/sto_using_cursor.out b/src/test/modules/snapshot_too_old/expected/sto_using_cursor.out
deleted file mode 100644
index 8cc29ec82f..0000000000
--- a/src/test/modules/snapshot_too_old/expected/sto_using_cursor.out
+++ /dev/null
@@ -1,73 +0,0 @@
-Parsed test spec with 2 sessions
-
-starting permutation: s1decl s1f1 s1sleep s1f2 s2u
-step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1;
-step s1f1: FETCH FIRST FROM cursor1;
-c              
-
-1              
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s1f2: FETCH FIRST FROM cursor1;
-c              
-
-1              
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-
-starting permutation: s1decl s1f1 s1sleep s2u s1f2
-step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1;
-step s1f1: FETCH FIRST FROM cursor1;
-c              
-
-1              
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-step s1f2: FETCH FIRST FROM cursor1;
-ERROR:  snapshot too old
-
-starting permutation: s1decl s1f1 s2u s1sleep s1f2
-step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1;
-step s1f1: FETCH FIRST FROM cursor1;
-c              
-
-1              
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s1f2: FETCH FIRST FROM cursor1;
-ERROR:  snapshot too old
-
-starting permutation: s1decl s2u s1f1 s1sleep s1f2
-step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1;
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-step s1f1: FETCH FIRST FROM cursor1;
-c              
-
-1              
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s1f2: FETCH FIRST FROM cursor1;
-ERROR:  snapshot too old
-
-starting permutation: s2u s1decl s1f1 s1sleep s1f2
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1;
-step s1f1: FETCH FIRST FROM cursor1;
-c              
-
-2              
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s1f2: FETCH FIRST FROM cursor1;
-ERROR:  snapshot too old
diff --git a/src/test/modules/snapshot_too_old/expected/sto_using_hash_index.out b/src/test/modules/snapshot_too_old/expected/sto_using_hash_index.out
deleted file mode 100644
index bf94054790..0000000000
--- a/src/test/modules/snapshot_too_old/expected/sto_using_hash_index.out
+++ /dev/null
@@ -1,15 +0,0 @@
-Parsed test spec with 2 sessions
-
-starting permutation: noseq s1f1 s2sleep s2u s1f2
-step noseq: SET enable_seqscan = false;
-step s1f1: SELECT c FROM sto1 where c = 1000;
-c              
-
-1000           
-step s2sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1000;
-step s1f2: SELECT c FROM sto1 where c = 1001;
-ERROR:  snapshot too old
diff --git a/src/test/modules/snapshot_too_old/expected/sto_using_select.out b/src/test/modules/snapshot_too_old/expected/sto_using_select.out
deleted file mode 100644
index eb15bc23bf..0000000000
--- a/src/test/modules/snapshot_too_old/expected/sto_using_select.out
+++ /dev/null
@@ -1,55 +0,0 @@
-Parsed test spec with 2 sessions
-
-starting permutation: s1f1 s1sleep s1f2 s2u
-step s1f1: SELECT c FROM sto1 ORDER BY c LIMIT 1;
-c              
-
-1              
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s1f2: SELECT c FROM sto1 ORDER BY c LIMIT 1;
-c              
-
-1              
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-
-starting permutation: s1f1 s1sleep s2u s1f2
-step s1f1: SELECT c FROM sto1 ORDER BY c LIMIT 1;
-c              
-
-1              
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-step s1f2: SELECT c FROM sto1 ORDER BY c LIMIT 1;
-ERROR:  snapshot too old
-
-starting permutation: s1f1 s2u s1sleep s1f2
-step s1f1: SELECT c FROM sto1 ORDER BY c LIMIT 1;
-c              
-
-1              
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s1f2: SELECT c FROM sto1 ORDER BY c LIMIT 1;
-ERROR:  snapshot too old
-
-starting permutation: s2u s1f1 s1sleep s1f2
-step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1;
-step s1f1: SELECT c FROM sto1 ORDER BY c LIMIT 1;
-c              
-
-2              
-step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold';
-setting        pg_sleep       
-
-0                             
-step s1f2: SELECT c FROM sto1 ORDER BY c LIMIT 1;
-ERROR:  snapshot too old
diff --git a/src/test/modules/snapshot_too_old/specs/sto_using_cursor.spec b/src/test/modules/snapshot_too_old/specs/sto_using_cursor.spec
deleted file mode 100644
index eac18ca5b9..0000000000
--- a/src/test/modules/snapshot_too_old/specs/sto_using_cursor.spec
+++ /dev/null
@@ -1,37 +0,0 @@
-# This test provokes a "snapshot too old" error using a cursor.
-#
-# The sleep is needed because with a threshold of zero a statement could error
-# on changes it made.  With more normal settings no external delay is needed,
-# but we don't want these tests to run long enough to see that, since
-# granularity is in minutes.
-#
-# Since results depend on the value of old_snapshot_threshold, sneak that into
-# the line generated by the sleep, so that a surprising values isn't so hard
-# to identify.
-
-setup
-{
-    CREATE TABLE sto1 (c int NOT NULL);
-    INSERT INTO sto1 SELECT generate_series(1, 1000);
-    CREATE TABLE sto2 (c int NOT NULL);
-}
-setup
-{
-    VACUUM ANALYZE sto1;
-}
-
-teardown
-{
-    DROP TABLE sto1, sto2;
-}
-
-session "s1"
-setup			{ BEGIN ISOLATION LEVEL REPEATABLE READ; }
-step "s1decl"	{ DECLARE cursor1 CURSOR FOR SELECT c FROM sto1; }
-step "s1f1"		{ FETCH FIRST FROM cursor1; }
-step "s1sleep"	{ SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; }
-step "s1f2"		{ FETCH FIRST FROM cursor1; }
-teardown		{ COMMIT; }
-
-session "s2"
-step "s2u"		{ UPDATE sto1 SET c = 1001 WHERE c = 1; }
diff --git a/src/test/modules/snapshot_too_old/specs/sto_using_hash_index.spec b/src/test/modules/snapshot_too_old/specs/sto_using_hash_index.spec
deleted file mode 100644
index 33d91ff852..0000000000
--- a/src/test/modules/snapshot_too_old/specs/sto_using_hash_index.spec
+++ /dev/null
@@ -1,31 +0,0 @@
-# This test is like sto_using_select, except that we test access via a
-# hash index.
-
-setup
-{
-    CREATE TABLE sto1 (c int NOT NULL);
-    INSERT INTO sto1 SELECT generate_series(1, 1000);
-    CREATE INDEX idx_sto1 ON sto1 USING HASH (c);
-}
-setup
-{
-    VACUUM ANALYZE sto1;
-}
-
-teardown
-{
-    DROP TABLE sto1;
-}
-
-session "s1"
-setup			{ BEGIN ISOLATION LEVEL REPEATABLE READ; }
-step "noseq"	{ SET enable_seqscan = false; }
-step "s1f1"		{ SELECT c FROM sto1 where c = 1000; }
-step "s1f2"		{ SELECT c FROM sto1 where c = 1001; }
-teardown		{ ROLLBACK; }
-
-session "s2"
-step "s2sleep"	{ SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; }
-step "s2u"		{ UPDATE sto1 SET c = 1001 WHERE c = 1000; }
-
-permutation "noseq" "s1f1" "s2sleep" "s2u" "s1f2"
diff --git a/src/test/modules/snapshot_too_old/specs/sto_using_select.spec b/src/test/modules/snapshot_too_old/specs/sto_using_select.spec
deleted file mode 100644
index d7c34f3d89..0000000000
--- a/src/test/modules/snapshot_too_old/specs/sto_using_select.spec
+++ /dev/null
@@ -1,36 +0,0 @@
-# This test provokes a "snapshot too old" error using SELECT statements.
-#
-# The sleep is needed because with a threshold of zero a statement could error
-# on changes it made.  With more normal settings no external delay is needed,
-# but we don't want these tests to run long enough to see that, since
-# granularity is in minutes.
-#
-# Since results depend on the value of old_snapshot_threshold, sneak that into
-# the line generated by the sleep, so that a surprising values isn't so hard
-# to identify.
-
-setup
-{
-    CREATE TABLE sto1 (c int NOT NULL);
-    INSERT INTO sto1 SELECT generate_series(1, 1000);
-    CREATE TABLE sto2 (c int NOT NULL);
-}
-setup
-{
-    VACUUM ANALYZE sto1;
-}
-
-teardown
-{
-    DROP TABLE sto1, sto2;
-}
-
-session "s1"
-setup			{ BEGIN ISOLATION LEVEL REPEATABLE READ; }
-step "s1f1"		{ SELECT c FROM sto1 ORDER BY c LIMIT 1; }
-step "s1sleep"	{ SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; }
-step "s1f2"		{ SELECT c FROM sto1 ORDER BY c LIMIT 1; }
-teardown		{ COMMIT; }
-
-session "s2"
-step "s2u"		{ UPDATE sto1 SET c = 1001 WHERE c = 1; }
diff --git a/src/test/modules/snapshot_too_old/sto.conf b/src/test/modules/snapshot_too_old/sto.conf
deleted file mode 100644
index 7eeaeeb0dc..0000000000
--- a/src/test/modules/snapshot_too_old/sto.conf
+++ /dev/null
@@ -1,2 +0,0 @@
-autovacuum = off
-old_snapshot_threshold = 0
-- 
2.30.2

Reply via email to