On Fri, Sep 25, 2020 at 2:00 PM Michael Paquier <mich...@paquier.xyz> wrote: > On Thu, Sep 24, 2020 at 03:46:14PM -0400, Robert Haas wrote: > > Committed. > > Cool, thanks.
+1 > > Thomas, with respect to your part of this patch set, I wonder if we > > can make the functions that you're using to write tests safe enough > > that we could add them to contrib/old_snapshot and let users run them > > if they want. As you have them, they are hedged around with vague and > > scary warnings, but is that really justified? And if so, can it be > > fixed? It would be nicer not to end up with two loadable modules here, > > and maybe the right sorts of functions could even have some practical > > use. Yeah, you may be right. I am thinking about that. In the meantime, here is a rebase. A quick recap of these remaining patches: 0001 replaces the current "magic test mode" that didn't really test anything with a new test mode that verifies pruning and STO behaviour. 0002 fixes a separate bug that Andres reported: the STO XID map suffers from wraparound-itis. 0003 adds a simple smoke test for Robert's commit 55b7e2f4. Before that fix landed, it failed. > I have switched this item as waiting on author in the CF app then, as > we are not completely done yet. Thanks. For the record, I think there is still one more complaint from Andres that remains unaddressed even once these are in the tree: there are thought to be some more places that lack TestForOldSnapshot() calls.
From d2909b13737cc74c1b3d6896ac4375d611f32df2 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Mon, 20 Apr 2020 16:23:02 +1200 Subject: [PATCH v10 1/3] Rewrite the "snapshot_too_old" tests. Previously the snapshot too old feature used a special test value for old_snapshot_threshold. Instead, use a new approach based on explicitly advancing the "current" timestamp used in snapshot-too-old book keeping, so that we can control the timeline precisely without having to resort to sleeping or special test branches in the code. Also check that early pruning actually happens, by vacuuming and inspecting the visibility map at key points in the test schedule. Discussion: https://postgr.es/m/CA%2BTgmoY%3Daqf0zjTD%2B3dUWYkgMiNDegDLFjo%2B6ze%3DWtpik%2B3XqA%40mail.gmail.com --- src/backend/utils/time/snapmgr.c | 21 --- src/test/modules/snapshot_too_old/Makefile | 23 +-- .../expected/sto_using_cursor.out | 75 ++++----- .../expected/sto_using_hash_index.out | 26 ++- .../expected/sto_using_select.out | 157 +++++++++++++++--- .../specs/sto_using_cursor.spec | 30 ++-- .../specs/sto_using_hash_index.spec | 19 ++- .../specs/sto_using_select.spec | 50 ++++-- src/test/modules/snapshot_too_old/sto.conf | 2 +- .../snapshot_too_old/test_sto--1.0.sql | 14 ++ src/test/modules/snapshot_too_old/test_sto.c | 74 +++++++++ .../modules/snapshot_too_old/test_sto.control | 5 + 12 files changed, 366 insertions(+), 130 deletions(-) create mode 100644 src/test/modules/snapshot_too_old/test_sto--1.0.sql create mode 100644 src/test/modules/snapshot_too_old/test_sto.c create mode 100644 src/test/modules/snapshot_too_old/test_sto.control diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c index 8c41483e87..1d80770136 100644 --- a/src/backend/utils/time/snapmgr.c +++ b/src/backend/utils/time/snapmgr.c @@ -1774,23 +1774,6 @@ TransactionIdLimitedForOldSnapshots(TransactionId recentXmin, next_map_update_ts = oldSnapshotControl->next_map_update; SpinLockRelease(&oldSnapshotControl->mutex_latest_xmin); - /* - * Zero threshold always overrides to latest xmin, if valid. Without some - * heuristic it will find its own snapshot too old on, for example, a - * simple UPDATE -- which would make it useless for most testing, but - * there is no principled way to ensure that it doesn't fail in this way. - * Use a five-second delay to try to get useful testing behavior, but this - * may need adjustment. - */ - if (old_snapshot_threshold == 0) - { - if (TransactionIdPrecedes(latest_xmin, MyProc->xmin) - && TransactionIdFollows(latest_xmin, xlimit)) - xlimit = latest_xmin; - - ts -= 5 * USECS_PER_SEC; - } - else { ts = AlignTimestampToMinuteBoundary(ts) - (old_snapshot_threshold * USECS_PER_MINUTE); @@ -1883,10 +1866,6 @@ MaintainOldSnapshotTimeMapping(TimestampTz whenTaken, TransactionId xmin) if (!map_update_required) return; - /* No further tracking needed for 0 (used for testing). */ - if (old_snapshot_threshold == 0) - return; - /* * We don't want to do something stupid with unusual values, but we don't * want to litter the log with warnings or break otherwise normal diff --git a/src/test/modules/snapshot_too_old/Makefile b/src/test/modules/snapshot_too_old/Makefile index dfb4537f63..81836e9953 100644 --- a/src/test/modules/snapshot_too_old/Makefile +++ b/src/test/modules/snapshot_too_old/Makefile @@ -1,14 +1,22 @@ # src/test/modules/snapshot_too_old/Makefile -# Note: because we don't tell the Makefile there are any regression tests, -# we have to clean those result files explicitly -EXTRA_CLEAN = $(pg_regress_clean_files) +MODULE_big = test_sto +OBJS = \ + $(WIN32RES) \ + test_sto.o + +EXTENSION = test_sto +DATA = test_sto--1.0.sql +PGDESCFILE = "test_sto -- internal testing for snapshot too old errors" + +EXTRA_INSTALL = contrib/pg_visibility ISOLATION = sto_using_cursor sto_using_select sto_using_hash_index ISOLATION_OPTS = --temp-config $(top_srcdir)/src/test/modules/snapshot_too_old/sto.conf -# Disabled because these tests require "old_snapshot_threshold" >= 0, which -# typical installcheck users do not have (e.g. buildfarm clients). +# Disabled because these tests require "old_snapshot_threshold" = 10, which +# typical installcheck users do not have (e.g. buildfarm clients) and also +# because it'd be dangerous on a production system. NO_INSTALLCHECK = 1 ifdef USE_PGXS @@ -21,8 +29,3 @@ top_builddir = ../../../.. include $(top_builddir)/src/Makefile.global include $(top_srcdir)/contrib/contrib-global.mk endif - -# But it can nonetheless be very helpful to run tests on preexisting -# installation, allow to do so, but only if requested explicitly. -installcheck-force: - $(pg_isolation_regress_installcheck) $(ISOLATION) diff --git a/src/test/modules/snapshot_too_old/expected/sto_using_cursor.out b/src/test/modules/snapshot_too_old/expected/sto_using_cursor.out index 8cc29ec82f..b007e2dc17 100644 --- a/src/test/modules/snapshot_too_old/expected/sto_using_cursor.out +++ b/src/test/modules/snapshot_too_old/expected/sto_using_cursor.out @@ -1,73 +1,60 @@ -Parsed test spec with 2 sessions +Parsed test spec with 3 sessions -starting permutation: s1decl s1f1 s1sleep s1f2 s2u +starting permutation: t00 s1decl s1f1 t10 s2u s1f2 t20 s1f3 +step t00: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); +test_sto_clobber_snapshot_timestamp + + step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1; step s1f1: FETCH FIRST FROM cursor1; c 1 -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep +step t10: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); +test_sto_clobber_snapshot_timestamp -0 + +step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; step s1f2: FETCH FIRST FROM cursor1; c 1 -step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; +step t20: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:20:00Z'); +test_sto_clobber_snapshot_timestamp -starting permutation: s1decl s1f1 s1sleep s2u s1f2 -step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1; -step s1f1: FETCH FIRST FROM cursor1; -c + +step s1f3: FETCH FIRST FROM cursor1; +ERROR: snapshot too old +test_sto_reset_all_state -1 -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep + -0 -step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; -step s1f2: FETCH FIRST FROM cursor1; -ERROR: snapshot too old +starting permutation: t00 s1decl s1f1 t10 s1f2 t20 s1f3 +step t00: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); +test_sto_clobber_snapshot_timestamp -starting permutation: s1decl s1f1 s2u s1sleep s1f2 + step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1; step s1f1: FETCH FIRST FROM cursor1; c 1 -step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep +step t10: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); +test_sto_clobber_snapshot_timestamp -0 + step s1f2: FETCH FIRST FROM cursor1; -ERROR: snapshot too old - -starting permutation: s1decl s2u s1f1 s1sleep s1f2 -step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1; -step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; -step s1f1: FETCH FIRST FROM cursor1; c 1 -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep - -0 -step s1f2: FETCH FIRST FROM cursor1; -ERROR: snapshot too old +step t20: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:20:00Z'); +test_sto_clobber_snapshot_timestamp -starting permutation: s2u s1decl s1f1 s1sleep s1f2 -step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; -step s1decl: DECLARE cursor1 CURSOR FOR SELECT c FROM sto1; -step s1f1: FETCH FIRST FROM cursor1; + +step s1f3: FETCH FIRST FROM cursor1; c -2 -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep +1 +test_sto_reset_all_state -0 -step s1f2: FETCH FIRST FROM cursor1; -ERROR: snapshot too old + diff --git a/src/test/modules/snapshot_too_old/expected/sto_using_hash_index.out b/src/test/modules/snapshot_too_old/expected/sto_using_hash_index.out index bf94054790..091c212adc 100644 --- a/src/test/modules/snapshot_too_old/expected/sto_using_hash_index.out +++ b/src/test/modules/snapshot_too_old/expected/sto_using_hash_index.out @@ -1,15 +1,31 @@ -Parsed test spec with 2 sessions +Parsed test spec with 3 sessions -starting permutation: noseq s1f1 s2sleep s2u s1f2 +starting permutation: t00 noseq s1f1 t10 s2u s2v1 s1f2 t22 s2v2 s1f3 +step t00: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); +test_sto_clobber_snapshot_timestamp + + step noseq: SET enable_seqscan = false; step s1f1: SELECT c FROM sto1 where c = 1000; c 1000 -step s2sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep +step t10: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); +test_sto_clobber_snapshot_timestamp -0 + step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1000; +step s2v1: VACUUM sto1; step s1f2: SELECT c FROM sto1 where c = 1001; +c + +step t22: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:22:00Z'); +test_sto_clobber_snapshot_timestamp + + +step s2v2: VACUUM sto1; +step s1f3: SELECT c FROM sto1 where c = 1001; ERROR: snapshot too old +test_sto_reset_all_state + + diff --git a/src/test/modules/snapshot_too_old/expected/sto_using_select.out b/src/test/modules/snapshot_too_old/expected/sto_using_select.out index eb15bc23bf..ba27bc5261 100644 --- a/src/test/modules/snapshot_too_old/expected/sto_using_select.out +++ b/src/test/modules/snapshot_too_old/expected/sto_using_select.out @@ -1,55 +1,164 @@ -Parsed test spec with 2 sessions +Parsed test spec with 3 sessions -starting permutation: s1f1 s1sleep s1f2 s2u +starting permutation: t00 s2vis1 s1f1 t01 s2u s2vis2 s1f2 t10 s2vac1 s2vis3 s1f3 t12 s1f4 s2vac2 s2vis4 +step t00: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); +test_sto_clobber_snapshot_timestamp + + +step s2vis1: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +t step s1f1: SELECT c FROM sto1 ORDER BY c LIMIT 1; c 1 -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep +step t01: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:01:00Z'); +test_sto_clobber_snapshot_timestamp + + +step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; +step s2vis2: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every -0 +f step s1f2: SELECT c FROM sto1 ORDER BY c LIMIT 1; c 1 -step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; +step t10: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); +test_sto_clobber_snapshot_timestamp -starting permutation: s1f1 s1sleep s2u s1f2 -step s1f1: SELECT c FROM sto1 ORDER BY c LIMIT 1; + +step s2vac1: VACUUM sto1; +step s2vis3: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +f +step s1f3: SELECT c FROM sto1 ORDER BY c LIMIT 1; c 1 -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep +step t12: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:12:00Z'); +test_sto_clobber_snapshot_timestamp -0 -step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; -step s1f2: SELECT c FROM sto1 ORDER BY c LIMIT 1; + +step s1f4: SELECT c FROM sto1 ORDER BY c LIMIT 1; ERROR: snapshot too old +step s2vac2: VACUUM sto1; +step s2vis4: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +t +test_sto_reset_all_state + + + +starting permutation: t00 s2vis1 s1f1 t01 s2u s2vis2 s1f2 t10 s2vac1 s2vis3 s1f3 t12 s2vac2 s2vis4 s1f4 +step t00: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); +test_sto_clobber_snapshot_timestamp -starting permutation: s1f1 s2u s1sleep s1f2 + +step s2vis1: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +t step s1f1: SELECT c FROM sto1 ORDER BY c LIMIT 1; c 1 +step t01: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:01:00Z'); +test_sto_clobber_snapshot_timestamp + + step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep +step s2vis2: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every -0 +f step s1f2: SELECT c FROM sto1 ORDER BY c LIMIT 1; +c + +1 +step t10: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); +test_sto_clobber_snapshot_timestamp + + +step s2vac1: VACUUM sto1; +step s2vis3: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +f +step s1f3: SELECT c FROM sto1 ORDER BY c LIMIT 1; +c + +1 +step t12: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:12:00Z'); +test_sto_clobber_snapshot_timestamp + + +step s2vac2: VACUUM sto1; +step s2vis4: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +t +step s1f4: SELECT c FROM sto1 ORDER BY c LIMIT 1; ERROR: snapshot too old +test_sto_reset_all_state -starting permutation: s2u s1f1 s1sleep s1f2 -step s2u: UPDATE sto1 SET c = 1001 WHERE c = 1; + + +starting permutation: t00 s2vis1 s1f1 t01 s2vis2 s1f2 t10 s2vac1 s2vis3 s1f3 t12 s2vac2 s2vis4 s1f4 +step t00: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); +test_sto_clobber_snapshot_timestamp + + +step s2vis1: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +t step s1f1: SELECT c FROM sto1 ORDER BY c LIMIT 1; c -2 -step s1sleep: SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; -setting pg_sleep +1 +step t01: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:01:00Z'); +test_sto_clobber_snapshot_timestamp -0 + +step s2vis2: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +t step s1f2: SELECT c FROM sto1 ORDER BY c LIMIT 1; -ERROR: snapshot too old +c + +1 +step t10: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); +test_sto_clobber_snapshot_timestamp + + +step s2vac1: VACUUM sto1; +step s2vis3: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +t +step s1f3: SELECT c FROM sto1 ORDER BY c LIMIT 1; +c + +1 +step t12: SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:12:00Z'); +test_sto_clobber_snapshot_timestamp + + +step s2vac2: VACUUM sto1; +step s2vis4: SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); +every + +t +step s1f4: SELECT c FROM sto1 ORDER BY c LIMIT 1; +c + +1 +test_sto_reset_all_state + + diff --git a/src/test/modules/snapshot_too_old/specs/sto_using_cursor.spec b/src/test/modules/snapshot_too_old/specs/sto_using_cursor.spec index eac18ca5b9..3be084b8fe 100644 --- a/src/test/modules/snapshot_too_old/specs/sto_using_cursor.spec +++ b/src/test/modules/snapshot_too_old/specs/sto_using_cursor.spec @@ -1,19 +1,14 @@ # This test provokes a "snapshot too old" error using a cursor. # -# The sleep is needed because with a threshold of zero a statement could error -# on changes it made. With more normal settings no external delay is needed, -# but we don't want these tests to run long enough to see that, since -# granularity is in minutes. -# -# Since results depend on the value of old_snapshot_threshold, sneak that into -# the line generated by the sleep, so that a surprising values isn't so hard -# to identify. +# Expects old_snapshot_threshold = 10. Not suitable for installcheck since it +# messes with internal snapmgr.c state. setup { + CREATE EXTENSION IF NOT EXISTS test_sto; + SELECT test_sto_reset_all_state(); CREATE TABLE sto1 (c int NOT NULL); INSERT INTO sto1 SELECT generate_series(1, 1000); - CREATE TABLE sto2 (c int NOT NULL); } setup { @@ -22,16 +17,29 @@ setup teardown { - DROP TABLE sto1, sto2; + DROP TABLE sto1; + SELECT test_sto_reset_all_state(); } session "s1" setup { BEGIN ISOLATION LEVEL REPEATABLE READ; } step "s1decl" { DECLARE cursor1 CURSOR FOR SELECT c FROM sto1; } step "s1f1" { FETCH FIRST FROM cursor1; } -step "s1sleep" { SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; } step "s1f2" { FETCH FIRST FROM cursor1; } +step "s1f3" { FETCH FIRST FROM cursor1; } teardown { COMMIT; } session "s2" step "s2u" { UPDATE sto1 SET c = 1001 WHERE c = 1; } + +session "time" +step "t00" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); } +step "t10" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); } +step "t20" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:20:00Z'); } + +# if there's an update, we get a snapshot too old error at time 00:20 (not before, +# because we need page pruning to see the xmin level change from 10 minutes earlier) +permutation "t00" "s1decl" "s1f1" "t10" "s2u" "s1f2" "t20" "s1f3" + +# if there's no update, no snapshot too old error at time 00:20 +permutation "t00" "s1decl" "s1f1" "t10" "s1f2" "t20" "s1f3" diff --git a/src/test/modules/snapshot_too_old/specs/sto_using_hash_index.spec b/src/test/modules/snapshot_too_old/specs/sto_using_hash_index.spec index 33d91ff852..f90bca3b7a 100644 --- a/src/test/modules/snapshot_too_old/specs/sto_using_hash_index.spec +++ b/src/test/modules/snapshot_too_old/specs/sto_using_hash_index.spec @@ -1,8 +1,12 @@ # This test is like sto_using_select, except that we test access via a -# hash index. +# hash index. Explicit vacuuming is required in this version because +# there is are no incidental calls to heap_page_prune_opt() that can +# call SetOldSnapshotThresholdTimestamp(). setup { + CREATE EXTENSION IF NOT EXISTS test_sto; + SELECT test_sto_reset_all_state(); CREATE TABLE sto1 (c int NOT NULL); INSERT INTO sto1 SELECT generate_series(1, 1000); CREATE INDEX idx_sto1 ON sto1 USING HASH (c); @@ -15,6 +19,7 @@ setup teardown { DROP TABLE sto1; + SELECT test_sto_reset_all_state(); } session "s1" @@ -22,10 +27,18 @@ setup { BEGIN ISOLATION LEVEL REPEATABLE READ; } step "noseq" { SET enable_seqscan = false; } step "s1f1" { SELECT c FROM sto1 where c = 1000; } step "s1f2" { SELECT c FROM sto1 where c = 1001; } +step "s1f3" { SELECT c FROM sto1 where c = 1001; } teardown { ROLLBACK; } session "s2" -step "s2sleep" { SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; } step "s2u" { UPDATE sto1 SET c = 1001 WHERE c = 1000; } +step "s2v1" { VACUUM sto1; } +step "s2v2" { VACUUM sto1; } -permutation "noseq" "s1f1" "s2sleep" "s2u" "s1f2" +session "time" +step "t00" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); } +step "t10" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); } +step "t22" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:22:00Z'); } + +# snapshot too old at t22 +permutation "t00" "noseq" "s1f1" "t10" "s2u" "s2v1" "s1f2" "t22" "s2v2" "s1f3" diff --git a/src/test/modules/snapshot_too_old/specs/sto_using_select.spec b/src/test/modules/snapshot_too_old/specs/sto_using_select.spec index d7c34f3d89..c7917d1b0a 100644 --- a/src/test/modules/snapshot_too_old/specs/sto_using_select.spec +++ b/src/test/modules/snapshot_too_old/specs/sto_using_select.spec @@ -1,19 +1,15 @@ # This test provokes a "snapshot too old" error using SELECT statements. # -# The sleep is needed because with a threshold of zero a statement could error -# on changes it made. With more normal settings no external delay is needed, -# but we don't want these tests to run long enough to see that, since -# granularity is in minutes. -# -# Since results depend on the value of old_snapshot_threshold, sneak that into -# the line generated by the sleep, so that a surprising values isn't so hard -# to identify. +# Expects old_snapshot_threshold = 10. Not suitable for installcheck since it +# messes with internal snapmgr.c state. setup { + CREATE EXTENSION IF NOT EXISTS test_sto; + CREATE EXTENSION IF NOT EXISTS pg_visibility; + SELECT test_sto_reset_all_state(); CREATE TABLE sto1 (c int NOT NULL); INSERT INTO sto1 SELECT generate_series(1, 1000); - CREATE TABLE sto2 (c int NOT NULL); } setup { @@ -22,15 +18,47 @@ setup teardown { - DROP TABLE sto1, sto2; + DROP TABLE sto1; + SELECT test_sto_reset_all_state(); } session "s1" setup { BEGIN ISOLATION LEVEL REPEATABLE READ; } step "s1f1" { SELECT c FROM sto1 ORDER BY c LIMIT 1; } -step "s1sleep" { SELECT setting, pg_sleep(6) FROM pg_settings WHERE name = 'old_snapshot_threshold'; } step "s1f2" { SELECT c FROM sto1 ORDER BY c LIMIT 1; } +step "s1f3" { SELECT c FROM sto1 ORDER BY c LIMIT 1; } +step "s1f4" { SELECT c FROM sto1 ORDER BY c LIMIT 1; } teardown { COMMIT; } session "s2" +step "s2vis1" { SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); } step "s2u" { UPDATE sto1 SET c = 1001 WHERE c = 1; } +step "s2vis2" { SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); } +step "s2vac1" { VACUUM sto1; } +step "s2vis3" { SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); } +step "s2vac2" { VACUUM sto1; } +step "s2vis4" { SELECT EVERY(all_visible) FROM pg_visibility_map('sto1'::regclass); } + +session "time" +step "t00" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:00:00Z'); } +step "t01" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:01:00Z'); } +step "t10" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:10:00Z'); } +step "t12" { SELECT test_sto_clobber_snapshot_timestamp('3000-01-01 00:12:00Z'); } + +# If there's an update, we get a snapshot too old error at time 00:12, and +# VACUUM is allowed to remove the tuple our snapshot could see, which we know +# because we see that the relation becomes all visible. The earlier VACUUMs +# were unable to remove the tuple we could see, which is is obvious because we +# can see the row with value 1, and from the relation not being all visible +# after the VACUUM. +permutation "t00" "s2vis1" "s1f1" "t01" "s2u" "s2vis2" "s1f2" "t10" "s2vac1" "s2vis3" "s1f3" "t12" "s1f4" "s2vac2" "s2vis4" + +# Almost the same schedule, but this time we'll put s2vac2 and s2vis4 before +# s1f4 just to demonstrate that the early pruning is allowed before the error +# aborts s1's transaction. +permutation "t00" "s2vis1" "s1f1" "t01" "s2u" "s2vis2" "s1f2" "t10" "s2vac1" "s2vis3" "s1f3" "t12" "s2vac2" "s2vis4" "s1f4" + +# If we run the same schedule as above but without the update, we get no +# snapshot too old error (even though our snapshot is older than the +# threshold), and the relation remains all visible. +permutation "t00" "s2vis1" "s1f1" "t01" "s2vis2" "s1f2" "t10" "s2vac1" "s2vis3" "s1f3" "t12" "s2vac2" "s2vis4" "s1f4" diff --git a/src/test/modules/snapshot_too_old/sto.conf b/src/test/modules/snapshot_too_old/sto.conf index 7eeaeeb0dc..5ed46b3560 100644 --- a/src/test/modules/snapshot_too_old/sto.conf +++ b/src/test/modules/snapshot_too_old/sto.conf @@ -1,2 +1,2 @@ autovacuum = off -old_snapshot_threshold = 0 +old_snapshot_threshold = 10 diff --git a/src/test/modules/snapshot_too_old/test_sto--1.0.sql b/src/test/modules/snapshot_too_old/test_sto--1.0.sql new file mode 100644 index 0000000000..c10afcf23a --- /dev/null +++ b/src/test/modules/snapshot_too_old/test_sto--1.0.sql @@ -0,0 +1,14 @@ +/* src/test/modules/snapshot_too_old/test_sto--1.0.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "CREATE EXTENSION test_sto" to load this file. \quit + +CREATE FUNCTION test_sto_clobber_snapshot_timestamp(now timestamptz) +RETURNS VOID +AS 'MODULE_PATHNAME', 'test_sto_clobber_snapshot_timestamp' +LANGUAGE C STRICT; + +CREATE FUNCTION test_sto_reset_all_state() +RETURNS VOID +AS 'MODULE_PATHNAME', 'test_sto_reset_all_state' +LANGUAGE C STRICT; diff --git a/src/test/modules/snapshot_too_old/test_sto.c b/src/test/modules/snapshot_too_old/test_sto.c new file mode 100644 index 0000000000..f6c9a1a000 --- /dev/null +++ b/src/test/modules/snapshot_too_old/test_sto.c @@ -0,0 +1,74 @@ +/*------------------------------------------------------------------------- + * + * test_sto.c + * Functions to support isolation tests for snapshot too old. + * + * These functions are not intended for use in a production database and + * could cause corruption. + * + * Copyright (c) 2020, PostgreSQL Global Development Group + * + * src/test/modules/snapshot_too_old/test_sto.c + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "funcapi.h" +#include "storage/lwlock.h" +#include "utils/old_snapshot.h" +#include "utils/snapmgr.h" +#include "utils/timestamp.h" + +PG_MODULE_MAGIC; +PG_FUNCTION_INFO_V1(test_sto_reset_all_state); +PG_FUNCTION_INFO_V1(test_sto_clobber_snapshot_timestamp); + +/* + * Revert to initial state. This is not safe except in carefully + * controlled tests. + */ +Datum +test_sto_reset_all_state(PG_FUNCTION_ARGS) +{ + + LWLockAcquire(OldSnapshotTimeMapLock, LW_EXCLUSIVE); + oldSnapshotControl->count_used = 0; + oldSnapshotControl->current_timestamp = 0; + oldSnapshotControl->head_offset = 0; + oldSnapshotControl->head_timestamp = 0; + LWLockRelease(OldSnapshotTimeMapLock); + + SpinLockAcquire(&oldSnapshotControl->mutex_latest_xmin); + oldSnapshotControl->latest_xmin = InvalidTransactionId; + oldSnapshotControl->next_map_update = 0; + SpinLockRelease(&oldSnapshotControl->mutex_latest_xmin); + + SpinLockAcquire(&oldSnapshotControl->mutex_current); + oldSnapshotControl->current_timestamp = 0; + SpinLockRelease(&oldSnapshotControl->mutex_current); + + SpinLockAcquire(&oldSnapshotControl->mutex_threshold); + oldSnapshotControl->threshold_timestamp = 0; + oldSnapshotControl->threshold_xid = InvalidTransactionId; + SpinLockRelease(&oldSnapshotControl->mutex_threshold); + + PG_RETURN_NULL(); +} + +/* + * Update the minimum time used in snapshot-too-old code. If set ahead of the + * current wall clock time (for example, the year 3000), this allows testing + * with arbitrary times. This is not safe except in carefully controlled + * tests. + */ +Datum +test_sto_clobber_snapshot_timestamp(PG_FUNCTION_ARGS) +{ + TimestampTz new_current_timestamp = PG_GETARG_TIMESTAMPTZ(0); + + SpinLockAcquire(&oldSnapshotControl->mutex_current); + oldSnapshotControl->current_timestamp = new_current_timestamp; + SpinLockRelease(&oldSnapshotControl->mutex_current); + + PG_RETURN_NULL(); +} diff --git a/src/test/modules/snapshot_too_old/test_sto.control b/src/test/modules/snapshot_too_old/test_sto.control new file mode 100644 index 0000000000..e497e5450e --- /dev/null +++ b/src/test/modules/snapshot_too_old/test_sto.control @@ -0,0 +1,5 @@ +# test_sto test module +comment = 'functions for internal testing of snapshot too old errors' +default_version = '1.0' +module_pathname = '$libdir/test_sto' +relocatable = true -- 2.20.1
From 22ddbd6427965680fa49730609e71bd60cbc2d2b Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Mon, 20 Apr 2020 17:05:42 +1200 Subject: [PATCH v10 2/3] Truncate snapshot-too-old time map when CLOG is truncated. It's not safe to leave xids in the map that have wrapped around, although it's probably very hard to actually reach that state. Reported-by: Andres Freund <and...@anarazel.de> Discussion: https://postgr.es/m/CA%2BTgmoY%3Daqf0zjTD%2B3dUWYkgMiNDegDLFjo%2B6ze%3DWtpik%2B3XqA%40mail.gmail.com --- src/backend/commands/vacuum.c | 3 + src/backend/utils/time/snapmgr.c | 21 ++++ src/include/utils/snapmgr.h | 1 + src/test/modules/snapshot_too_old/Makefile | 4 +- .../snapshot_too_old/t/001_truncate.pl | 100 ++++++++++++++++++ 5 files changed, 128 insertions(+), 1 deletion(-) create mode 100644 src/test/modules/snapshot_too_old/t/001_truncate.pl diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index ddeec870d8..a79d1ef983 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -1679,6 +1679,9 @@ vac_truncate_clog(TransactionId frozenXID, */ AdvanceOldestCommitTsXid(frozenXID); + /* Make sure snapshot_too_old drops old XIDs. */ + TruncateOldSnapshotTimeMapping(frozenXID); + /* * Truncate CLOG, multixact and CommitTs to the oldest computed value. */ diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c index 1d80770136..0d50522da2 100644 --- a/src/backend/utils/time/snapmgr.c +++ b/src/backend/utils/time/snapmgr.c @@ -2000,6 +2000,27 @@ MaintainOldSnapshotTimeMapping(TimestampTz whenTaken, TransactionId xmin) } +/* + * Remove old xids from the timing map, so the CLOG can be truncated. + */ +void +TruncateOldSnapshotTimeMapping(TransactionId frozenXID) +{ + LWLockAcquire(OldSnapshotTimeMapLock, LW_EXCLUSIVE); + while (oldSnapshotControl->count_used > 0 && + TransactionIdPrecedes(oldSnapshotControl->xid_by_minute[oldSnapshotControl->head_offset], + frozenXID)) + { + oldSnapshotControl->head_timestamp += USECS_PER_MINUTE; + oldSnapshotControl->head_offset = + (oldSnapshotControl->head_offset + 1) % + OLD_SNAPSHOT_TIME_MAP_ENTRIES; + oldSnapshotControl->count_used--; + } + LWLockRelease(OldSnapshotTimeMapLock); +} + + /* * Setup a snapshot that replaces normal catalog snapshots that allows catalog * access to behave just like it did at a certain point in the past. diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h index b6b403e293..4560f1f03b 100644 --- a/src/include/utils/snapmgr.h +++ b/src/include/utils/snapmgr.h @@ -141,6 +141,7 @@ extern bool TransactionIdLimitedForOldSnapshots(TransactionId recentXmin, extern void SetOldSnapshotThresholdTimestamp(TimestampTz ts, TransactionId xlimit); extern void MaintainOldSnapshotTimeMapping(TimestampTz whenTaken, TransactionId xmin); +extern void TruncateOldSnapshotTimeMapping(TransactionId frozenXID); extern char *ExportSnapshot(Snapshot snapshot); diff --git a/src/test/modules/snapshot_too_old/Makefile b/src/test/modules/snapshot_too_old/Makefile index 81836e9953..f919944984 100644 --- a/src/test/modules/snapshot_too_old/Makefile +++ b/src/test/modules/snapshot_too_old/Makefile @@ -9,7 +9,7 @@ EXTENSION = test_sto DATA = test_sto--1.0.sql PGDESCFILE = "test_sto -- internal testing for snapshot too old errors" -EXTRA_INSTALL = contrib/pg_visibility +EXTRA_INSTALL = contrib/pg_visibility contrib/old_snapshot ISOLATION = sto_using_cursor sto_using_select sto_using_hash_index ISOLATION_OPTS = --temp-config $(top_srcdir)/src/test/modules/snapshot_too_old/sto.conf @@ -19,6 +19,8 @@ ISOLATION_OPTS = --temp-config $(top_srcdir)/src/test/modules/snapshot_too_old/s # because it'd be dangerous on a production system. NO_INSTALLCHECK = 1 +TAP_TESTS = 1 + ifdef USE_PGXS PG_CONFIG = pg_config PGXS := $(shell $(PG_CONFIG) --pgxs) diff --git a/src/test/modules/snapshot_too_old/t/001_truncate.pl b/src/test/modules/snapshot_too_old/t/001_truncate.pl new file mode 100644 index 0000000000..afcca232f2 --- /dev/null +++ b/src/test/modules/snapshot_too_old/t/001_truncate.pl @@ -0,0 +1,100 @@ +# Test truncation of the old snapshot time mapping, to check +# that we can't get into trouble when xids wrap around. +use strict; +use warnings; +use PostgresNode; +use TestLib; +use Test::More tests => 6; + +my $node = get_new_node('master'); +$node->init; +$node->append_conf("postgresql.conf", "timezone = UTC"); +$node->append_conf("postgresql.conf", "old_snapshot_threshold=10"); +$node->append_conf("postgresql.conf", "max_prepared_transactions=10"); +$node->append_conf("postgresql.conf", "autovacuum=off"); +$node->start; +$node->psql('postgres', 'update pg_database set datallowconn = true'); +$node->psql('postgres', 'create extension old_snapshot'); +$node->psql('postgres', 'create extension test_sto'); + +note "check time map is truncated when CLOG is"; + +sub set_time +{ + my $time = shift; + $node->psql('postgres', "select test_sto_clobber_snapshot_timestamp('$time')"); +} + +sub advance_xid +{ + my $time = shift; + $node->psql('postgres', "select pg_current_xact_id()"); +} + +sub summarize_mapping +{ + my $out; + $node->psql('postgres', + "select count(*), + to_char(min(end_timestamp), 'HH24:MI:SS'), + to_char(max(end_timestamp), 'HH24:MI:SS') + from pg_old_snapshot_time_mapping()", + stdout => \$out); + return $out; +} + +sub vacuum_freeze_all +{ + $node->psql('postgres', 'vacuum freeze'); + $node->psql('template0', 'vacuum freeze'); + $node->psql('template1', 'vacuum freeze'); +} + +# build up a time map with 4 entries +set_time('3000-01-01 00:00:00Z'); +advance_xid(); +set_time('3000-01-01 00:01:00Z'); +advance_xid(); +set_time('3000-01-01 00:02:00Z'); +advance_xid(); +set_time('3000-01-01 00:03:00Z'); +advance_xid(); +is(summarize_mapping(), "4|00:00:00|00:03:00"); + +# now cause frozen XID to advance +vacuum_freeze_all(); + +# we expect all XIDs to have been truncated +is(summarize_mapping(), "0||"); + +# put two more in the map +set_time('3000-01-01 00:04:00Z'); +advance_xid(); +set_time('3000-01-01 00:05:00Z'); +advance_xid(); +is(summarize_mapping(), "2|00:04:00|00:05:00"); + +# prepare a transaction, to stop xmin from getting further ahead +$node->psql('postgres', "begin; select pg_current_xact_id(); prepare transaction 'tx1'"); + +# add 16 more minutes; we should now have 18 +set_time('3000-01-01 00:21:00Z'); +advance_xid(); +is(summarize_mapping(), "18|00:04:00|00:21:00"); + +# now cause frozen XID to advance +vacuum_freeze_all(); + +# this should leave just 16, because 2 were truncated +is(summarize_mapping(), "16|00:06:00|00:21:00"); + +# commit tx1, and then freeze again to get rid of all of them +$node->psql('postgres', "commit prepared 'tx1'"); + +# now cause frozen XID to advance +vacuum_freeze_all(); + +# we should now be back to empty +is(summarize_mapping(), "0||"); + +$node->stop; -- 2.20.1
From 1287bee671dd572853ad31f5251497e2c5dc3b57 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Tue, 21 Apr 2020 20:48:20 +1200 Subject: [PATCH v10 3/3] Add TAP test for snapshot-too-old time map maintenance. Add some simple sanity tests that failed before commit 55b7e2f4 "Fix two bugs in MaintainOldSnapshotTimeMapping." Discussion: https://postgr.es/m/CA%2BTgmoY%3Daqf0zjTD%2B3dUWYkgMiNDegDLFjo%2B6ze%3DWtpik%2B3XqA%40mail.gmail.com --- .../t/002_xid_map_maintenance.pl | 63 +++++++++++++++++++ 1 file changed, 63 insertions(+) create mode 100644 src/test/modules/snapshot_too_old/t/002_xid_map_maintenance.pl diff --git a/src/test/modules/snapshot_too_old/t/002_xid_map_maintenance.pl b/src/test/modules/snapshot_too_old/t/002_xid_map_maintenance.pl new file mode 100644 index 0000000000..267bec93b8 --- /dev/null +++ b/src/test/modules/snapshot_too_old/t/002_xid_map_maintenance.pl @@ -0,0 +1,63 @@ +# Test xid various time/xid map maintenance edge cases +# that were historically buggy. + +use strict; +use warnings; +use PostgresNode; +use TestLib; +use Test::More tests => 8; + +my $node = get_new_node('main'); +$node->init; +$node->append_conf("postgresql.conf", "timezone = UTC"); +$node->append_conf("postgresql.conf", "old_snapshot_threshold=10"); +$node->append_conf("postgresql.conf", "autovacuum = off"); +$node->start; +$node->psql('postgres', 'create extension test_sto'); +$node->psql('postgres', 'create extension old_snapshot'); + +sub set_time +{ + my $time = shift; + $node->psql('postgres', "select test_sto_clobber_snapshot_timestamp('$time')"); +} + +sub summarize_mapping +{ + my $out; + $node->psql('postgres', + "select count(*), + to_char(min(end_timestamp), 'HH24:MI:SS'), + to_char(max(end_timestamp), 'HH24:MI:SS') + from pg_old_snapshot_time_mapping()", + stdout => \$out); + return $out; +} + +# fill the map up to maximum capacity +set_time('3000-01-01 00:00:00Z'); +set_time('3000-01-01 00:19:00Z'); +is(summarize_mapping(), "20|00:00:00|00:19:00"); + +# make a jump larger than capacity; the mapping is blown away, +# and our new minute is now the only one +set_time('3000-01-01 02:00:00Z'); +is(summarize_mapping(), "1|02:00:00|02:00:00"); + +# test adding minutes while the map is not full +set_time('3000-01-01 02:01:00Z'); +is(summarize_mapping(), "2|02:00:00|02:01:00"); +set_time('3000-01-01 02:05:00Z'); +is(summarize_mapping(), "6|02:00:00|02:05:00"); +set_time('3000-01-01 02:19:00Z'); +is(summarize_mapping(), "20|02:00:00|02:19:00"); + +# test adding minutes while the map is full +set_time('3000-01-01 02:20:00Z'); +is(summarize_mapping(), "20|02:01:00|02:20:00"); +set_time('3000-01-01 02:22:00Z'); +is(summarize_mapping(), "20|02:03:00|02:22:00"); +set_time('3000-01-01 02:22:01Z'); # one second past +is(summarize_mapping(), "20|02:04:00|02:23:00"); + +$node->stop; -- 2.20.1