Hi all, A bug related to data visibility on standbys with a race condition mixing 2PC transactions and synchronous_standby_names with the checkpointer has been fixed a couple of days ago: https://www.postgresql.org/message-id/163fcbec-900b-4b07-beaa-d2ead8634...@postgrespro.ru
One issue that we had while discussing this thread is that there was no easy way to have a regression test because the problem requires a wait in the checkpointer at a very early startup phase where the shared memory status data related to s_s_names has *not* been initialized yet. I have been working on the problem and found out one nice way to address this limitation, introducing in the module injection_points a new function that flushes to a file the set of injection points currently attached to a cluster, reloading the data from the file to shmem early at startup when initializing the shmem state data through shared_preload_libraries. With that in place, it is then possible to make the checkpointer wait when it starts at a very early stage, giving a way to reproduce the original failure reported on the other thread: - A wait injection point is attached. - A flush is used to write the points' data to disk. - Node restarts, loading back their state. - The wait triggers in the checkpointer. So, please find attached a patch set for all that: - 0001 is a patch I have stolen from a different thread (see [1]), introducing InjectionPointList() that retrieves a list of the injection points attached. - 0002 extends injection_points with a new flush function, that can be used in TAP tests to persist some points across restarts. One sticky point is that I did not want to add any of this information in the core backend injection point APIs, nor to any of the backend structures because that's not necessary, and what's here is enough for some TAP tests. - 0003 adds a new regression test providing some coverage for 2e57790836c6. Reverting 2e57790836c6 causes the test to fail. This shows how to use this new facility. This is v19 work, so I am adding that to the next commit fest. Thanks, [1]: https://www.postgresql.org/message-id/z_xyka21kyleh...@paquier.xyz -- Michael
From daf866a6584e24bd69f4fa126ae1799f1ba0c912 Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@paquier.xyz> Date: Tue, 22 Apr 2025 12:59:30 +0900 Subject: [PATCH 1/3] Add InjectionPointList() to retrieve list of injection points This hides the internals of the shmem array lookup, allocating the result in a palloc'd array usable by the caller. --- src/include/utils/injection_point.h | 16 +++++++++ src/backend/utils/misc/injection_point.c | 46 ++++++++++++++++++++++++ src/tools/pgindent/typedefs.list | 1 + 3 files changed, 63 insertions(+) diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h index 6ba64cd1ebc6..3714739a8772 100644 --- a/src/include/utils/injection_point.h +++ b/src/include/utils/injection_point.h @@ -11,6 +11,19 @@ #ifndef INJECTION_POINT_H #define INJECTION_POINT_H +#include "nodes/pg_list.h" + +/* + * Injection point data, used when retrieving a list of all the attached + * injection points. + */ +typedef struct InjectionPointData +{ + const char *name; + const char *library; + const char *function; +} InjectionPointData; + /* * Injection points require --enable-injection-points. */ @@ -46,6 +59,9 @@ extern void InjectionPointCached(const char *name); extern bool IsInjectionPointAttached(const char *name); extern bool InjectionPointDetach(const char *name); +/* Get the current set of injection points attached */ +extern List *InjectionPointList(void); + #ifdef EXEC_BACKEND extern PGDLLIMPORT struct InjectionPointsCtl *ActiveInjectionPoints; #endif diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c index 97ab851f0a63..60b406e4c2e0 100644 --- a/src/backend/utils/misc/injection_point.c +++ b/src/backend/utils/misc/injection_point.c @@ -584,3 +584,49 @@ IsInjectionPointAttached(const char *name) return false; /* silence compiler */ #endif } + +/* + * Retrieve a list of all the injection points currently attached. + * + * This list is palloc'd in the current memory context. + */ +List * +InjectionPointList(void) +{ +#ifdef USE_INJECTION_POINTS + List *inj_points = NIL; + uint32 max_inuse; + + LWLockAcquire(InjectionPointLock, LW_SHARED); + + max_inuse = pg_atomic_read_u32(&ActiveInjectionPoints->max_inuse); + + for (uint32 idx = 0; idx < max_inuse; idx++) + { + InjectionPointEntry *entry; + InjectionPointData *inj_point; + uint64 generation; + + entry = &ActiveInjectionPoints->entries[idx]; + generation = pg_atomic_read_u64(&entry->generation); + + /* skip free slots */ + if (generation % 2 == 0) + continue; + + inj_point = (InjectionPointData *) palloc0(sizeof(InjectionPointData)); + inj_point->name = pstrdup(entry->name); + inj_point->library = pstrdup(entry->library); + inj_point->function = pstrdup(entry->function); + inj_points = lappend(inj_points, inj_point); + } + + LWLockRelease(InjectionPointLock); + + return inj_points; + +#else + elog(ERROR, "Injection points are not supported by this build"); + return NIL; /* keep compiler quiet */ +#endif +} diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index e5879e00dffe..31d5f8f71f46 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -1282,6 +1282,7 @@ InjectionPointCacheEntry InjectionPointCallback InjectionPointCondition InjectionPointConditionType +InjectionPointData InjectionPointEntry InjectionPointsCtl InjectionPointSharedState -- 2.49.0
From ded8b52a4212c32fe8f79ba68fb076332c3aa05d Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@paquier.xyz> Date: Wed, 30 Apr 2025 12:27:33 +0900 Subject: [PATCH 2/3] injection_points: Add function to flush injection point data to disk The function injection_points_flush() can be called in a session to persist to disk all the injection points currently attached to the cluster. These are saved in file called injection_points.data at the root of PGDATA, and loaded by the cluster at startup with all the points registered automatically attached. This will be useful to test scenarios where injection points need to be for example attached at very early stages of startup, before it is possible to attach anything via SQL. --- .../injection_points--1.0.sql | 10 ++ .../injection_points/injection_points.c | 163 ++++++++++++++++++ src/test/modules/injection_points/meson.build | 1 + .../injection_points/t/002_data_persist.pl | 53 ++++++ 4 files changed, 227 insertions(+) create mode 100644 src/test/modules/injection_points/t/002_data_persist.pl diff --git a/src/test/modules/injection_points/injection_points--1.0.sql b/src/test/modules/injection_points/injection_points--1.0.sql index 5d83f08811b0..e99d44507be1 100644 --- a/src/test/modules/injection_points/injection_points--1.0.sql +++ b/src/test/modules/injection_points/injection_points--1.0.sql @@ -3,6 +3,16 @@ -- complain if script is sourced in psql, rather than via CREATE EXTENSION \echo Use "CREATE EXTENSION injection_points" to load this file. \quit +-- +-- injection_points_flush() +-- +-- Flush to disk all the data of the injection points attached. +-- +CREATE FUNCTION injection_points_flush() +RETURNS void +AS 'MODULE_PATHNAME', 'injection_points_flush' +LANGUAGE C STRICT; + -- -- injection_points_attach() -- diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c index ad528d77752b..fb239dca5537 100644 --- a/src/test/modules/injection_points/injection_points.c +++ b/src/test/modules/injection_points/injection_points.c @@ -24,6 +24,7 @@ #include "nodes/value.h" #include "storage/condition_variable.h" #include "storage/dsm_registry.h" +#include "storage/fd.h" #include "storage/ipc.h" #include "storage/lwlock.h" #include "storage/shmem.h" @@ -39,6 +40,14 @@ PG_MODULE_MAGIC; #define INJ_MAX_WAIT 8 #define INJ_NAME_MAXLEN 64 +/* Location of injection point data files, if flush has been requested */ +#define INJ_DUMP_FILE "injection_points.data" +#define INJ_DUMP_FILE_TMP INJ_DUMP_FILE ".tmp" + +/* Magic number identifying the injection file */ +static const uint32 INJ_FILE_HEADER = 0xFF345678; + + /* * Conditions related to injection points. This tracks in shared memory the * runtime conditions under which an injection point is allowed to run, @@ -148,6 +157,9 @@ static void injection_shmem_startup(void) { bool found; + int32 num_inj_points; + uint32 header; + FILE *file; if (prev_shmem_startup_hook) prev_shmem_startup_hook(); @@ -169,6 +181,84 @@ injection_shmem_startup(void) } LWLockRelease(AddinShmemInitLock); + + /* + * Done if some other process already completed the initialization. + */ + if (found) + return; + + /* + * Note: there should be no need to bother with locks here, because there + * should be no other processes running when this code is reached. + */ + + /* Load injection point data, if any has been found while starting up */ + file = AllocateFile(INJ_DUMP_FILE, PG_BINARY_R); + + if (file == NULL) + { + if (errno != ENOENT) + goto error; + + /* No file? We are done. */ + return; + } + + if (fread(&header, sizeof(uint32), 1, file) != 1 || + fread(&num_inj_points, sizeof(int32), 1, file) != 1) + goto error; + + if (header != INJ_FILE_HEADER) + goto error; + + for (int i = 0; i < num_inj_points; i++) + { + const char *name; + const char *library; + const char *function; + uint32 len; + char buf[1024]; + + if (fread(&len, sizeof(uint32), 1, file) != 1) + goto error; + if (fread(buf, 1, len + 1, file) != len + 1) + goto error; + name = pstrdup(buf); + + if (fread(&len, sizeof(uint32), 1, file) != 1) + goto error; + if (fread(buf, 1, len + 1, file) != len + 1) + goto error; + library = pstrdup(buf); + + if (fread(&len, sizeof(uint32), 1, file) != 1) + goto error; + if (fread(buf, 1, len + 1, file) != len + 1) + goto error; + function = pstrdup(buf); + + /* No private data handled here */ + InjectionPointAttach(name, library, function, NULL, 0); + } + + /* + * Remove the persisted injection point file, we do not need it anymore. + */ + unlink(INJ_DUMP_FILE); + FreeFile(file); + + return; + +error: + ereport(LOG, + (errcode_for_file_access(), + errmsg("could not read file \"%s\": %m", + INJ_DUMP_FILE))); + if (file) + FreeFile(file); + + unlink(INJ_DUMP_FILE); } /* @@ -330,6 +420,79 @@ injection_wait(const char *name, const void *private_data) SpinLockRelease(&inj_state->lock); } +/* + * SQL function for flushing injection point data to disk. + */ +PG_FUNCTION_INFO_V1(injection_points_flush); +Datum +injection_points_flush(PG_FUNCTION_ARGS) +{ + FILE *file = NULL; + List *inj_points = NIL; + ListCell *lc; + int32 num_inj_points; + + inj_points = InjectionPointList(); + if (inj_points == NIL) + PG_RETURN_VOID(); + + num_inj_points = list_length(inj_points); + + file = AllocateFile(INJ_DUMP_FILE ".tmp", PG_BINARY_W); + if (file == NULL) + goto error; + + if (fwrite(&INJ_FILE_HEADER, sizeof(uint32), 1, file) != 1) + goto error; + + if (fwrite(&num_inj_points, sizeof(int32), 1, file) != 1) + goto error; + + foreach(lc, inj_points) + { + InjectionPointData *inj_point = lfirst(lc); + uint32 len; + + len = strlen(inj_point->name); + if (fwrite(&len, sizeof(uint32), 1, file) != 1 || + fwrite(inj_point->name, 1, len + 1, file) != len + 1) + goto error; + + len = strlen(inj_point->library); + if (fwrite(&len, sizeof(uint32), 1, file) != 1 || + fwrite(inj_point->library, 1, len + 1, file) != len + 1) + goto error; + + len = strlen(inj_point->function); + if (fwrite(&len, sizeof(uint32), 1, file) != 1 || + fwrite(inj_point->function, 1, len + 1, file) != len + 1) + goto error; + } + + if (FreeFile(file)) + { + file = NULL; + goto error; + } + + /* + * Rename file into place, so we atomically replace any old one. + */ + (void) durable_rename(INJ_DUMP_FILE_TMP, INJ_DUMP_FILE, LOG); + + PG_RETURN_VOID(); + +error: + ereport(LOG, + (errcode_for_file_access(), + errmsg("could not write file \"%s\": %m", + INJ_DUMP_FILE_TMP))); + if (file) + FreeFile(file); + unlink(INJ_DUMP_FILE_TMP); + PG_RETURN_VOID(); +} + /* * SQL function for creating an injection point. */ diff --git a/src/test/modules/injection_points/meson.build b/src/test/modules/injection_points/meson.build index d61149712fd7..b994f8d413d3 100644 --- a/src/test/modules/injection_points/meson.build +++ b/src/test/modules/injection_points/meson.build @@ -56,6 +56,7 @@ tests += { }, 'tests': [ 't/001_stats.pl', + 't/002_data_persist.pl', ], }, } diff --git a/src/test/modules/injection_points/t/002_data_persist.pl b/src/test/modules/injection_points/t/002_data_persist.pl new file mode 100644 index 000000000000..9ecb05230931 --- /dev/null +++ b/src/test/modules/injection_points/t/002_data_persist.pl @@ -0,0 +1,53 @@ + +# Copyright (c) 2024-2025, PostgreSQL Global Development Group + +# Tests for persistence of injection point data. + +use strict; +use warnings FATAL => 'all'; +use locale; + +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; + +# Test persistency of statistics generated for injection points. +if ($ENV{enable_injection_points} ne 'yes') +{ + plan skip_all => 'Injection points not supported by this build'; +} + +# Node initialization +my $node = PostgreSQL::Test::Cluster->new('master'); +$node->init; +$node->append_conf( + 'postgresql.conf', qq( +shared_preload_libraries = 'injection_points' +)); +$node->start; +$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;'); + +# Attach a couple of points, which are going to be made persistent. +$node->safe_psql('postgres', + "SELECT injection_points_attach('persist-notice', 'notice');"); +$node->safe_psql('postgres', + "SELECT injection_points_attach('persist-error', 'error');"); +$node->safe_psql('postgres', + "SELECT injection_points_attach('persist-notice-2', 'notice');"); + +# Flush and restart, the injection points still exist. +$node->safe_psql('postgres', "SELECT injection_points_flush();"); +$node->restart; + +my ($result, $stdout, $stderr) = + $node->psql('postgres', "SELECT injection_points_run('persist-notice-2')"); +ok( $stderr =~ + /NOTICE: notice triggered for injection point persist-notice-2/, + "injection point triggering NOTICE exists"); + +($result, $stdout, $stderr) = + $node->psql('postgres', "SELECT injection_points_run('persist-error')"); +ok($stderr =~ /ERROR: error triggered for injection point persist-error/, + "injection point triggering ERROR exists"); + +done_testing(); -- 2.49.0
From 67136a69201f0fda22cd7c79e5e02e2d1bfc3380 Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@paquier.xyz> Date: Wed, 30 Apr 2025 14:15:23 +0900 Subject: [PATCH 3/3] Add regression test for 2PC visibility check on standby This adds some test coverage for a defect fixed in 2e57790836c6, where the only reliable test back then was to use a hardcoded sleep(). This test relies on an injection point that is persisted across a node restart, so as it is possible to cause the checkpointer to wait when configuring the shared memory state of shared_preload_libraries at a very early startup stage. Reverting 2e57790836c6 causes the test to fail, and it passes on HEAD. --- src/backend/replication/syncrep.c | 3 ++ src/test/recovery/t/009_twophase.pl | 60 +++++++++++++++++++++++++++++ 2 files changed, 63 insertions(+) diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c index cc35984ad008..d02633619c28 100644 --- a/src/backend/replication/syncrep.c +++ b/src/backend/replication/syncrep.c @@ -84,6 +84,7 @@ #include "storage/proc.h" #include "tcop/tcopprot.h" #include "utils/guc_hooks.h" +#include "utils/injection_point.h" #include "utils/ps_status.h" /* User-settable parameters for sync rep */ @@ -968,6 +969,8 @@ SyncRepUpdateSyncStandbysDefined(void) if (sync_standbys_defined != ((WalSndCtl->sync_standbys_status & SYNC_STANDBY_DEFINED) != 0)) { + INJECTION_POINT("checkpointer-syncrep-update"); + LWLockAcquire(SyncRepLock, LW_EXCLUSIVE); /* diff --git a/src/test/recovery/t/009_twophase.pl b/src/test/recovery/t/009_twophase.pl index 1a662ebe499d..f14da1549bac 100644 --- a/src/test/recovery/t/009_twophase.pl +++ b/src/test/recovery/t/009_twophase.pl @@ -51,6 +51,27 @@ $node_paris->append_conf( )); $node_paris->start; +# Check if the extension injection_points is available, as it may be +# possible that this script is run with installcheck, where the module +# would not be installed by default. +my $injection_points_supported = + $node_london->check_extension('injection_points'); +if ($injection_points_supported != 0) +{ + $node_london->safe_psql('postgres', 'CREATE EXTENSION injection_points;'); + + # Set shared_preload_libraries, to allow the injection points to persist + # across restarts. + $node_london->append_conf( + 'postgresql.conf', qq( + shared_preload_libraries = 'injection_points' + )); + $node_paris->append_conf( + 'postgresql.conf', qq( + shared_preload_libraries = 'injection_points' + )); +} + # Switch to synchronous replication in both directions configure_and_reload($node_london, "synchronous_standby_names = 'paris'"); configure_and_reload($node_paris, "synchronous_standby_names = 'london'"); @@ -327,6 +348,23 @@ $cur_primary->psql( INSERT INTO t_009_tbl_standby_mvcc VALUES (2, 'issued to ${cur_primary_name}'); PREPARE TRANSACTION 'xact_009_standby_mvcc'; "); + +# Attach an injection point to wait in the checkpointer when configuring +# the shared memory state data related to synchronous_standby_names, then +# persist the attached point to disk so as the follow-up restart will be able +# to wait at the early stages of the checkpointer startup sequence. +# +# Note that as the checkpointer has already applied the +# synchronous_standby_names configuration, this has no effect until the +# next startup of the primary. +if ($injection_points_supported != 0) +{ + $cur_primary->psql('postgres', + "SELECT injection_points_attach('checkpointer-syncrep-update', 'wait')" + ); + $cur_primary->psql('postgres', "SELECT injection_points_flush()"); +} + $cur_primary->stop; $cur_standby->restart; @@ -341,6 +379,16 @@ is($psql_out, '0', # Commit the transaction in primary $cur_primary->start; + +# Make sure that the checkpointer is waiting before setting up the data of +# synchronous_standby_names in shared memory. We want the checkpointer to be +# stuck and make sure that the next COMMIT PREPARED is detected correctly on +# the standby when remote_apply is set on the primary. +if ($injection_points_supported != 0) +{ + $cur_primary->wait_for_event('checkpointer', + 'checkpointer-syncrep-update'); +} $cur_primary->psql( 'postgres', " SET synchronous_commit='remote_apply'; -- To ensure the standby is caught up @@ -361,6 +409,18 @@ is($psql_out, '2', "Committed prepared transaction is visible to new snapshot in standby"); $standby_session->quit; +# Remove the injection point, the checkpointer now applies the configuration +# related to synchronous_standby_names in shared memory. +if ($injection_points_supported != 0) +{ + $cur_primary->psql('postgres', + "SELECT injection_points_wakeup('checkpointer-syncrep-update')"); + $cur_primary->psql('postgres', + "SELECT injection_points_detach('checkpointer-syncrep-update')"); +} + +$cur_standby->restart; + ############################################################################### # Check for a lock conflict between prepared transaction with DDL inside and # replay of XLOG_STANDBY_LOCK wal record. -- 2.49.0
signature.asc
Description: PGP signature