On Tue, Dec 30, 2025 at 02:34:21PM +0700, Alena Vinter wrote:
> I’m now back on track and have refined the implementation based on earlier
> discussions. The current version fully adopts the `include_if_exists`
> approach, writes temporary recovery settings to a separate config file, and
> disables it on exit by renaming to `.disabled`.

It seems roughly OK, so I have put my hands on it.  A couple of notes
regarding the things done in the updated version attached:
- Addition of two tests to check pg_createsubscriber.conf.disabled in
the data folders of node S and K.
- More description in the tests.
- The "dry run mode" node has disappeared from the recovery parameter
StringInfo, so added it back at the top of the parameters generated.
- Missing a newline after the include_if_exists.
- dirable_rename() logs already something on failure, I see no need
for an extra warning to say the same.  Adding the warning telling that
a manual intervention may be required is good, though.
- Let's group the document change with the main patch.
- More stylistic changes, comments and code.
- The new test fails if we undo the changes in pg_createsubscriber.c,
as we'd want, in the shape of node K FATAL-ing due to an incorrect
recovery configuration fed to the node.

Alena, what do you think?
--
Michael
From e4601823d84b8226f3a230ddb131c00dfdd8b46b Mon Sep 17 00:00:00 2001
From: Alena Vinter <[email protected]>
Date: Tue, 30 Dec 2025 13:56:18 +0700
Subject: [PATCH v10] pg_createsubscriber: use include_if_exists for recovery
 config and disable on exit

Write temporary recovery parameters to pg_createsubscriber.conf and
include it from postgresql.auto.conf via include_if_exists.  Upon
completion or failure, rename the file to
pg_createsubscriber.conf.disabled so it is ignored on subsequent
restarts.
---
 src/bin/pg_basebackup/pg_createsubscriber.c   | 100 ++++++++++++++----
 .../t/040_pg_createsubscriber.pl              |  46 ++++++++
 doc/src/sgml/ref/pg_createsubscriber.sgml     |  12 ++-
 3 files changed, 132 insertions(+), 26 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index 9a825bfd166e..3ce24860c04b 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -20,6 +20,7 @@
 
 #include "common/connect.h"
 #include "common/controldata_utils.h"
+#include "common/file_utils.h"
 #include "common/logging.h"
 #include "common/pg_prng.h"
 #include "common/restricted_token.h"
@@ -33,6 +34,21 @@
 #define	DEFAULT_SUB_PORT	"50432"
 #define	OBJECTTYPE_PUBLICATIONS  0x0001
 
+/*
+ * Configuration files for recovery parameters.
+ *
+ * The recovery parameters are set in INCLUDED_CONF_FILE, itself loaded by
+ * the server through an include_if_exists in postgresql.auto.conf.
+ *
+ * INCLUDED_CONF_FILE is renamed to INCLUDED_CONF_FILE_DISABLED when exiting,
+ * so as the recovery parameters set by this tool never take effect on node
+ * restart.  The contents of INCLUDED_CONF_FILE_DISABLED can be useful for
+ * debugging.
+ */
+#define PG_AUTOCONF_FILENAME		"postgresql.auto.conf"
+#define INCLUDED_CONF_FILE			"pg_createsubscriber.conf"
+#define INCLUDED_CONF_FILE_DISABLED	INCLUDED_CONF_FILE ".disabled"
+
 /* Command-line options */
 struct CreateSubscriberOptions
 {
@@ -156,22 +172,43 @@ static char *subscriber_dir = NULL;
 
 static bool recovery_ended = false;
 static bool standby_running = false;
+static bool recovery_params_set = false;
 
 
 /*
- * Cleanup objects that were created by pg_createsubscriber if there is an
- * error.
+ * Clean up objects created by pg_createsubscriber.
  *
- * Publications and replication slots are created on primary. Depending on the
- * step it failed, it should remove the already created objects if it is
- * possible (sometimes it won't work due to a connection issue).
- * There is no cleanup on the target server. The steps on the target server are
- * executed *after* promotion, hence, at this point, a failure means recreate
- * the physical replica and start again.
+ * Publications and replication slots are created on the primary.  Depending
+ * on the step where it failed, already-created objects should be removed if
+ * possible (sometimes this won't work due to a connection issue).
+ * There is no cleanup on the target server *after* its promotion, because any
+ * failure at this point means recreating the physical replica and starting
+ * again.
+ *
+ * The recovery configuration is always removed, by renaming the included
+ * configuration file out of the way.
  */
 static void
 cleanup_objects_atexit(void)
 {
+	/* Rename the included configuration file, if necessary. */
+	if (recovery_params_set)
+	{
+		char		conf_filename[MAXPGPATH];
+		char		conf_filename_disabled[MAXPGPATH];
+
+		snprintf(conf_filename, MAXPGPATH, "%s/%s", subscriber_dir,
+				 INCLUDED_CONF_FILE);
+		snprintf(conf_filename_disabled, MAXPGPATH, "%s/%s", subscriber_dir,
+				 INCLUDED_CONF_FILE_DISABLED);
+
+		if (durable_rename(conf_filename, conf_filename_disabled) != 0)
+		{
+			/* durable_rename() has already logged something. */
+			pg_log_warning_hint("A manual removal of the recovery parameters may be required.");
+		}
+	}
+
 	if (success)
 		return;
 
@@ -1303,6 +1340,10 @@ setup_recovery(const struct LogicalRepInfo *dbinfo, const char *datadir, const c
 	 * targets (name, time, xid, LSN).
 	 */
 	recoveryconfcontents = GenerateRecoveryConfig(conn, NULL, NULL);
+
+	if (dry_run)
+		appendPQExpBufferStr(recoveryconfcontents, "# dry run mode\n");
+
 	appendPQExpBufferStr(recoveryconfcontents, "recovery_target = ''\n");
 	appendPQExpBufferStr(recoveryconfcontents,
 						 "recovery_target_timeline = 'latest'\n");
@@ -1322,23 +1363,36 @@ setup_recovery(const struct LogicalRepInfo *dbinfo, const char *datadir, const c
 	appendPQExpBufferStr(recoveryconfcontents, "recovery_target_name = ''\n");
 	appendPQExpBufferStr(recoveryconfcontents, "recovery_target_time = ''\n");
 	appendPQExpBufferStr(recoveryconfcontents, "recovery_target_xid = ''\n");
-
-	if (dry_run)
-	{
-		appendPQExpBufferStr(recoveryconfcontents, "# dry run mode\n");
-		appendPQExpBuffer(recoveryconfcontents,
-						  "recovery_target_lsn = '%X/%08X'\n",
-						  LSN_FORMAT_ARGS((XLogRecPtr) InvalidXLogRecPtr));
-	}
-	else
-	{
-		appendPQExpBuffer(recoveryconfcontents, "recovery_target_lsn = '%s'\n",
-						  lsn);
-		WriteRecoveryConfig(conn, datadir, recoveryconfcontents);
-	}
-	disconnect_database(conn, false);
+	appendPQExpBuffer(recoveryconfcontents, "recovery_target_lsn = '%s'\n", lsn);
 
 	pg_log_debug("recovery parameters:\n%s", recoveryconfcontents->data);
+
+	if (!dry_run)
+	{
+		char		conf_filename[MAXPGPATH];
+		FILE	   *fd;
+
+		/* Write the recovery parameters to INCLUDED_CONF_FILE */
+		snprintf(conf_filename, MAXPGPATH, "%s/%s", datadir,
+				 INCLUDED_CONF_FILE);
+		fd = fopen(conf_filename, "w");
+		if (fd == NULL)
+			pg_fatal("could not open file \"%s\": %m", conf_filename);
+
+		if (fwrite(recoveryconfcontents->data, recoveryconfcontents->len, 1, fd) != 1)
+			pg_fatal("could not write to file \"%s\": %m", conf_filename);
+
+		fclose(fd);
+		recovery_params_set = true;
+
+		/* Include conditionally the recovery parameters. */
+		resetPQExpBuffer(recoveryconfcontents);
+		appendPQExpBufferStr(recoveryconfcontents,
+							 "include_if_exists '" INCLUDED_CONF_FILE "'\n");
+		WriteRecoveryConfig(conn, datadir, recoveryconfcontents);
+	}
+
+	disconnect_database(conn, false);
 }
 
 /*
diff --git a/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl b/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl
index a905183b738e..e1de946488ed 100644
--- a/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl
+++ b/src/bin/pg_basebackup/t/040_pg_createsubscriber.pl
@@ -160,6 +160,7 @@ primary_slot_name = '$slotname'
 primary_conninfo = '$pconnstr dbname=postgres'
 hot_standby_feedback = on
 ]);
+my $sconnstr = $node_s->connstr;
 $node_s->set_standby_mode();
 $node_s->start;
 
@@ -472,6 +473,11 @@ command_ok(
 	],
 	'run pg_createsubscriber on node S');
 
+# Check that included file is renamed after success.
+my $node_s_datadir = $node_s->data_dir;
+ok( -f "$node_s_datadir/pg_createsubscriber.conf.disabled",
+	"pg_createsubscriber.conf.disabled exists in node S");
+
 # Confirm the physical replication slot has been removed
 $result = $node_p->safe_psql($db1,
 	"SELECT count(*) FROM pg_replication_slots WHERE slot_name = '$slotname'"
@@ -579,10 +585,50 @@ is($result, qq($db1|{test_pub3}
 $db2|{pub2}),
 	"subscriptions use the correct publications");
 
+# Verify that node K, set as a standby, is able to start correctly without
+# the recovery configuration written by pg_createsubscriber interfering.
+# This node is created from node S, where pg_createsubscriber has been run.
+
+# Create a physical standby from the promoted subscriber
+$node_s->safe_psql('postgres',
+	"SELECT pg_create_physical_replication_slot('$slotname');");
+
+# Create backup from promoted subscriber
+$node_s->backup('backup_3');
+
+# Initialize new physical standby
+my $node_k = PostgreSQL::Test::Cluster->new('node_k');
+$node_k->init_from_backup($node_s, 'backup_3', has_streaming => 1);
+
+my $node_k_datadir = $node_k->data_dir;
+ok( -f "$node_k_datadir/pg_createsubscriber.conf.disabled",
+	"pg_createsubscriber.conf.disabled exists in node K");
+
+# Configure the new standby
+$node_k->append_conf(
+	'postgresql.conf', qq[
+primary_slot_name = '$slotname'
+primary_conninfo = '$sconnstr dbname=postgres'
+hot_standby_feedback = on
+]);
+
+$node_k->set_standby_mode();
+my $node_k_name = $node_s->name;
+command_ok(
+	[
+		'pg_ctl', '--wait',
+		'--pgdata' => $node_k->data_dir,
+		'--log' => $node_k->logfile,
+		'--options' => "--cluster-name=$node_k_name",
+		'start'
+	],
+	"node K has started");
+
 # clean up
 $node_p->teardown_node;
 $node_s->teardown_node;
 $node_t->teardown_node;
 $node_f->teardown_node;
+$node_k->teardown_node;
 
 done_testing();
diff --git a/doc/src/sgml/ref/pg_createsubscriber.sgml b/doc/src/sgml/ref/pg_createsubscriber.sgml
index e450c6a5b376..cf45ff3573d1 100644
--- a/doc/src/sgml/ref/pg_createsubscriber.sgml
+++ b/doc/src/sgml/ref/pg_createsubscriber.sgml
@@ -518,8 +518,11 @@ PostgreSQL documentation
 
     <step>
      <para>
-      Write recovery parameters into the target data directory and restart the
-      target server.  It specifies an LSN (<xref
+      Write recovery parameters into the separate configuration file
+      <filename>pg_createsubscriber.conf</filename> that is included from
+      <filename>postgresql.auto.conf</filename> using
+      <literal>include_if_exists</literal> in the target data directory, then
+      restart the target server.  It specifies an LSN (<xref
       linkend="guc-recovery-target-lsn"/>) of the write-ahead log location up
       to which recovery will proceed.  It also specifies
       <literal>promote</literal> as the action that the server should take
@@ -532,7 +535,10 @@ PostgreSQL documentation
       server ends standby mode and is accepting read-write transactions.  If
       <option>--recovery-timeout</option> option is set,
       <application>pg_createsubscriber</application> terminates if recovery
-      does not end until the given number of seconds.
+      does not end until the given number of seconds.  Upon completion, the
+      included configuration file is renamed to
+      <filename>pg_createsubscriber.conf.disabled</filename> so as it is no
+      longer loaded on subsequent restarts.
      </para>
     </step>
 
-- 
2.51.0

Attachment: signature.asc
Description: PGP signature

Reply via email to