On Mon, May 5, 2025 at 11:39 AM David G. Johnston
<david.g.johns...@gmail.com> wrote:
>
> On Sun, May 4, 2025 at 8:45 PM Nisha Moond <nisha.moond...@gmail.com> wrote:
>>
>> Attached is the patch implementing the above proposed solution.
>> Reviews and feedback are most welcome.
>
>
> I feel like this is just papering over the issue - which is that these two 
> drop functions are being used for multiple differently named 
> publications/slots yet take no care to ensure they only change 
> made_publication and made_repslot if the name of the object being passed in 
> matches the name of the object those two booleans are specifically tracking 
> (the application created objects on the publisher).
>
> Make it so they are only changed to false if the name matches the one the 
> program created and the connection is the primary connection.  That targets 
> the real issue and avoids using a branching boolean parameter.
>
> It seems really odd to say: if (in_cleanup) "don't try again" - since by 
> definition this is the last thing we are doing before we exit.  So really 
> what this patch can do more simply is just remove the 
> dbinfo->made_replslot=false and *made_publication=false lines altogether - 
> which might be a valid option.
>

+1 to removing the dbinfo->made_replslot=false and
*made_publication=false lines. In my tests, I attempted to force
multiple failures, but couldn’t find any case where
cleanup_objects_atexit() would recurse or cause repeated cleanup if
these flags remain set to true.

> I'm partial to the latter really, I don't think the error message output for 
> retrying a drop that may have previously failed would be an issue.
>

As of now, we don’t attempt to drop the same object more than once, so
the latter approach does seem reasonable to me. That said, I’m unsure
why the flags were being reset here in the first place.

Please find the updated patch which removes the false setting of these
flags during drop. If there’s a case I’ve overlooked where this might
be problematic, we can certainly go for your first suggestion to match
the names.

--
Thanks,
Nisha
From 6d50f40d8bd6adc693c607b46600a46fe8337a75 Mon Sep 17 00:00:00 2001
From: Nisha Moond <nisha.moond412@gmail.com>
Date: Tue, 6 May 2025 09:34:14 +0530
Subject: [PATCH v2] Fix incorrect cleanup flag handling in pg_createsubscriber

The flags made_publication and made_replslot track whether the tool
created a publication or replication slot on the primary. These are
used during error handling to clean up internal objects on primary.

Previously, these flags were incorrectly reset to false when failures
occurred while dropping objects(publications/replication slots) on the
subscriber. As a result, upon a failure, the cleanup_objects_atexit()
skipped cleanup of these objects on the primary.

This patch removes the resetting of these cleanup flags during
drop_replication_slot and drop_publication, ensuring proper cleanup of
primary objects before exit on failure.
---
 src/bin/pg_basebackup/pg_createsubscriber.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_createsubscriber.c b/src/bin/pg_basebackup/pg_createsubscriber.c
index f65acc7cb11..20a8ad8ed4d 100644
--- a/src/bin/pg_basebackup/pg_createsubscriber.c
+++ b/src/bin/pg_basebackup/pg_createsubscriber.c
@@ -1433,7 +1433,6 @@ drop_replication_slot(PGconn *conn, struct LogicalRepInfo *dbinfo,
 		{
 			pg_log_error("could not drop replication slot \"%s\" in database \"%s\": %s",
 						 slot_name, dbinfo->dbname, PQresultErrorMessage(res));
-			dbinfo->made_replslot = false;	/* don't try again. */
 		}
 
 		PQclear(res);
@@ -1701,7 +1700,6 @@ drop_publication(PGconn *conn, const char *pubname, const char *dbname,
 		{
 			pg_log_error("could not drop publication \"%s\" in database \"%s\": %s",
 						 pubname, dbname, PQresultErrorMessage(res));
-			*made_publication = false;	/* don't try again. */
 
 			/*
 			 * Don't disconnect and exit here. This routine is used by primary
-- 
2.34.1

Reply via email to