Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.

Shubham Khanna Wed, 19 Feb 2025 22:52:18 -0800

On Mon, Feb 17, 2025 at 9:49 AM Peter Smith <smithpb2...@gmail.com> wrote:
>
> Hi Shubham
>
> Some review comments for v7-0001.
>
> (I am late to this thread. If some of my comments have already been
> discussed and rejected please let me know).
>
> ======
> 1 GENERAL. Option Name?
>
> Wondering why the patch is introducing more terminology like
> "cleanup"; if we are dropping publications then why not say "drop"?
> Also I am not sure if "existing" means anything because you cannot
> cleanup/drop something that is not "existing".
>
> IOW, why not call this the "--drop-publications" option?
>


I have retained the option name as '--cleanup-existing-publications'
for now. However, I understand the concern regarding terminology and
will revise it in the next patch version once there is a consensus on
the final name.

> ======
> Commit message
>
> 2.
> These publications, replicated during streaming replication, become redundant
> after converting to logical replication and serve no further purpose.
>
> ~
>
> From this description it seems there is an assumption that the only
> publications on the target server are those that were physically
> replicated to the standby. Is that strictly true? Isn't it also
> possible that a user might have created their own publication on the
> target server prior to running the pg_createsubscriber. So even if
> they want all the physically replicated ones to be removed, they would
> NOT want their own new publication to also get removed at the same
> time.
>
> E.g. The original motivation thread [1] for this patch only said "But
> what good is having the SAME publications as primary also on logical
> replica?" (my emphasis of "same") so it seems there should be some
> sort of name matching before just dropping everything.
>
> Actually.... The code looks like it might be doing the correct thing
> already and only fetching the publication names from the source server
> and then deleting only those names from the target server. But this
> comment message didn't describe this clearly.
>
> ======

Modified the commit message.

> doc/src/sgml/ref/pg_createsubscriber.sgml
>
> 3.
> +      <para>
> +       Remove all existing publications from specified databases on tne 
> target
> +       server.
> +      </para>
>
> typo "tne"
>
> ======

Fixed.

> src/bin/pg_basebackup/pg_createsubscriber.c
>
> struct CreateSubscriberOptions:
>
> 4.
> + bool cleanup_existing_publications; /* drop all publications */
>
> The field name seems overkill. e.g. As mentioned for the option, it
> could be called 'drop' instead of cleanup. And the 'existing' seems
> superfluous because you can only drop something that exists. So why
> not just 'drop_publications'. Won't that have the same meaning?
>
> ~~~
>

Fixed.

> 5.
>  static void
> -setup_subscriber(struct LogicalRepInfo *dbinfo, const char *consistent_lsn)
> +setup_subscriber(struct LogicalRepInfo *dbinfo, const char *consistent_lsn,
> + bool cleanup_existing_publications)
>
> The setup_subscriber's function comment does not say anything about
> this function potentially also dropping publications at the
> subscriber.
>
> ~~~
>

Fixed.

> 6.
>  static void
> -drop_publication(PGconn *conn, struct LogicalRepInfo *dbinfo)
> +drop_publication(PGconn *conn, struct LogicalRepInfo *dbinfo,
> + bool cleanup_existing_publications)
>
> 6a.
> This arrangement seems complicated to me.
>
> IMO it would be more natural to have 2 separate functions, and just
> call the appropriate one.
> drop_publication()
> drop_all_publications()
>
> instead of trying to make this function do everything.
>
> ~
>
> 6b.
> Furthermore, can't you just have a common helper function to DROP a
> single PUBLICATION by name?
>
> Then the code that drops all publications can just loop to call this
> common dropper for each iteration. Code should be much simpler. I
> don't see the efficiency of this operation is really a factor,
> pg_createsubscriber is rarely used, so IMO a better goal is code
> simplicity/maintenance.
>
> e.g. drop_publication() --> _drop_one_publication()
> e.g  drop_all_publications() --> LOOP (pub list) {  _drop_one_publication() }
>
> ======

Fixed.

> .../t/040_pg_createsubscriber.pl
>
> 7.
> +
> +# Create publications to test it's removal
> +$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub FOR ALL TABLES;");
> +$node_p->safe_psql($db1, "CREATE PUBLICATION test_pub2 FOR ALL TABLES;");
> +
>
> /it's/their/
>
> ~~~
>

Fixed.

> 8.
> Maybe also do a CREATE PUBLICATION at node_s, prior to the
> pg_createsubvscript, so then you can verify that the user-created one
> is unaffected by the cleanup of all the others.
>
> ======

Since $node_s is a streaming standby, it does not allow object
creation. As a result, publications cannot be created on $node_s.

> [1] 
> https://www.postgresql.org/message-id/CAExHW5t4ew7ZrgcDdTv7YmuG7LVQT1ZaEny_EvtngHtEBNyjcQ%40mail.gmail.com
>

The attached patch contains the suggested changes.

Thanks and regards,
Shubham Khanna.

v8-0001-Support-for-dropping-all-publications-in-pg_creat.patch
Description: Binary data

Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.

Reply via email to