On Fri, Aug 22, 2025, at 6:57 AM, Zhijie Hou (Fujitsu) wrote:
> The documentation appears incorrect and needs revision. The latest version no
> longer depends on the option order; instead, it requires users to provide
> database-qualified table names, such as -t "db1.sch1.tb1". This adjustment
> allows the command to internally categorize tables by their target database.
>

I don't like this design. No other tool uses three-part names
(database.schema.table). It is also confusing and redundant to have the
database in the --database option and also in the --table option.

I'm wondering whether allowing the use of a specified publication would be a
better UI. If you specify --publication and it exists on the primary, use it.
The current behavior is to fail if the publication exists. This changes the
current behavior, but I don't expect anyone to rely on this failure to abort
the execution. Moreover, the error message was added to allow only FOR ALL
TABLES; the proposal is to relax this restriction.
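
As a rough sketch of the proposed behavior (illustrative Python, not
pg_createsubscriber code; the function name and the returned action labels are
hypothetical, and falling back to creating a FOR ALL TABLES publication when
the name does not exist is an assumption):

```python
def plan_publication(requested_name, existing_publications):
    """Decide what to do with the --publication value.

    requested_name: name given via --publication, or None if not given.
    existing_publications: set of publication names found on the primary.
    Returns a (action, name) tuple; the action labels are hypothetical.
    """
    if requested_name is not None and requested_name in existing_publications:
        # Proposed: reuse the publication as defined by the user
        # instead of failing.
        return ("reuse", requested_name)
    # Assumed fallback: create a FOR ALL TABLES publication. A name of
    # None means the tool would pick the name itself.
    return ("create_for_all_tables", requested_name)
```

With this shape, the set of replicated tables is decided by whoever created
the publication, not by additional command-line options.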

> I think we can explore extending the existing --clean option in a separate
> patch to support table cleanup. This option is implemented in a way that
> allows adding further cleanup objects later, so it should be easy to extend it
> for table. Prior to this extension, it should be noted in the documentation
> that users are required to clean up the tables themselves.
>

I would say that this cleanup feature (starting with cleaning up databases) is
as important as the feature that selects specific objects.

> I agree that supporting row filter and column list is not straightforward, and
> we can consider it separately and do not implement that in the first version.
>

The proposal above would allow it with no additional lines of code, since row
filters and column lists are part of the publication definition.

>>
>> It seems this proposal doesn't serve a general purpose. It is copying a
>> *whole* cluster to use only a subset of tables. Your task with
>> pg_createsubscriber is more expensive than doing a manual logical replication
>> setup. If you have 500 tables and want to replicate only 400 tables, it
>> doesn't seem productive to specify 400 -t options.
>
> Specifying multiple -t options should not be problematic, as users have
> already done similar things for "FOR TABLE" publication DDLs. I think it's not
> hard for users to convert a FOR TABLE list to a -t option list.
>

Of course it is. There is a limit on the size of the argument list that the
shell can pass to a command.
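
Whether a long -t list fits depends on the platform's limit on the total
argument size. A minimal sketch to estimate that, assuming a Unix-like system
where os.sysconf("SC_ARG_MAX") is available; the helper names and the table
names are made up, and -t is the option proposed in this thread:

```python
import os


def table_options(tables):
    """Expand a list of table names into repeated -t options."""
    args = []
    for table in tables:
        args += ["-t", table]
    return args


def argv_bytes(args):
    """Bytes used by the argument strings, each one NUL-terminated."""
    return sum(len(arg.encode("utf-8")) + 1 for arg in args)


if __name__ == "__main__":
    # Hypothetical table names, only to get a realistic size.
    tables = [f"schema_{i}.table_{i}" for i in range(400)]
    args = ["pg_createsubscriber"] + table_options(tables)
    # The limit also has to cover the environment and the pointer
    # arrays, so the real headroom is smaller than this comparison.
    limit = os.sysconf("SC_ARG_MAX")
    print(f"arguments: {argv_bytes(args)} bytes, limit: {limit} bytes")
```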

>> There are some cases, like a small set of big tables, where this feature
>> makes sense. However, I'm wondering if a post script should be used to
>> adjust your setup.
>
> I think it's not very convenient for users to perform this conversion
> manually. I've learned in PGConf.dev this year that some users avoid using
> pg_createsubscriber because they are unsure of the standard steps required to
> convert it into subset table replication. Automating this process would be
> beneficial, enabling more users to use pg_createsubscriber and take advantage
> of the rapid initial table synchronization.
>

You missed my point. I'm not talking about manually converting a physical
replica into a logical replica. I'm talking about the plain logical replication
setup (CREATE PUBLICATION, CREATE SUBSCRIPTION). IME this tool is beneficial
for large clusters in which we want to replicate (almost) all tables.
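
For reference, the plain setup amounts to two statements, one on the publisher
and one on the subscriber. A minimal sketch that builds them (illustrative
Python; the helper names, object names and connection string are made up, and
the literal quoting assumes standard_conforming_strings is on):

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote an SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_publication_sql(pubname, tables):
    """tables is a list of (schema, table) pairs; run on the publisher."""
    table_list = ", ".join(
        quote_ident(schema) + "." + quote_ident(table)
        for schema, table in tables
    )
    return f"CREATE PUBLICATION {quote_ident(pubname)} FOR TABLE {table_list};"


def create_subscription_sql(subname, conninfo, pubname):
    """Run on the subscriber, where the tables must already exist."""
    return (
        f"CREATE SUBSCRIPTION {quote_ident(subname)} "
        f"CONNECTION {quote_literal(conninfo)} "
        f"PUBLICATION {quote_ident(pubname)};"
    )


if __name__ == "__main__":
    print(create_publication_sql("pub1", [("sch1", "tb1"), ("sch1", "tb2")]))
    print(create_subscription_sql("sub1", "host=primary dbname=db1", "pub1"))
```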


-- 
Euler Taveira
EDB   https://www.enterprisedb.com/

