> On May 21, 2021, at 10:39 PM, Bharath Rupireddy
> <bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Sat, May 22, 2021 at 1:49 AM Mark Dilger
> <mark.dil...@enterprisedb.com> wrote:
>>
>> -hackers,
>>
>> I think commit 82ed7748b710e3ddce3f7ebc74af80fe4869492f created some
>> confusion that should be cleaned up before release. I'd like some guidance
>> on what the intended behavior is before I submit a patch for this, though:
>>
>> +ALTER SUBSCRIPTION mysubscription SET PUBLICATION nosuchpub WITH (copy_data
>> = false, refresh = false);
>> +ALTER SUBSCRIPTION mysubscription ADD PUBLICATION nosuchpub WITH (copy_data
>> = false, refresh = false);
>> +ALTER SUBSCRIPTION mysubscription DROP PUBLICATION nosuchpub WITH
>> (copy_data = false, refresh = false);
>> +ERROR: unrecognized subscription parameter: "copy_data"
>> +ALTER SUBSCRIPTION mysubscription SET (copy_data = false, refresh = false);
>> +ERROR: unrecognized subscription parameter: "copy_data"
>>
>> First, it's quite odd to say that "copy_data" is unrecognized in the third
>> and fourth ALTER commands when it was recognized just fine in the first two.
>
> For ALTER SUBSCRIPTION ... DROP PUBLICATION, copy_data option is not
> required actually, because it doesn't add new publications. If the
> concern here is "why refresh is allowed but not copy_data", then the
> answer is "with the refresh option the updated publications can be
> refreshed, this avoids users to run REFRESH PUBLICATION after DROP
> PUBLICATION". So, disallowing copy_data makes sense to me.
My concern isn't that the code is doing the wrong thing, but that the docs and
the error messages are confusing. This is particularly troubling given that
having a single action which combines the dropping of one publication with the
refreshing of other publications is not particularly intuitive.
I agree that disallowing copy_data DROP PUBLICATION is a reasonable design
choice, but I do not agree that this prohibition is intuitive. If I want to
copy the data for a set of tables on a remote server, and only copy that data
exactly once, I might be looking for an atomic action to do so. The docs are
totally unclear on whether this is supported, so I might try:
CREATE SUBSCRIPTION tempsub CONNECTION 'dbname=remotedb' PUBLICATION
remotepub WITH (connect = false, enabled = false, slot_name = NONE, create_slot
= false);
ALTER SUBSCRIPTION tempsub DROP PUBLICATION remotepub WITH (refresh = true,
copy_data = true);
with the intention that the data will be copied right before the publication is
dropped. When I get an error that says 'unrecognized subscription parameter:
"copy_data"', I'm likely to think I mistyped the parameter name, not that it is
disallowed in this setting. If I then decide to just drop the publication
(since my experiment didn't work) and try to do so using:
ALTER SUBSCRIPTION tempsub DROP PUBLICATION remotepub WITH (refresh = false,
copy_data = false);
I seem to be playing by the rules, since I am explicitly not requesting
"copy_data". That's what the "false" means. But again, the command complains
that "copy_data" is unrecognized. At this point, I go back to the docs and it
clearly says that "copy_data" is a supported parameter in this command. I'm
totally confused.
I think the docs should say that "copy_data" is not allowed for DROP
PUBLICATION. I think no error should occur for copy_data = false. For
copy_data = true, I think the error message should say that copy_data is
disallowed during a DROP PUBLICATION, rather than saying that the parameter is
unrecognized.
> For ALTER SUBSCRIPTION ... SET, allowed options are slot_name,
> synchronous_commit, binary and streaming which are part of
> pg_subscription catalog and will be used by apply/sync workers.
> Whereas copy_data and refresh are not part of pg_subscription catalog
> and are not used by apply/sync workers (directly), but by the backend.
> We have ALTER SUBSCRIPTION .. REFRESH specifically for refresh and
> copy_data options.
>
>> More than that, though, the docs in doc/src/sgml/ref/alter_subscription.sgml
>> refer to this part of the grammar in the first three ALTER commands as a
>> "set_publication_option", not as a "subscription_parameter", a term which is
>> only used in the grammar for other forms of the ALTER command. Per the
>> grammar in the docs, "copy_data" is not a valid set_publication_option, only
>> "refresh" is.
>
> set_publication_option - options are refresh and copy_data (this
> option comes implicitly, please see the note "Additionally, refresh
> options as described under REFRESH PUBLICATION may be specified.",
> under refresh_option we have copy_data)
>
> subscription_parameter - options are slot_name, synchronous_commit,
> binary, and streaming. This is correct.
>
>> Should the first three ALTER commands fail with an error about "copy_data"
>> being an invalid set_publication_option? Should they succeed, in which case
>> the docs should mention that "refresh" is not the only valid
>> set_publication_option?
>
> No that's not correct. As I said above, set_publication_option options
> are both refresh and copy_data.
Well, not really. We're using the phrase "set_publication_option" for all
three of SET PUBLICATION, ADD PUBLICATION, and DROP PUBLICATION. Since that's
not really supported, we should use it only for the first two, and have a
separate "drop_publication_option" for the third.
>> Something else, perhaps?
>
> Unless I misunderstood any of your concerns, I think the existing docs
> and the code looks correct to me.
Thanks for your response. The docs and error messages still don't look right
to me.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company