Here are some review comments for patch v42-0001: ======
1. Commit message A later review comment below suggests some changes to the WARNING message so if those changes are made then the example in this commit message also needs to be modified. ====== 2. doc/src/sgml/ref/alter_subscription.sgml - copy_data Refer to the Notes for the usage of true for copy_data parameter and its interaction with the origin parameter. I think saying "usage of true" sounded a bit strange. SUGGESTION Refer to the Notes about how copy_data = true can interact with the origin parameter. ====== 3. doc/src/sgml/ref/create_subscription.sgml - copy data Refer to the Notes for the usage of true for copy_data parameter and its interaction with the origin parameter. SUGGESTION (same as #2) ~~ 4. doc/src/sgml/ref/create_subscription.sgml - origin Refer to the Notes for the usage of true for copy_data parameter and its interaction with the origin parameter. SUGGESTION (same as #2) ~~~ 5. doc/src/sgml/ref/create_subscription.sgml - Notes + <para> + If the subscription is created with <literal>origin = none</literal> and + <literal>copy_data = true</literal>, it will check if the publisher has + subscribed to the same table from other publishers and, if so, throw a + warning to notify user to check the publisher tables. The user can ensure "throw a warning to notify user" -> "log a warning to notify the user" ~~ 6. + warning to notify user to check the publisher tables. The user can ensure + that publisher tables do not have data which has an origin associated before + continuing with any other operations to prevent inconsistent data being + replicated. + </para> 6a. I'm not so sure about this. IMO the warning is not about the replication – really it is about the COPY which has already happened anyway. So the user can't really prevent anything from going wrong; instead, they have to take some action to clean up if anything did go wrong. SUGGESTION Before continuing with other operations the user should check that publisher tables did not have data with different origins, otherwise inconsistent data may have been copied. ~ 6b. I am also wondering what can the user do now. Assuming there was bad COPY then the subscriber table has already got unwanted stuff copied into it. Is there any advice we can give to help users fix this mess? ====== 7. src/backend/commands/subscriptioncmds.c - AlterSubscription_refresh + /* Check whether we can allow copy of newly added relations. */ + check_pub_table_subscribed(wrconn, sub->publications, copy_data, + sub->origin, subrel_local_oids, + subrel_count); "whether we can allow" seems not quite the right wording here anymore, because now there is no ERROR stopping this - so if there was unwanted data the COPY will proceed to copy it anyhow... ~~~ 8. src/backend/commands/subscriptioncmds.c - check_pub_table_subscribed @@ -1781,6 +1794,121 @@ AlterSubscriptionOwner_oid(Oid subid, Oid newOwnerId) table_close(rel, RowExclusiveLock); } +/* + * Check and throw a warning if the publisher has subscribed to the same table + * from some other publisher. This check is required only if "copy_data = true" + * and "origin = none" for CREATE SUBSCRIPTION and + * ALTER SUBSCRIPTION ... REFRESH statements to notify user that data having + * origin might have been copied. "throw a warning" -> "log a warning" "to notify user" -> "to notify the user" ? ~~~ 9. + if (copydata == false || !origin || + (pg_strcasecmp(origin, LOGICALREP_ORIGIN_NONE) != 0)) + return; SUGGESTION if (!copydata || !origin || (pg_strcasecmp(origin, LOGICALREP_ORIGIN_NONE) != 0)) ~~~ 10. (Question) I can't tell just by reading the code if FOR ALL TABLES case is handled – e.g. will this recognise the case were the publisher might have table data from other origins because it has a subscription on some other node that was publishing "FOR ALL TABLES"? ~~~ 11. + ereport(WARNING, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("publisher has subscribed table \"%s.%s\" from some other publisher", + nspname, relname), + errdetail("Publisher might have subscribed one or more tables from some other publisher."), + errhint("Verify that these publisher tables do not have data that has an origin associated before proceeding to avoid inconsistency.")); + break; 11a. I'm not sure having that "break" code is correct logic. Won't that mean the user will only get a warning for the first potential problem encountered but then other potential problems tables will have no warnings at all so the user may not be made aware of them? Perhaps you need to first gather the list of all the suspicious tables before logging a single warning which shows that list? Or perhaps you need to log multiple warnings – one warning per suspicious table? ~ 11b. The WARNING message seems a bit inside-out because the errmsg is giving more details than the errdetail. SUGGESTIONS (or something similar) errmsg - "subscription XXX requested origin=NONE but may have copied data that had a different origin." errdetail – Publisher YYY has subscribed table \"%s.%s\" from some other publisher" ~ 11c. The errhint sentence is unusual. BEFORE "Verify that these publisher tables do not have data that has an origin associated before proceeding to avoid inconsistency." SUGGESTION (or something similar) Before proceeding, verify that initial data copied from the publisher tables did not come from other origins. ====== 12. src/test/subscription/t/030_origin.pl +# have remotely originated data from node_A. We throw a warning, in this case, +# to draw attention to there being possible remote data. "throw a warning" -> "log a warning" (this occurs 2x) ------ Kind Regards, Peter Smith. Fujitsu Australia