Dear Dilip, Thank you for reviewing! PSA new version.
> > 1. > Note that slot restoration must be done after the final pg_resetwal command > during the upgrade because pg_resetwal will remove WALs that are required by > the slots. Due to this restriction, the timing of restoring replication slots > is > different from other objects. > > This comment in the commit message is confusing. I understand the > reason but from this, it is not very clear that if resetwal removes > the WAL we needed then why it is good to create after the resetwal. I > think we should make it clear that creating new slot will set the > restart lsn to current WAL location and after that resetwal can remove > those WAL where slot restart lsn is pointing.... Just to confirm - WAL records must not be removed in any time if it is referred as restart_lsn. The reason why the slot creation is done after pg_restwal is that required WALs are not removed by the command. See [1]. Moreover, clarified more in the commit message. > 2. > > + <itemizedlist> > + <listitem> > + <para> > + All slots on the old cluster must be usable, i.e., there are no slots > + whose > + <link > linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield> > wal_status</structfield> > + is <literal>lost</literal>. > + </para> > + </listitem> > + <listitem> > + <para> > + <link > linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>c > onfirmed_flush_lsn</structfield> > + of all slots on the old cluster must be the same as the latest > + checkpoint location. > + </para> > + </listitem> > + <listitem> > + <para> > + The output plugins referenced by the slots on the old cluster must be > + installed in the new PostgreSQL executable directory. > + </para> > + </listitem> > + <listitem> > + <para> > + The new cluster must have > + <link > linkend="guc-max-replication-slots"><varname>max_replication_slots</varna > me></link> > + configured to a value greater than or equal to the number of slots > + present in the old cluster. > + </para> > + </listitem> > + <listitem> > + <para> > + The new cluster must have > + <link > linkend="guc-wal-level"><varname>wal_level</varname></link> as > + <literal>logical</literal>. > + </para> > + </listitem> > + </itemizedlist> > > I think we should also add that the new slot should not have any > permanent existing logical replication slot. Hmm, I wondered it should be really needed. Tables are required not to be in the new cluster too, but not documented. It might be a trivial thing. Anyway, added. FYI - the restriction was not introduced by the patch. I reported independently [2], but no one has responded since now... > 3. > - with the primary.) Replication slots are not copied and must > - be recreated. > + with the primary.) Replication slots on the old standby are not > copied. > + Only logical slots on the primary are migrated to the new standby, > + and other slots must be recreated. > > This paragraph should be rephrased. I mean first stating that > "Replication slots on the old standby are not copied" and then saying > Only logical slots are migrated doesn't seem like the best way. Maybe > we can just say "Only logical slots on the primary are migrated to the > new standby, and other slots must be recreated." Per discussion on [3], I used another words. Thanks for suggesting. > 4. > + /* > + * Raise an ERROR if the logical replication slot is invalidating. It > + * would not happen because max_slot_wal_keep_size is set to -1 during > + * the upgrade, but it stays safe. > + */ > + if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade) > + elog(ERROR, "Replication slots must not be invalidated during the > upgrade."); > > Rephrase the first line as -> Raise an ERROR if the logical > replication slot is invalidating during an upgrade. Per discussion on [3], I used another words. Thanks for suggesting. > 5. > + /* Logical slots can be migrated since PG17. */ > + if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600) > + return; > > > For readability change this to if > (GET_MAJOR_VERSION(old_cluster.major_version) < 1700), because in most > of the checks related to this, we are using 1700 so better be > consistent in this. Per discussion on [3], I did not change here. > 6. > + if (nslots_on_new) > + pg_fatal(ngettext("New cluster must not have logical replication > slots but found %d slot.", > + "New cluster must not have logical replication slots but found %d slots.", > + nslots_on_new), > + nslots_on_new); > ... > + if (PQntuples(res) != 1) > + pg_fatal("could not determine wal_level."); > + > + wal_level = PQgetvalue(res, 0, 0); > + > + if (strcmp(wal_level, "logical") != 0) > + pg_fatal("wal_level must be \"logical\", but is set to \"%s\"", > + wal_level); > > > I have noticed that the case of the first letter in the pg_fatal > message is not consistent. Actually there are some inconsistency even in the check.c file, so I devised below rules. How do you think? * Non-complete sentence starts with the lower case. (e.g., "could not open", "could not determine") * proper nouns are always noted with the lower cases (e.g., "template0 must not allow...", "wal_level must be..."). * Other than above, the sentence starts with the upper case. > 7. > + > + /* Is the slot still usable? */ > + if (slot->invalid) > + { > > Why comment says "Is the slot still usable?" I think it should be "Is > the slot usable?" otherwise it appears that we have first fetched the > slots and now we are refetching it and checking whether it is still > usable. Changed. [1]: https://www.postgresql.org/message-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939%40TYAPR01MB5866.jpnprd01.prod.outlook.com [2]: https://www.postgresql.org/message-id/tyapr01mb5866d277f6bedea4223b3559f5...@tyapr01mb5866.jpnprd01.prod.outlook.com [3]: https://www.postgresql.org/message-id/CAFiTN-vs53SqZiZN1GcSuKLmMY%3D0d14wJDDm1aKmoBONwnqaGg%40mail.gmail.com Best Regards, Hayato Kuroda FUJITSU LIMITED
v35-0001-Persist-logical-slots-to-disk-during-a-shutdown-.patch
Description: v35-0001-Persist-logical-slots-to-disk-during-a-shutdown-.patch
v35-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description: v35-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch