Dear Amit, Michael,

> > I am not sure to get the reason why get_old_cluster_logical_slot_infos()
> > could not be optimized, TBH.  LogicalReplicationSlotHasPendingWal()
> > uses the fast forward mode where no changes are generated, hence there
> > should be no need for a dependency to a connection to a specific
> > database :)
> >
> > Combined to a hash table based on the database name and/or OID to know
> > to which dbinfo to attach the information of a slot, then it should be
> > possible to use one query, making the slot info gathering closer to
> > O(N) rather than the current O(N^2).
> >
>
> The point is that unlike subscriptions logical slots are not
> cluster-level objects. So, this needs more careful design decisions
> rather than a fix-up patch for PG-17. One more thing after collecting
> slot-level, we also want to consider the creation of slots which again
> are created at per-database level.
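
To make the single-query idea quoted above concrete, here is a rough sketch of how rows from one cluster-wide query could be attached to per-database entries through a hash table keyed on the database name. This is not pg_upgrade code: SketchDbInfo, SketchSlotRow, attach_slots() and the fixed-size open-addressing table are all hypothetical, simplified stand-ins, and the shape of the query result (one row per slot with its database name) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS_PER_DB 8
#define TABLE_SIZE 16		/* power of two; must exceed the number of databases */

/* Hypothetical, simplified stand-in for a per-database entry. */
typedef struct
{
	const char *db_name;
	const char *slots[MAX_SLOTS_PER_DB];
	int			nslots;
} SketchDbInfo;

/* One row of the assumed cluster-wide result: (database name, slot name). */
typedef struct
{
	const char *db_name;
	const char *slot_name;
} SketchSlotRow;

/* FNV-1a hash of a NUL-terminated string. */
static uint32_t
hash_name(const char *s)
{
	uint32_t	h = 2166136261u;

	for (; *s; s++)
	{
		h ^= (unsigned char) *s;
		h *= 16777619u;
	}
	return h;
}

/* Linear probing: return the bucket holding "name", or the first empty one. */
static SketchDbInfo **
find_bucket(SketchDbInfo **table, const char *name)
{
	uint32_t	i = hash_name(name) & (TABLE_SIZE - 1);

	while (table[i] != NULL && strcmp(table[i]->db_name, name) != 0)
		i = (i + 1) & (TABLE_SIZE - 1);
	return &table[i];
}

/*
 * Attach every row to its database in one pass over the rows.
 * Returns the number of rows attached; rows for unknown databases
 * (or beyond MAX_SLOTS_PER_DB) are skipped.
 */
static int
attach_slots(SketchDbInfo *dbs, int ndbs, const SketchSlotRow *rows, int nrows)
{
	SketchDbInfo *table[TABLE_SIZE] = {NULL};
	int			attached = 0;

	assert(ndbs < TABLE_SIZE);	/* keeps at least one bucket empty */

	for (int i = 0; i < ndbs; i++)
		*find_bucket(table, dbs[i].db_name) = &dbs[i];

	for (int i = 0; i < nrows; i++)
	{
		SketchDbInfo *db = *find_bucket(table, rows[i].db_name);

		if (db == NULL || db->nslots >= MAX_SLOTS_PER_DB)
			continue;
		db->slots[db->nslots++] = rows[i].slot_name;
		attached++;
	}
	return attached;
}
```

Note that in such a scheme each row has to carry its database name, so the slot data effectively becomes cluster-wide before it is split per database again.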

I also considered how this would combine with the proposed optimization
(parallelization) of pg_upgrade [1]. IIUC, that patch connects to several
databases in parallel and runs commands on each. The current style of
create_logical_replication_slots() can be adapted to it easily because the
work is already divided per database.

However, if we change get_old_cluster_logical_slot_infos() to work in a
single pass, we may have to turn LogicalSlotInfoArr into cluster-wide data
and store the database name in LogicalSlotInfo. Also, in
create_logical_replication_slots(), we may have to look up the database
each slot belongs to and connect to the appropriate one. These changes would
make the operation difficult to parallelize.

[1]: https://www.postgresql.org/message-id/flat/20240516211638.GA1688936@nathanxps13

Best regards,
Hayato Kuroda
FUJITSU LIMITED