On Sat, Dec 19, 2020 at 12:10 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Dec 18, 2020 at 6:41 PM Peter Smith <smithpb2...@gmail.com> wrote: > > > > I understand why you are trying to create this patch atop logical > decoding of 2PC patch but I think it is better to create this as an > independent patch and then use it to test 2PC problem. Also, please > explain what kind of testing you did to ensure that it works properly > after the table sync worker restarts after the crash. >
Few other comments: ================== 1. * FIXME 3 - Crashed tablesync workers may also have remaining slots because I don't think + * such workers are even iterated by this loop, and nobody else is removing them. + */ + if (slotname) + { The above FIXME is not clear to me. Actually, the crashed workers should restart, finish their work, and drop the slots. So not sure what exactly this FIXME refers to? 2. DropSubscription() { .. ReplicationSlotDropAtPubNode( + NULL, + conninfo, /* use conninfo to make a new connection. */ + subname, + syncslotname); .. } With the above call, it will form a connection with the publisher and drop the required slots. I think we need to save the connection info so that we don't need to connect/disconnect for each slot to be dropped. Later in this function, we again connect and drop the apply worker slot. I think we should connect just once drop the apply and table sync slots if any. 3. ReplicationSlotDropAtPubNode(WalReceiverConn *wrconn_given, char *conninfo, char *subname, char *slotname) { .. + PG_TRY(); .. + PG_CATCH(); + { + /* NOP. Just gobble any ERROR. */ + } + PG_END_TRY(); Why are we suppressing the error instead of handling it the error in the same way as we do while dropping the apply worker slot in DropSubscription? 4. @@ -139,6 +141,28 @@ finish_sync_worker(void) get_rel_name(MyLogicalRepWorker->relid)))); CommitTransactionCommand(); + /* + * Cleanup the tablesync slot. + */ + { + extern void ReplicationSlotDropAtPubNode( + WalReceiverConn *wrconn_given, char *conninfo, char *subname, char *slotname); This is not how we export functions at other places? -- With Regards, Amit Kapila.