Dear Nathan,

Thank you for making the patch! I tested it, and it basically worked well.

About the following part:
```
 			ConfigReloadPending = false;
 			ProcessConfigFile(PGC_SIGHUP);
+			now = GetCurrentTimestamp();
+			for (int i = 0; i < NUM_LRW_WAKEUPS; i++)
+				LogRepWorkerComputeNextWakeup(i, now);
+
+			/*
+			 * If a wakeup time for starting sync workers was set, just set it
+			 * to right now.  It will be recalculated as needed.
+			 */
+			if (next_sync_start != PG_INT64_MAX)
+				next_sync_start = now;
 		}
```

Do we have to recalculate the next wakeup times when the subscriber receives a SIGHUP signal? I think this may cause an unexpected change in behavior, as follows.

Assume that wal_receiver_timeout is 60s and wal_sender_timeout on the publisher is 0 (or the network between the nodes is disconnected), and that we send a SIGHUP signal to the subscriber's postmaster every 20s.

Currently, last_recv_time is updated when the worker receives messages, and that value is used to decide when to send a ping. The worker exits if the walsender does not reply.

With your patch, however, the apply worker recalculates wakeup[LRW_WAKEUP_PING] and wakeup[LRW_WAKEUP_TERMINATE] every time it gets SIGHUP. Because each signal pushes both deadlines forward before they are reached, the worker never sends a ping with requestReply = true and never exits due to the timeout.

My case may seem contrived, but there may be other issues if this behavior remains.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
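
For reference, the scenario above can be sketched as a small standalone simulation. This is only an illustration, not code from the patch: `SimResult` and `simulate()` are hypothetical names, time is counted in whole seconds, no message ever arrives from the publisher, and I assume the ping deadline is half of wal_receiver_timeout and the terminate deadline is the full timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of one simulated run; not a PostgreSQL type. */
typedef struct SimResult
{
	int			ping_time;		/* second the ping was sent, -1 if never */
	int			terminate_time; /* second the worker timed out, -1 if never */
} SimResult;

/*
 * Simulate an apply worker that receives nothing from the publisher.
 *
 * Assumptions (for illustration only): the ping is due at half of the
 * receiver timeout and termination at the full timeout, both initially
 * measured from t = 0.  If recompute_on_sighup is true, both deadlines are
 * recomputed from "now" on every SIGHUP, as in the hunk quoted above.
 */
static SimResult
simulate(int receiver_timeout, int sighup_interval, int duration,
		 bool recompute_on_sighup)
{
	SimResult	res = {-1, -1};
	int			ping_at = receiver_timeout / 2;
	int			terminate_at = receiver_timeout;

	assert(receiver_timeout > 0 && sighup_interval > 0);

	for (int now = 1; now <= duration; now++)
	{
		/* A SIGHUP arrives every sighup_interval seconds. */
		if (recompute_on_sighup && now % sighup_interval == 0)
		{
			ping_at = now + receiver_timeout / 2;
			terminate_at = now + receiver_timeout;
		}

		/* Send the ping once its deadline has passed. */
		if (res.ping_time < 0 && now >= ping_at)
			res.ping_time = now;

		/* Give up once the terminate deadline has passed. */
		if (now >= terminate_at)
		{
			res.terminate_time = now;
			break;
		}
	}
	return res;
}
```

Calling `simulate(60, 20, 600, false)` models deadlines tied to the last receive time, and `simulate(60, 20, 600, true)` models recomputing them on every SIGHUP. Under the assumptions above, the first run pings and then times out, while the second does neither within the simulated ten minutes.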