Buildfarm runs have triggered the assertion at the end of SyncRepGetSyncStandbysPriority():

 sysname  │      snapshot       │    branch     │ bfurl
──────────┼─────────────────────┼───────────────┼──────────────────────────────────────────────────────────────────────────────────────────────
 hoverfly │ 2019-11-22 12:15:08 │ HEAD          │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoverfly&dt=2019-11-22%2012%3A15%3A08
 hoverfly │ 2019-11-07 17:19:12 │ REL9_6_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoverfly&dt=2019-11-07%2017%3A19%3A12
 nightjar │ 2019-08-13 23:04:41 │ REL_10_STABLE │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nightjar&dt=2019-08-13%2023%3A04%3A41
 skink    │ 2018-11-28 21:03:35 │ HEAD          │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2018-11-28%2021%3A03%3A35

On my development system, this delay injection reproduces the failure:

--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -399,6 +399,8 @@ SyncRepInitConfig(void)
 {
 	int			priority;
+	pg_usleep(100 * 1000);

SyncRepInitConfig() is the function responsible for updating, after SIGHUP,
the sync_standby_priority values that SyncRepGetSyncStandbysPriority()
consults.  The assertion holds if each walsender's sync_standby_priority (in
shared memory) accounts for the latest synchronous_standby_names GUC value.
That ceases to hold for brief moments after a SIGHUP that changes the
synchronous_standby_names GUC value.

I think the way to fix this is to nominate one process to update all
sync_standby_priority values after SIGHUP.  That process should acquire
SyncRepLock once per ProcessConfigFile(), not once per walsender.  If
walsender startup occurs at roughly the same time as a SIGHUP, the new
walsender should avoid computing sync_standby_priority based on a GUC value
different from the one used for the older walsenders.

Would anyone like to fix this?  I could add it to my queue, but it would wait
a year or more.

Thanks,
nm
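
For illustration, here is a rough standalone sketch of the single-updater idea described above. It is not a patch against syncrep.c and uses no PostgreSQL APIs. The names Shared, Sender, apply_config() and register_sender() are hypothetical, and the "lock" is a plain counter standing in for SyncRepLock so that the number of acquisitions can be observed. The assumptions are that one process recomputes every priority under a single lock acquisition per reload, and that a newly started sender derives its priority from the same stored list rather than from its own copy of the setting.

```c
#include <assert.h>
#include <string.h>

#define MAX_SENDERS 4           /* hypothetical slot count */
#define MAX_NAMES   4           /* hypothetical limit on listed standbys */

typedef struct
{
    int         in_use;
    const char *name;           /* standby name reported by this sender */
    int         priority;       /* 1-based list position, 0 if not listed */
} Sender;

typedef struct
{
    int         lock_held;          /* toy stand-in for SyncRepLock */
    int         lock_acquisitions;  /* counts how often the lock was taken */
    int         generation;         /* bumped once per applied config */
    const char *names[MAX_NAMES];   /* the one list everybody uses */
    int         n_names;
    Sender      senders[MAX_SENDERS];
} Shared;

static void
sketch_lock(Shared *s)
{
    assert(!s->lock_held);
    s->lock_held = 1;
    s->lock_acquisitions++;
}

static void
sketch_unlock(Shared *s)
{
    assert(s->lock_held);
    s->lock_held = 0;
}

/* Position of name in the stored list (1-based), or 0 if absent. */
static int
priority_for(const Shared *s, const char *name)
{
    for (int i = 0; i < s->n_names; i++)
        if (strcmp(s->names[i], name) == 0)
            return i + 1;
    return 0;
}

/* Run by the one nominated process: a single lock acquisition per reload. */
void
apply_config(Shared *s, const char **names, int n)
{
    assert(n <= MAX_NAMES);
    sketch_lock(s);
    s->n_names = n;
    for (int i = 0; i < n; i++)
        s->names[i] = names[i];
    s->generation++;
    for (int i = 0; i < MAX_SENDERS; i++)
        if (s->senders[i].in_use)
            s->senders[i].priority = priority_for(s, s->senders[i].name);
    sketch_unlock(s);
}

/*
 * A starting sender takes its priority from the stored list, under the same
 * lock, so it cannot disagree with the older senders.  Returns the slot
 * index, or -1 if no slot is free.
 */
int
register_sender(Shared *s, const char *name)
{
    int slot = -1;

    sketch_lock(s);
    for (int i = 0; i < MAX_SENDERS; i++)
    {
        if (!s->senders[i].in_use)
        {
            s->senders[i].in_use = 1;
            s->senders[i].name = name;
            s->senders[i].priority = priority_for(s, name);
            slot = i;
            break;
        }
    }
    sketch_unlock(s);
    return slot;
}
```

Because every priority is written while the same lock is held, a reader that takes that lock never sees a mix of priorities computed from two different lists, which is the condition the assertion relies on.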