At Wed, 26 Jan 2022 18:45:53 -0500, Andrew Dunstan <and...@dunslane.net> wrote in > It's very unlikely any of this is your fault. In any case, intermittent > failures are very hard to nail down.
The primary starts at 2022-01-26 16:30:06.613 the accepted a connectin from the standby at 2022-01-26 16:30:09.911. P: 2022-01-26 16:30:06.613 UTC [74668:1] LOG: starting PostgreSQL 15devel on x86_64-w64-mingw32, compiled by gcc.exe (Rev2, Built by MSYS2 project) 10.3.0, 64-bit S: 2022-01-26 16:30:09.637 UTC [72728:1] LOG: starting PostgreSQL 15devel on x86_64-w64-mingw32, compiled by gcc.exe (Rev2, Built by MSYS2 project) 10.3.0, 64-bit P: 2022-01-26 16:30:09.911 UTC [74096:3] [unknown] LOG: replication connection authorized: user=pgrunner application_name=standby S: 2022-01-26 16:30:09.912 UTC [73932:1] LOG: started streaming WAL from primary at 0/3000000 on timeline 1 After this the primary restarts. P: 2022-01-26 16:30:11.832 UTC [74668:7] LOG: database system is shut down P: 2022-01-26 16:30:12.010 UTC [72140:1] LOG: starting PostgreSQL 15devel on x86_64-w64-mingw32, compiled by gcc.exe (Rev2, Built by MSYS2 project) 10.3.0, 64-bit But the standby doesn't notice the disconnection and continue with the poll_query_until() on the stale connection. But the query should have executed after reconnection finally. The following log lines are not thinned out. S: 2022-01-26 16:30:09.912 UTC [73932:1] LOG: started streaming WAL from primary at 0/3000000 on timeline 1 S: 2022-01-26 16:30:12.825 UTC [72760:1] [unknown] LOG: connection received: host=127.0.0.1 port=60769 S: 2022-01-26 16:30:12.830 UTC [72760:2] [unknown] LOG: connection authenticated: identity="EC2AMAZ-P7KGG90\\pgrunner" method=sspi (C:/tools/msys64/home/pgrunner/bf/root/HEAD/pgsql.build/src/test/modules/commit_ts/tmp_check/t_003_standby_2_standby_data/pgdata/pg_hba.conf:2) S: 2022-01-26 16:30:12.830 UTC [72760:3] [unknown] LOG: connection authorized: user=pgrunner database=postgres application_name=003_standby_2.pl S: 2022-01-26 16:30:12.838 UTC [72760:4] 003_standby_2.pl LOG: statement: SELECT '0/303C7D0'::pg_lsn <= pg_last_wal_replay_lsn() Since the standby dones't receive WAL from the restarted server so the poll_query_until() doesn't return. I'm not sure why walsender of the standby continued running not knowing the primary has been once dead for such a long time. regards. -- Kyotaro Horiguchi NTT Open Source Software Center