Hi, folks;

I’m in the process of testing PG v12.7 hot streaming standby’s in async mode, 
and specifically in the area of reliable switchover and switchback operations.

I have created test primary’s and standby’s successfully with pg_ctl initdb and 
pg_basebackup respectively. I can switch over/switch roles (standby -> primary, 
primary -> standby) successfully one time. When I switch roles a second time, 
the standby becomes primary just fine but the primary to standby fails to open 
properly. Logfile messages for both primary and standby are attached as well as 
pg_wal and archive directory lists.

I shut down the primary first and then promote the standby. The only action I 
take on the standby side is to create the standby.signal file in $PGDATA. (Well 
almost. Initially, I do have to have an empty 00000000.history file in the 
archive locations or the standby’s fail to start.)

My archive directories are separate from each other and the archive command is 
the given test … cp -p example in the docs. archive_mode is ‘on’ on both sides.

Are there other undocumented manual steps I should be completing before 
starting the new standby?

I would appreciate review of the attached logs to help pinpoint the issue. 
Inst1 is the primary/standby/primary, and inst2 is the standby/primary/standby.

Many thanks!
David
2022-05-18 09:52:52.820 EDT [55785] LOG:  starting PostgreSQL 12.7 on 
x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit
2022-05-18 09:52:52.833 EDT [55785] LOG:  listening on IPv6 address "::1", port 
5432
2022-05-18 09:52:52.833 EDT [55785] LOG:  listening on IPv4 address 
"127.0.0.1", port 5432
2022-05-18 09:52:52.840 EDT [55785] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5432"
2022-05-18 09:52:53.088 EDT [55786] LOG:  database system was shut down at 
2022-05-18 09:52:47 EDT
2022-05-18 09:52:54.543 EDT [55785] LOG:  database system is ready to accept 
connections
2022-05-18 09:56:52.618 EDT [55785] LOG:  received fast shutdown request
2022-05-18 09:56:52.623 EDT [55785] LOG:  aborting any active transactions
2022-05-18 09:56:52.639 EDT [55785] LOG:  background worker "logical 
replication launcher" (PID 55793) exited with exit code 1
2022-05-18 09:56:52.656 EDT [55787] LOG:  shutting down
2022-05-18 09:56:52.841 EDT [55785] LOG:  database system is shut down
2022-05-18 09:57:35.741 EDT [55861] LOG:  starting PostgreSQL 12.7 on 
x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit
2022-05-18 09:57:35.756 EDT [55861] LOG:  listening on IPv6 address "::1", port 
5432
2022-05-18 09:57:35.756 EDT [55861] LOG:  listening on IPv4 address 
"127.0.0.1", port 5432
2022-05-18 09:57:35.764 EDT [55861] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5432"
2022-05-18 09:57:35.980 EDT [55862] LOG:  database system was shut down at 
2022-05-18 09:56:52 EDT
cp: cannot stat '/home/itman/pgdata/archive1/00000002.history': No such file or 
directory
2022-05-18 09:57:36.093 EDT [55862] LOG:  entering standby mode
cp: cannot stat '/home/itman/pgdata/archive1/000000010000000000000004': No such 
file or directory
2022-05-18 09:57:36.232 EDT [55862] LOG:  consistent recovery state reached at 
0/40000A0
2022-05-18 09:57:36.232 EDT [55862] LOG:  invalid record length at 0/40000A0: 
wanted 24, got 0
2022-05-18 09:57:36.859 EDT [55861] LOG:  database system is ready to accept 
read only connections
2022-05-18 09:57:37.298 EDT [55868] LOG:  started streaming WAL from primary at 
0/4000000 on timeline 1
2022-05-18 09:57:37.301 EDT [55862] LOG:  redo starts at 0/40000A0
2022-05-18 09:59:14.673 EDT [55868] LOG:  replication terminated by primary 
server
2022-05-18 09:59:14.673 EDT [55868] DETAIL:  End of WAL reached on timeline 1 
at 0/50000A0.
2022-05-18 09:59:14.673 EDT [55868] FATAL:  could not send end-of-streaming 
message to primary: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
        no COPY in progress
cp: cannot stat '/home/itman/pgdata/archive1/00000002.history': No such file or 
directory
cp: cannot stat '/home/itman/pgdata/archive1/000000010000000000000005': No such 
file or directory
2022-05-18 09:59:14.902 EDT [55862] LOG:  invalid record length at 0/50000A0: 
wanted 24, got 0
2022-05-18 09:59:17.107 EDT [55881] FATAL:  could not connect to the primary 
server: connection to server at "127.0.0.1", port 5433 failed: Connection 
refused
                Is the server running on that host and accepting TCP/IP 
connections?
cp: cannot stat '/home/itman/pgdata/archive1/00000002.history': No such file or 
directory
cp: cannot stat '/home/itman/pgdata/archive1/000000010000000000000005': No such 
file or directory
2022-05-18 09:59:17.396 EDT [55862] LOG:  received promote request
2022-05-18 09:59:17.396 EDT [55862] LOG:  redo done at 0/5000028
2022-05-18 09:59:17.396 EDT [55862] LOG:  last completed transaction was at log 
time 2022-05-18 09:58:23.71503-04
cp: cannot stat '/home/itman/pgdata/archive1/000000010000000000000005': No such 
file or directory
cp: cannot stat '/home/itman/pgdata/archive1/00000002.history': No such file or 
directory
2022-05-18 09:59:17.659 EDT [55862] LOG:  selected new timeline ID: 2
2022-05-18 09:59:17.701 EDT [55862] LOG:  archive recovery complete
cp: cannot stat '/home/itman/pgdata/archive1/00000001.history': No such file or 
directory
2022-05-18 09:59:18.765 EDT [55861] LOG:  database system is ready to accept 
connections
2022-05-18 09:53:35.044 EDT [55811] LOG:  starting PostgreSQL 12.7 on 
x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit
2022-05-18 09:53:35.058 EDT [55811] LOG:  listening on IPv6 address "::1", port 
5433
2022-05-18 09:53:35.058 EDT [55811] LOG:  listening on IPv4 address 
"127.0.0.1", port 5433
2022-05-18 09:53:35.066 EDT [55811] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5433"
2022-05-18 09:53:35.264 EDT [55812] LOG:  database system was interrupted; last 
known up at 2022-05-18 09:53:18 EDT
cp: cannot stat '/home/itman/pgdata/archive2/00000000.history': No such file or 
directory
2022-05-18 09:53:36.007 EDT [55812] FATAL:  recovery target timeline 0 does not 
exist
2022-05-18 09:53:36.023 EDT [55811] LOG:  startup process (PID 55812) exited 
with exit code 1
2022-05-18 09:53:36.023 EDT [55811] LOG:  aborting startup due to startup 
process failure
2022-05-18 09:53:36.032 EDT [55811] LOG:  database system is shut down
2022-05-18 09:54:49.267 EDT [55821] LOG:  starting PostgreSQL 12.7 on 
x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit
2022-05-18 09:54:49.282 EDT [55821] LOG:  listening on IPv6 address "::1", port 
5433
2022-05-18 09:54:49.282 EDT [55821] LOG:  listening on IPv4 address 
"127.0.0.1", port 5433
2022-05-18 09:54:49.289 EDT [55821] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5433"
2022-05-18 09:54:49.551 EDT [55822] LOG:  database system was interrupted; last 
known up at 2022-05-18 09:53:18 EDT
2022-05-18 09:54:50.281 EDT [55822] LOG:  restored log file "00000000.history" 
from archive
2022-05-18 09:54:50.281 EDT [55822] LOG:  entering standby mode
2022-05-18 09:54:50.408 EDT [55822] LOG:  restored log file "00000000.history" 
from archive
cp: cannot stat '/home/itman/pgdata/archive2/000000000000000000000002': No such 
file or directory
2022-05-18 09:54:50.940 EDT [55826] LOG:  started streaming WAL from primary at 
0/2000000 on timeline 1
2022-05-18 09:54:50.982 EDT [55822] LOG:  redo starts at 0/2000028
2022-05-18 09:54:50.983 EDT [55822] LOG:  consistent recovery state reached at 
0/2000100
2022-05-18 09:54:51.697 EDT [55821] LOG:  database system is ready to accept 
read only connections
2022-05-18 09:56:52.819 EDT [55826] LOG:  replication terminated by primary 
server
2022-05-18 09:56:52.819 EDT [55826] DETAIL:  End of WAL reached on timeline 1 
at 0/40000A0.
2022-05-18 09:56:52.819 EDT [55826] FATAL:  could not send end-of-streaming 
message to primary: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
        no COPY in progress
cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000004': No such 
file or directory
2022-05-18 09:56:52.920 EDT [55822] LOG:  invalid record length at 0/40000A0: 
wanted 24, got 0
2022-05-18 09:56:54.461 EDT [55822] LOG:  received promote request
2022-05-18 09:56:54.462 EDT [55846] FATAL:  terminating walreceiver process due 
to administrator command
cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000004': No such 
file or directory
2022-05-18 09:56:54.602 EDT [55822] LOG:  redo done at 0/4000028
2022-05-18 09:56:54.602 EDT [55822] LOG:  last completed transaction was at log 
time 2022-05-18 09:55:26.332136-04
cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000004': No such 
file or directory
2022-05-18 09:56:54.745 EDT [55822] LOG:  selected new timeline ID: 1
2022-05-18 09:56:54.800 EDT [55822] LOG:  archive recovery complete
2022-05-18 09:56:54.933 EDT [55822] LOG:  restored log file "00000000.history" 
from archive
2022-05-18 09:56:55.910 EDT [55821] LOG:  database system is ready to accept 
connections
2022-05-18 09:59:14.514 EDT [55821] LOG:  received fast shutdown request
2022-05-18 09:59:14.519 EDT [55821] LOG:  aborting any active transactions
2022-05-18 09:59:14.535 EDT [55821] LOG:  background worker "logical 
replication launcher" (PID 55856) exited with exit code 1
2022-05-18 09:59:14.536 EDT [55828] LOG:  shutting down
2022-05-18 09:59:14.701 EDT [55821] LOG:  database system is shut down
2022-05-18 10:00:31.541 EDT [55985] LOG:  starting PostgreSQL 12.7 on 
x86_64-pc-cygwin, compiled by gcc (GCC) 10.2.0, 64-bit
2022-05-18 10:00:31.552 EDT [55985] LOG:  listening on IPv6 address "::1", port 
5433
2022-05-18 10:00:31.552 EDT [55985] LOG:  listening on IPv4 address 
"127.0.0.1", port 5433
2022-05-18 10:00:31.559 EDT [55985] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5433"
2022-05-18 10:00:31.753 EDT [55986] LOG:  database system was shut down at 
2022-05-18 09:59:14 EDT
2022-05-18 10:00:31.872 EDT [55986] LOG:  restored log file "00000000.history" 
from archive
2022-05-18 10:00:31.872 EDT [55986] LOG:  entering standby mode
2022-05-18 10:00:32.001 EDT [55986] LOG:  restored log file "00000000.history" 
from archive
cp: cannot stat '/home/itman/pgdata/archive2/000000000000000000000005': No such 
file or directory
2022-05-18 10:00:32.550 EDT [55990] LOG:  fetching timeline history file for 
timeline 2 from primary server
2022-05-18 10:00:32.568 EDT [55990] LOG:  started streaming WAL from primary at 
0/5000000 on timeline 1
2022-05-18 10:00:32.568 EDT [55990] LOG:  replication terminated by primary 
server
2022-05-18 10:00:32.568 EDT [55990] DETAIL:  End of WAL reached on timeline 1 
at 0/50000A0.
2022-05-18 10:00:32.580 EDT [55986] LOG:  consistent recovery state reached at 
0/50000A0
cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000005': No such 
file or directory
2022-05-18 10:00:32.760 EDT [55986] LOG:  invalid record length at 0/50000A0: 
wanted 24, got 0
2022-05-18 10:00:32.761 EDT [55990] LOG:  restarted WAL streaming at 0/5000000 
on timeline 1
2022-05-18 10:00:32.762 EDT [55990] LOG:  replication terminated by primary 
server
2022-05-18 10:00:32.762 EDT [55990] DETAIL:  End of WAL reached on timeline 1 
at 0/50000A0.
2022-05-18 10:00:33.300 EDT [55985] LOG:  database system is ready to accept 
read only connections
cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000005': No such 
file or directory
2022-05-18 10:00:37.713 EDT [55990] LOG:  restarted WAL streaming at 0/5000000 
on timeline 1
2022-05-18 10:00:37.713 EDT [55990] LOG:  replication terminated by primary 
server
2022-05-18 10:00:37.713 EDT [55990] DETAIL:  End of WAL reached on timeline 1 
at 0/50000A0.
cp: cannot stat '/home/itman/pgdata/archive2/000000010000000000000005': No such 
file or directory
2022-05-18 10:00:42.739 EDT [55990] LOG:  restarted WAL streaming at 0/5000000 
on timeline 1
2022-05-18 10:00:42.739 EDT [55990] LOG:  replication terminated by primary 
server
2022-05-18 10:00:42.739 EDT [55990] DETAIL:  End of WAL reached on timeline 1 
at 0/50000A0.
2022-05-18 10:00:45.480 EDT [55985] LOG:  received fast shutdown request
2022-05-18 10:00:45.484 EDT [55985] LOG:  aborting any active transactions
2022-05-18 10:00:45.486 EDT [55990] FATAL:  terminating walreceiver process due 
to administrator command
2022-05-18 10:00:45.507 EDT [55993] LOG:  shutting down
2022-05-18 10:00:45.581 EDT [55985] LOG:  database system is shut down
total 98310
-rw-------  1 itman itman 16777216 May 18 09:53 000000010000000000000001
-rw-------  1 itman itman 16777216 May 18 09:53 000000010000000000000002
-rw-------  1 itman itman      337 May 18 09:53 
000000010000000000000002.00000028.backup
-rw-------  1 itman itman 16777216 May 18 09:56 000000010000000000000003
-rw-------  1 itman itman 16777216 May 18 09:59 000000010000000000000004
-rw-------  1 itman itman 16777216 May 18 09:59 000000010000000000000005.partial
-rw-------  1 itman itman       41 May 18 09:59 00000002.history
drwx------+ 1 itman itman        0 May 18 09:59 archive_status
-rw-------  1 itman itman 16777216 May 18 09:59 000000020000000000000005
total 81926
-rw------- 1 itman itman 16777216 May 18 09:54 000000010000000000000002
-rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000003
-rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000004.partial
-rw------- 1 itman itman       42 May 18 09:56 00000001.history
-rw------- 1 itman itman 16777216 May 18 09:59 000000010000000000000004
-rw------- 1 itman itman        0 May 18 10:00 00000000.history
-rw------- 1 itman itman       41 May 18 10:00 00000002.history
drwx------ 1 itman itman        0 May 18 10:00 archive_status
-rw------- 1 itman itman 16777216 May 18 10:00 000000010000000000000005
total 65538
-rw-r--r-- 1 itman itman        0 May 18 09:54 00000000.history
-rw------- 1 itman itman 16777216 May 18 09:53 000000010000000000000001
-rw------- 1 itman itman 16777216 May 18 09:53 000000010000000000000002
-rw------- 1 itman itman      337 May 18 09:53 
000000010000000000000002.00000028.backup
-rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000003
-rw------- 1 itman itman 16777216 May 18 09:59 000000010000000000000005.partial
-rw------- 1 itman itman       41 May 18 09:59 00000002.history
total 32769
-rw-r--r-- 1 itman itman        0 May 18 09:53 00000000.history
-rw------- 1 itman itman       42 May 18 09:56 00000001.history
-rw------- 1 itman itman 16777216 May 18 09:59 000000010000000000000004
-rw------- 1 itman itman 16777216 May 18 09:56 000000010000000000000004.partial

Reply via email to