On Thu, Apr 07, 2022 at 08:34:50PM +0200, Tomas Vondra wrote:
> I've pushed a revert af all the commits related to this - decoding of
> sequences and test_decoding / built-in replication changes.
Two July buildfarm runs failed with PANIC during standby promotion:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2022-07-19%2004%3A13%3A18
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2022-07-31%2011%3A33%3A13

The attached patch hacks things so an ordinary x86_64 GNU/Linux machine
reproduces this consistently.  "git bisect" then traced the regression to the
above revert commit (2c7ea57e56ca5f668c32d4266e0a3e45b455bef5).  The pg_ctl
test suite passes under this hack in all supported branches, and it passed on
v15 until that revert.  Would you investigate?

The buildfarm animal uses keep_error_builds.  From the kept data directories,
I deduced these events:

- After the base backup, auto-analyze ran on the primary and wrote WAL.
- The standby streamed and wrote WAL up to 0/301FFF.
- The standby received the promote signal and terminated streaming.  The WAL
  page at 0/302000 remained all-zeros.
- Somehow, end-of-recovery became a PANIC.

Key portions from the buildfarm logs:

=== good run standby2 log
2022-07-21 22:55:16.860 UTC [25034912:5] LOG: received promote request
2022-07-21 22:55:16.878 UTC [26804682:2] FATAL: terminating walreceiver process due to administrator command
2022-07-21 22:55:16.878 UTC [25034912:6] LOG: invalid record length at 0/3000060: wanted 24, got 0
2022-07-21 22:55:16.878 UTC [25034912:7] LOG: redo done at 0/3000028 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.42 s
2022-07-21 22:55:16.878 UTC [25034912:8] LOG: selected new timeline ID: 2
2022-07-21 22:55:17.004 UTC [25034912:9] LOG: archive recovery complete
2022-07-21 22:55:17.005 UTC [23724044:1] LOG: checkpoint starting: force
2022-07-21 22:55:17.008 UTC [14549364:4] LOG: database system is ready to accept connections
2022-07-21 22:55:17.093 UTC [23724044:2] LOG: checkpoint complete: wrote 3 buffers (2.3%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.019 s, sync=0.001 s, total=0.089 s; sync files=0, longest=0.000 s, average=0.000 s; distance=16384 kB, estimate=16384 kB
2022-07-21 22:55:17.143 UTC [27394418:1] [unknown] LOG: connection received: host=[local]
2022-07-21 22:55:17.144 UTC [27394418:2] [unknown] LOG: connection authorized: user=nm database=postgres application_name=003_promote.pl
2022-07-21 22:55:17.147 UTC [27394418:3] 003_promote.pl LOG: statement: SELECT pg_is_in_recovery()
2022-07-21 22:55:17.148 UTC [27394418:4] 003_promote.pl LOG: disconnection: session time: 0:00:00.005 user=nm database=postgres host=[local]
2022-07-21 22:55:58.301 UTC [14549364:5] LOG: received immediate shutdown request
2022-07-21 22:55:58.337 UTC [14549364:6] LOG: database system is shut down

=== failed run standby2 log, with my annotations
2022-07-19 05:28:22.136 UTC [7340406:5] LOG: received promote request
2022-07-19 05:28:22.163 UTC [8519860:2] FATAL: terminating walreceiver process due to administrator command
2022-07-19 05:28:22.166 UTC [7340406:6] LOG: invalid magic number 0000 in log segment 000000010000000000000003, offset 131072

New compared to the good run.  XLOG_PAGE_MAGIC didn't match.  This implies
the WAL ended at a WAL page boundary; see the arithmetic sketch below the
log excerpts.

2022-07-19 05:28:22.166 UTC [7340406:7] LOG: redo done at 0/301F168 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.18 s
2022-07-19 05:28:22.166 UTC [7340406:8] LOG: last completed transaction was at log time 2022-07-19 05:28:13.956716+00

New compared to the good run.  The good run had no transactions to replay;
the bad run replayed records from an auto-analyze.
2022-07-19 05:28:22.166 UTC [7340406:9] PANIC: invalid record length at 0/301F168: wanted 24, got 0

There was more WAL overall in the bad run, due to the auto-analyze.  End of
recovery was wrongly treated as a PANIC.

2022-07-19 05:28:22.583 UTC [8388800:4] LOG: startup process (PID 7340406) was terminated by signal 6: IOT/Abort trap
2022-07-19 05:28:22.584 UTC [8388800:5] LOG: terminating any other active server processes
2022-07-19 05:28:22.587 UTC [8388800:6] LOG: shutting down due to startup process failure
2022-07-19 05:28:22.627 UTC [8388800:7] LOG: database system is shut down

Let me know if I've left out details you want; I may be able to dig more out
of the buildfarm artifacts.
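For reference, here is the arithmetic behind the "ended at a WAL page
boundary" reading above, as a minimal standalone C sketch.  It is not part of
the attached patch, and it assumes the default 8 kB XLOG_BLCKSZ and 16 MB
wal_segment_size; the variable names are just for illustration.

/* illustrative only: show which offsets land exactly on a WAL page boundary */
#include <stdio.h>

#define XLOG_BLCKSZ   8192                /* assumed default WAL block size */
#define WAL_SEG_SIZE  (16 * 1024 * 1024)  /* assumed default wal_segment_size */

int
main(void)
{
    /* failed run: "invalid magic number" reported at segment 3, offset 131072 */
    unsigned    bad_offset = 131072;
    /* good run: end of WAL at LSN 0/3000060, i.e. offset 0x60 into segment 3 */
    unsigned    good_offset = 0x3000060 % WAL_SEG_SIZE;
    /* cutoff forced by the attached hack: LSN 0/3022000 */
    unsigned    hack_offset = 0x03022000 % WAL_SEG_SIZE;

    printf("failed run: %u %% %d = %u (exact page boundary)\n",
           bad_offset, XLOG_BLCKSZ, bad_offset % XLOG_BLCKSZ);
    printf("good run:   %u %% %d = %u (mid-page)\n",
           good_offset, XLOG_BLCKSZ, good_offset % XLOG_BLCKSZ);
    printf("hack:       %u %% %d = %u (exact page boundary)\n",
           hack_offset, XLOG_BLCKSZ, hack_offset % XLOG_BLCKSZ);
    return 0;
}

Since offset 131072 is an exact page multiple, the reader was looking at the
header of a never-written, all-zeros page, whose xlp_magic field reads as
0000.  The hack's cutoff is likewise a page boundary, which is how it
recreates the condition the buildfarm hit.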
Author: Noah Misch <n...@leadboat.com>
Commit: Noah Misch <n...@leadboat.com>

diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 8604fd4..6232237 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -885,6 +885,12 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr, TimeLineID tli)
 {
     int         startoff;
     int         byteswritten;
+    bool        spin_after = false;
+    /* force writing to end at a certain WAL page boundary */
+    if (recptr + nbytes > 0x03022000) {
+        nbytes = 0x03022000 - recptr;
+        spin_after = true;
+    }
 
     Assert(tli != 0);
 
@@ -943,6 +949,8 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr, TimeLineID tli)
         LogstreamResult.Write = recptr;
     }
 
+    while (spin_after)      /* we forced writing to end; wait for termination */
+        ProcessWalRcvInterrupts();
     /* Update shared-memory status */
     pg_atomic_write_u64(&WalRcv->writtenUpto, LogstreamResult.Write);
 
diff --git a/src/bin/pg_ctl/t/003_promote.pl b/src/bin/pg_ctl/t/003_promote.pl
index 84d28f4..57ae317 100644
--- a/src/bin/pg_ctl/t/003_promote.pl
+++ b/src/bin/pg_ctl/t/003_promote.pl
@@ -29,6 +29,7 @@ command_fails_like(
     [ 'pg_ctl', '-D', $node_primary->data_dir, 'promote' ],
     qr/not in standby mode/,
     'pg_ctl promote of primary instance fails');
+if (1) { ok(1) for 1 .. 3; } else { # skip tests this way to avoid merge conflicts
 
 my $node_standby = PostgreSQL::Test::Cluster->new('standby');
 $node_primary->backup('my_backup');
@@ -46,12 +47,17 @@ ok( $node_standby->poll_query_until(
         'postgres',
         'SELECT NOT pg_is_in_recovery()'),
     'promoted standby is not in recovery');
+}
+my $node_standby;
+$node_primary->backup('my_backup');
 
 # same again with default wait option
 $node_standby = PostgreSQL::Test::Cluster->new('standby2');
 $node_standby->init_from_backup($node_primary, 'my_backup',
     has_streaming => 1);
 $node_standby->start;
+$node_primary->safe_psql('postgres', 'ANALYZE');
+
 is($node_standby->safe_psql('postgres', 'SELECT pg_is_in_recovery()'),
     't', 'standby is in recovery');