Re: A failure in 031_recovery_conflict.pl on Debian/s390x

Christoph Berg Wed, 09 Aug 2023 03:56:28 -0700

Re: Thomas Munro
> On Wed, Aug 9, 2023 at 2:01 AM Christoph Berg <m...@debian.org> wrote:
> > Putting that patch on top of v8 made it pass 294 times before exiting
> > like this:
> >
> > [08:52:34.134](0.032s) ok 1 - buffer pin conflict: cursor with conflicting 
> > pin established
> > Waiting for replication conn standby's replay_lsn to pass 0/3430000 on 
> > primary
> > done
> > timed out waiting for match: (?^:User was holding shared buffer pin for too 
> > long) at t/031_recovery_conflict.pl line 318.
> 
> Can you reproduce that with logging like this added on top?


603 iterations later it hit again, but didn't log anything. (I believe
I did run "make" in the right directory.)

[22:20:24.714](3.145s) # issuing query via background psql:
#     BEGIN;
#     DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM 
test_recovery_conflict_table1;
#     FETCH FORWARD FROM test_recovery_conflict_cursor;
[22:20:24.745](0.031s) ok 1 - buffer pin conflict: cursor with conflicting pin 
established
Waiting for replication conn standby's replay_lsn to pass 0/3430000 on primary
done
timed out waiting for match: (?^:User was holding shared buffer pin for too 
long) at t/031_recovery_conflict.pl line 318.

Perhaps this can simply be attributed to the machine being too busy.

Christoph

2023-08-08 22:20:22.732 UTC [778626] LOG:  starting PostgreSQL 17devel (Debian 
17~~devel-1) on s390x-ibm-linux-gnu, compiled by gcc (Debian 13.2.0-1) 13.2.0, 
64-bit
2023-08-08 22:20:22.732 UTC [778626] LOG:  listening on Unix socket 
"/tmp/S8kif7vda0/.s.PGSQL.59145"
2023-08-08 22:20:22.733 UTC [778630] LOG:  database system was shut down at 
2023-08-08 22:20:22 UTC
2023-08-08 22:20:22.755 UTC [778626] LOG:  database system is ready to accept 
connections
2023-08-08 22:20:22.832 UTC [778639] 031_recovery_conflict.pl LOG:  statement: 
CREATE TABLESPACE test_recovery_conflict_tblspc LOCATION ''
2023-08-08 22:20:22.964 UTC [778644] 031_recovery_conflict.pl LOG:  received 
replication command: SHOW data_directory_mode
2023-08-08 22:20:22.964 UTC [778644] 031_recovery_conflict.pl STATEMENT:  SHOW 
data_directory_mode
2023-08-08 22:20:22.969 UTC [778644] 031_recovery_conflict.pl LOG:  received 
replication command: SHOW wal_segment_size
2023-08-08 22:20:22.969 UTC [778644] 031_recovery_conflict.pl STATEMENT:  SHOW 
wal_segment_size
2023-08-08 22:20:22.975 UTC [778644] 031_recovery_conflict.pl LOG:  received 
replication command: IDENTIFY_SYSTEM
2023-08-08 22:20:22.975 UTC [778644] 031_recovery_conflict.pl STATEMENT:  
IDENTIFY_SYSTEM
2023-08-08 22:20:22.991 UTC [778644] 031_recovery_conflict.pl LOG:  received 
replication command: BASE_BACKUP ( LABEL 'pg_basebackup base backup',  
PROGRESS,  CHECKPOINT 'fast',  WAIT 0,  MANIFEST 'yes',  TARGET 'client')
2023-08-08 22:20:22.991 UTC [778644] 031_recovery_conflict.pl STATEMENT:  
BASE_BACKUP ( LABEL 'pg_basebackup base backup',  PROGRESS,  CHECKPOINT 'fast', 
 WAIT 0,  MANIFEST 'yes',  TARGET 'client')
2023-08-08 22:20:22.998 UTC [778628] LOG:  checkpoint starting: immediate force 
wait
2023-08-08 22:20:23.006 UTC [778628] LOG:  checkpoint complete: wrote 7 buffers 
(5.5%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 
s, total=0.008 s; sync files=0, longest=0.000 s, average=0.000 s; 
distance=11350 kB, estimate=11350 kB; lsn=0/2000060, redo lsn=0/2000028
2023-08-08 22:20:23.027 UTC [778654] 031_recovery_conflict.pl LOG:  received 
replication command: SHOW data_directory_mode
2023-08-08 22:20:23.027 UTC [778654] 031_recovery_conflict.pl STATEMENT:  SHOW 
data_directory_mode
2023-08-08 22:20:23.035 UTC [778654] 031_recovery_conflict.pl LOG:  received 
replication command: CREATE_REPLICATION_SLOT "pg_basebackup_778654" TEMPORARY 
PHYSICAL ( RESERVE_WAL)
2023-08-08 22:20:23.035 UTC [778654] 031_recovery_conflict.pl STATEMENT:  
CREATE_REPLICATION_SLOT "pg_basebackup_778654" TEMPORARY PHYSICAL ( RESERVE_WAL)
2023-08-08 22:20:23.073 UTC [778654] 031_recovery_conflict.pl LOG:  received 
replication command: IDENTIFY_SYSTEM
2023-08-08 22:20:23.073 UTC [778654] 031_recovery_conflict.pl STATEMENT:  
IDENTIFY_SYSTEM
2023-08-08 22:20:23.140 UTC [778654] 031_recovery_conflict.pl LOG:  received 
replication command: START_REPLICATION SLOT "pg_basebackup_778654" 0/2000000 
TIMELINE 1
2023-08-08 22:20:23.140 UTC [778654] 031_recovery_conflict.pl STATEMENT:  
START_REPLICATION SLOT "pg_basebackup_778654" 0/2000000 TIMELINE 1
2023-08-08 22:20:23.679 UTC [778644] 031_recovery_conflict.pl LOG:  temporary 
file: path "base/pgsql_tmp/pgsql_tmp778644.0", size 137324
2023-08-08 22:20:23.679 UTC [778644] 031_recovery_conflict.pl STATEMENT:  
BASE_BACKUP ( LABEL 'pg_basebackup base backup',  PROGRESS,  CHECKPOINT 'fast', 
 WAIT 0,  MANIFEST 'yes',  TARGET 'client')
2023-08-08 22:20:24.216 UTC [778709] standby LOG:  received replication 
command: IDENTIFY_SYSTEM
2023-08-08 22:20:24.216 UTC [778709] standby STATEMENT:  IDENTIFY_SYSTEM
2023-08-08 22:20:24.216 UTC [778709] standby LOG:  received replication 
command: START_REPLICATION 0/3000000 TIMELINE 1
2023-08-08 22:20:24.216 UTC [778709] standby STATEMENT:  START_REPLICATION 
0/3000000 TIMELINE 1
2023-08-08 22:20:24.377 UTC [778711] 031_recovery_conflict.pl LOG:  statement: 
CREATE DATABASE test_db
2023-08-08 22:20:24.427 UTC [778722] 031_recovery_conflict.pl LOG:  statement: 
CREATE TABLE test_recovery_conflict_table1(a int, b int);
2023-08-08 22:20:24.430 UTC [778722] 031_recovery_conflict.pl LOG:  statement: 
INSERT INTO test_recovery_conflict_table1 SELECT i % 3, 0 FROM 
generate_series(1,20) i;
2023-08-08 22:20:24.442 UTC [778722] 031_recovery_conflict.pl LOG:  statement: 
CREATE TABLE test_recovery_conflict_table2(a int, b int);
2023-08-08 22:20:24.469 UTC [778740] 031_recovery_conflict.pl LOG:  statement: 
SELECT pg_current_wal_flush_lsn()
2023-08-08 22:20:24.479 UTC [778743] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/342B100' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 22:20:24.675 UTC [778767] 031_recovery_conflict.pl LOG:  statement: 
BEGIN;
2023-08-08 22:20:24.675 UTC [778767] 031_recovery_conflict.pl LOG:  statement: 
INSERT INTO test_recovery_conflict_table1 VALUES (1,0);
2023-08-08 22:20:24.676 UTC [778767] 031_recovery_conflict.pl LOG:  statement: 
ROLLBACK;
2023-08-08 22:20:24.676 UTC [778767] 031_recovery_conflict.pl LOG:  statement: 
BEGIN;
2023-08-08 22:20:24.676 UTC [778767] 031_recovery_conflict.pl LOG:  statement: 
LOCK test_recovery_conflict_table1;
2023-08-08 22:20:24.676 UTC [778767] 031_recovery_conflict.pl LOG:  statement: 
COMMIT;
2023-08-08 22:20:24.692 UTC [778773] 031_recovery_conflict.pl LOG:  statement: 
SELECT pg_current_wal_flush_lsn()
2023-08-08 22:20:24.710 UTC [778783] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/342B1C0' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 22:20:25.051 UTC [778797] 031_recovery_conflict.pl LOG:  statement: 
VACUUM test_recovery_conflict_table1;
2023-08-08 22:20:25.142 UTC [778883] 031_recovery_conflict.pl LOG:  statement: 
SELECT pg_current_wal_flush_lsn()
2023-08-08 22:20:25.173 UTC [778886] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/3430000' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 22:20:25.344 UTC [778912] 031_recovery_conflict.pl LOG:  statement: 
SELECT '0/3430000' <= replay_lsn AND state = 'streaming'
                 FROM pg_catalog.pg_stat_replication
                 WHERE application_name IN ('standby', 'walreceiver')
2023-08-08 22:23:50.441 UTC [778626] LOG:  received immediate shutdown request
2023-08-08 22:23:50.456 UTC [796024] [unknown] FATAL:  the database system is 
shutting down
2023-08-08 22:23:50.955 UTC [796036] [unknown] FATAL:  the database system is 
shutting down
2023-08-08 22:23:50.988 UTC [778626] LOG:  database system is shut down

2023-08-08 22:20:24.198 UTC [778701] LOG:  starting PostgreSQL 17devel (Debian 
17~~devel-1) on s390x-ibm-linux-gnu, compiled by gcc (Debian 13.2.0-1) 13.2.0, 
64-bit
2023-08-08 22:20:24.198 UTC [778701] LOG:  listening on Unix socket 
"/tmp/S8kif7vda0/.s.PGSQL.59146"
2023-08-08 22:20:24.200 UTC [778707] LOG:  database system was interrupted; 
last known up at 2023-08-08 22:20:23 UTC
2023-08-08 22:20:24.200 UTC [778707] LOG:  entering standby mode
2023-08-08 22:20:24.201 UTC [778707] LOG:  redo starts at 0/2000028
2023-08-08 22:20:24.201 UTC [778707] WARNING:  unexpected directory entry 
"16384" found in pg_tblspc/
2023-08-08 22:20:24.201 UTC [778707] DETAIL:  All directory entries in 
pg_tblspc/ should be symbolic links.
2023-08-08 22:20:24.201 UTC [778707] HINT:  Remove those directories, or set 
allow_in_place_tablespaces to ON transiently to let recovery complete.
2023-08-08 22:20:24.201 UTC [778707] LOG:  consistent recovery state reached at 
0/2000100
2023-08-08 22:20:24.204 UTC [778701] LOG:  database system is ready to accept 
read-only connections
2023-08-08 22:20:24.216 UTC [778708] LOG:  started streaming WAL from primary 
at 0/3000000 on timeline 1
2023-08-08 22:20:24.715 UTC [778760] 031_recovery_conflict.pl LOG:  statement: 
BEGIN;
2023-08-08 22:20:24.720 UTC [778760] 031_recovery_conflict.pl LOG:  statement: 
DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM 
test_recovery_conflict_table1;
2023-08-08 22:20:24.726 UTC [778760] 031_recovery_conflict.pl LOG:  statement: 
FETCH FORWARD FROM test_recovery_conflict_cursor;
2023-08-08 22:20:24.741 UTC [778760] 031_recovery_conflict.pl LOG:  statement: ;
2023-08-08 22:23:50.442 UTC [778708] FATAL:  could not receive data from WAL 
stream: server closed the connection unexpectedly
                This probably means the server terminated abnormally
                before or while processing the request.
2023-08-08 22:23:50.442 UTC [778707] LOG:  invalid record length at 0/347E6A8: 
expected at least 24, got 0
2023-08-08 22:23:50.456 UTC [796019] FATAL:  could not connect to the primary 
server: connection to server on socket "/tmp/S8kif7vda0/.s.PGSQL.59145" failed: 
FATAL:  the database system is shutting down
2023-08-08 22:23:50.490 UTC [778707] LOG:  waiting for WAL to become available 
at 0/347E6C0
2023-08-08 22:23:50.957 UTC [796029] FATAL:  could not connect to the primary 
server: connection to server on socket "/tmp/S8kif7vda0/.s.PGSQL.59145" failed: 
FATAL:  the database system is shutting down
2023-08-08 22:23:50.957 UTC [778707] LOG:  waiting for WAL to become available 
at 0/347E6C0
2023-08-08 22:23:51.095 UTC [778701] LOG:  received immediate shutdown request
2023-08-08 22:23:51.098 UTC [778701] LOG:  database system is shut down

# Checking port 59145
# Found port 59145
Name: primary
Data directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
Backup directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/backup
Archive directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/archives
Connection string: port=59145 host=/tmp/S8kif7vda0
Log file: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_primary.log
# Running: initdb -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 -A trust -N
The files belonging to this database system will be owned by user "myon".
This user must also own the server process.

The database cluster will be initialized with this locale configuration:
  provider:    libc
  LC_COLLATE:  en_US.utf8
  LC_CTYPE:    en_US.utf8
  LC_MESSAGES: C
  LC_MONETARY: en_US.utf8
  LC_NUMERIC:  en_US.utf8
  LC_TIME:     en_US.utf8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok

Sync to disk skipped.
The data directory might become corrupt if the operating system crashes.

Success. You can now start the database server using:

    pg_ctl -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 -l logfile start

# Running: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/../../../src/test/regress/pg_regress
 --config-auth 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
### Starting node "primary"
# Running: pg_ctl -w -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 -l 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_primary.log
 -o --cluster-name=primary start
waiting for server to start.... done
server started
# Postmaster PID for node "primary" is 778626
# Taking pg_basebackup my_backup from node "primary"
# Running: pg_basebackup -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/backup/my_backup
 -h /tmp/S8kif7vda0 -p 59145 --checkpoint fast --no-sync
# Backup finished
# Checking port 59146
# Found port 59146
Name: standby
Data directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata
Backup directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/backup
Archive directory: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/archives
Connection string: port=59146 host=/tmp/S8kif7vda0
Log file: 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_standby.log
# Initializing node "standby" from backup "my_backup" of node "primary"
### Enabling streaming replication for node "standby"
### Starting node "standby"
# Running: pg_ctl -w -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata
 -l 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_standby.log
 -o --cluster-name=standby start
waiting for server to start.... done
server started
# Postmaster PID for node "standby" is 778701
Waiting for replication conn standby's replay_lsn to pass 0/342B100 on primary
done
Waiting for replication conn standby's replay_lsn to pass 0/342B1C0 on primary
done
[22:20:24.714](3.145s) # issuing query via background psql: 
#     BEGIN;
#     DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM 
test_recovery_conflict_table1;
#     FETCH FORWARD FROM test_recovery_conflict_cursor;
[22:20:24.745](0.031s) ok 1 - buffer pin conflict: cursor with conflicting pin 
established
Waiting for replication conn standby's replay_lsn to pass 0/3430000 on primary
done
timed out waiting for match: (?^:User was holding shared buffer pin for too 
long) at t/031_recovery_conflict.pl line 318.
# Postmaster PID for node "primary" is 778626
### Stopping node "primary" using mode immediate
# Running: pg_ctl -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata
 -m immediate stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "primary"
# Postmaster PID for node "standby" is 778701
### Stopping node "standby" using mode immediate
# Running: pg_ctl -D 
/home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata
 -m immediate stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "standby"
[22:23:51.412](206.667s) # Tests were run but no plan was declared and 
done_testing() was not seen.
[22:23:51.413](0.000s) # Looks like your test exited with 255 just after 1.

Re: A failure in 031_recovery_conflict.pl on Debian/s390x

Reply via email to