Re: Thomas Munro > On Wed, Aug 9, 2023 at 2:01 AM Christoph Berg <m...@debian.org> wrote: > > Putting that patch on top of v8 made it pass 294 times before exiting > > like this: > > > > [08:52:34.134](0.032s) ok 1 - buffer pin conflict: cursor with conflicting > > pin established > > Waiting for replication conn standby's replay_lsn to pass 0/3430000 on > > primary > > done > > timed out waiting for match: (?^:User was holding shared buffer pin for too > > long) at t/031_recovery_conflict.pl line 318. > > Can you reproduce that with logging like this added on top?
603 iterations later it hit again, but didn't log anything. (I believe I did run "make" in the right directory.) [22:20:24.714](3.145s) # issuing query via background psql: # BEGIN; # DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM test_recovery_conflict_table1; # FETCH FORWARD FROM test_recovery_conflict_cursor; [22:20:24.745](0.031s) ok 1 - buffer pin conflict: cursor with conflicting pin established Waiting for replication conn standby's replay_lsn to pass 0/3430000 on primary done timed out waiting for match: (?^:User was holding shared buffer pin for too long) at t/031_recovery_conflict.pl line 318. Perhaps this can simply be attributed to the machine being too busy. Christoph
2023-08-08 22:20:22.732 UTC [778626] LOG: starting PostgreSQL 17devel (Debian 17~~devel-1) on s390x-ibm-linux-gnu, compiled by gcc (Debian 13.2.0-1) 13.2.0, 64-bit 2023-08-08 22:20:22.732 UTC [778626] LOG: listening on Unix socket "/tmp/S8kif7vda0/.s.PGSQL.59145" 2023-08-08 22:20:22.733 UTC [778630] LOG: database system was shut down at 2023-08-08 22:20:22 UTC 2023-08-08 22:20:22.755 UTC [778626] LOG: database system is ready to accept connections 2023-08-08 22:20:22.832 UTC [778639] 031_recovery_conflict.pl LOG: statement: CREATE TABLESPACE test_recovery_conflict_tblspc LOCATION '' 2023-08-08 22:20:22.964 UTC [778644] 031_recovery_conflict.pl LOG: received replication command: SHOW data_directory_mode 2023-08-08 22:20:22.964 UTC [778644] 031_recovery_conflict.pl STATEMENT: SHOW data_directory_mode 2023-08-08 22:20:22.969 UTC [778644] 031_recovery_conflict.pl LOG: received replication command: SHOW wal_segment_size 2023-08-08 22:20:22.969 UTC [778644] 031_recovery_conflict.pl STATEMENT: SHOW wal_segment_size 2023-08-08 22:20:22.975 UTC [778644] 031_recovery_conflict.pl LOG: received replication command: IDENTIFY_SYSTEM 2023-08-08 22:20:22.975 UTC [778644] 031_recovery_conflict.pl STATEMENT: IDENTIFY_SYSTEM 2023-08-08 22:20:22.991 UTC [778644] 031_recovery_conflict.pl LOG: received replication command: BASE_BACKUP ( LABEL 'pg_basebackup base backup', PROGRESS, CHECKPOINT 'fast', WAIT 0, MANIFEST 'yes', TARGET 'client') 2023-08-08 22:20:22.991 UTC [778644] 031_recovery_conflict.pl STATEMENT: BASE_BACKUP ( LABEL 'pg_basebackup base backup', PROGRESS, CHECKPOINT 'fast', WAIT 0, MANIFEST 'yes', TARGET 'client') 2023-08-08 22:20:22.998 UTC [778628] LOG: checkpoint starting: immediate force wait 2023-08-08 22:20:23.006 UTC [778628] LOG: checkpoint complete: wrote 7 buffers (5.5%); 0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s, total=0.008 s; sync files=0, longest=0.000 s, average=0.000 s; distance=11350 kB, estimate=11350 kB; lsn=0/2000060, redo lsn=0/2000028 2023-08-08 22:20:23.027 UTC [778654] 031_recovery_conflict.pl LOG: received replication command: SHOW data_directory_mode 2023-08-08 22:20:23.027 UTC [778654] 031_recovery_conflict.pl STATEMENT: SHOW data_directory_mode 2023-08-08 22:20:23.035 UTC [778654] 031_recovery_conflict.pl LOG: received replication command: CREATE_REPLICATION_SLOT "pg_basebackup_778654" TEMPORARY PHYSICAL ( RESERVE_WAL) 2023-08-08 22:20:23.035 UTC [778654] 031_recovery_conflict.pl STATEMENT: CREATE_REPLICATION_SLOT "pg_basebackup_778654" TEMPORARY PHYSICAL ( RESERVE_WAL) 2023-08-08 22:20:23.073 UTC [778654] 031_recovery_conflict.pl LOG: received replication command: IDENTIFY_SYSTEM 2023-08-08 22:20:23.073 UTC [778654] 031_recovery_conflict.pl STATEMENT: IDENTIFY_SYSTEM 2023-08-08 22:20:23.140 UTC [778654] 031_recovery_conflict.pl LOG: received replication command: START_REPLICATION SLOT "pg_basebackup_778654" 0/2000000 TIMELINE 1 2023-08-08 22:20:23.140 UTC [778654] 031_recovery_conflict.pl STATEMENT: START_REPLICATION SLOT "pg_basebackup_778654" 0/2000000 TIMELINE 1 2023-08-08 22:20:23.679 UTC [778644] 031_recovery_conflict.pl LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp778644.0", size 137324 2023-08-08 22:20:23.679 UTC [778644] 031_recovery_conflict.pl STATEMENT: BASE_BACKUP ( LABEL 'pg_basebackup base backup', PROGRESS, CHECKPOINT 'fast', WAIT 0, MANIFEST 'yes', TARGET 'client') 2023-08-08 22:20:24.216 UTC [778709] standby LOG: received replication command: IDENTIFY_SYSTEM 2023-08-08 22:20:24.216 UTC [778709] standby STATEMENT: IDENTIFY_SYSTEM 2023-08-08 22:20:24.216 UTC [778709] standby LOG: received replication command: START_REPLICATION 0/3000000 TIMELINE 1 2023-08-08 22:20:24.216 UTC [778709] standby STATEMENT: START_REPLICATION 0/3000000 TIMELINE 1 2023-08-08 22:20:24.377 UTC [778711] 031_recovery_conflict.pl LOG: statement: CREATE DATABASE test_db 2023-08-08 22:20:24.427 UTC [778722] 031_recovery_conflict.pl LOG: statement: CREATE TABLE test_recovery_conflict_table1(a int, b int); 2023-08-08 22:20:24.430 UTC [778722] 031_recovery_conflict.pl LOG: statement: INSERT INTO test_recovery_conflict_table1 SELECT i % 3, 0 FROM generate_series(1,20) i; 2023-08-08 22:20:24.442 UTC [778722] 031_recovery_conflict.pl LOG: statement: CREATE TABLE test_recovery_conflict_table2(a int, b int); 2023-08-08 22:20:24.469 UTC [778740] 031_recovery_conflict.pl LOG: statement: SELECT pg_current_wal_flush_lsn() 2023-08-08 22:20:24.479 UTC [778743] 031_recovery_conflict.pl LOG: statement: SELECT '0/342B100' <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name IN ('standby', 'walreceiver') 2023-08-08 22:20:24.675 UTC [778767] 031_recovery_conflict.pl LOG: statement: BEGIN; 2023-08-08 22:20:24.675 UTC [778767] 031_recovery_conflict.pl LOG: statement: INSERT INTO test_recovery_conflict_table1 VALUES (1,0); 2023-08-08 22:20:24.676 UTC [778767] 031_recovery_conflict.pl LOG: statement: ROLLBACK; 2023-08-08 22:20:24.676 UTC [778767] 031_recovery_conflict.pl LOG: statement: BEGIN; 2023-08-08 22:20:24.676 UTC [778767] 031_recovery_conflict.pl LOG: statement: LOCK test_recovery_conflict_table1; 2023-08-08 22:20:24.676 UTC [778767] 031_recovery_conflict.pl LOG: statement: COMMIT; 2023-08-08 22:20:24.692 UTC [778773] 031_recovery_conflict.pl LOG: statement: SELECT pg_current_wal_flush_lsn() 2023-08-08 22:20:24.710 UTC [778783] 031_recovery_conflict.pl LOG: statement: SELECT '0/342B1C0' <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name IN ('standby', 'walreceiver') 2023-08-08 22:20:25.051 UTC [778797] 031_recovery_conflict.pl LOG: statement: VACUUM test_recovery_conflict_table1; 2023-08-08 22:20:25.142 UTC [778883] 031_recovery_conflict.pl LOG: statement: SELECT pg_current_wal_flush_lsn() 2023-08-08 22:20:25.173 UTC [778886] 031_recovery_conflict.pl LOG: statement: SELECT '0/3430000' <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name IN ('standby', 'walreceiver') 2023-08-08 22:20:25.344 UTC [778912] 031_recovery_conflict.pl LOG: statement: SELECT '0/3430000' <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name IN ('standby', 'walreceiver') 2023-08-08 22:23:50.441 UTC [778626] LOG: received immediate shutdown request 2023-08-08 22:23:50.456 UTC [796024] [unknown] FATAL: the database system is shutting down 2023-08-08 22:23:50.955 UTC [796036] [unknown] FATAL: the database system is shutting down 2023-08-08 22:23:50.988 UTC [778626] LOG: database system is shut down
2023-08-08 22:20:24.198 UTC [778701] LOG: starting PostgreSQL 17devel (Debian 17~~devel-1) on s390x-ibm-linux-gnu, compiled by gcc (Debian 13.2.0-1) 13.2.0, 64-bit 2023-08-08 22:20:24.198 UTC [778701] LOG: listening on Unix socket "/tmp/S8kif7vda0/.s.PGSQL.59146" 2023-08-08 22:20:24.200 UTC [778707] LOG: database system was interrupted; last known up at 2023-08-08 22:20:23 UTC 2023-08-08 22:20:24.200 UTC [778707] LOG: entering standby mode 2023-08-08 22:20:24.201 UTC [778707] LOG: redo starts at 0/2000028 2023-08-08 22:20:24.201 UTC [778707] WARNING: unexpected directory entry "16384" found in pg_tblspc/ 2023-08-08 22:20:24.201 UTC [778707] DETAIL: All directory entries in pg_tblspc/ should be symbolic links. 2023-08-08 22:20:24.201 UTC [778707] HINT: Remove those directories, or set allow_in_place_tablespaces to ON transiently to let recovery complete. 2023-08-08 22:20:24.201 UTC [778707] LOG: consistent recovery state reached at 0/2000100 2023-08-08 22:20:24.204 UTC [778701] LOG: database system is ready to accept read-only connections 2023-08-08 22:20:24.216 UTC [778708] LOG: started streaming WAL from primary at 0/3000000 on timeline 1 2023-08-08 22:20:24.715 UTC [778760] 031_recovery_conflict.pl LOG: statement: BEGIN; 2023-08-08 22:20:24.720 UTC [778760] 031_recovery_conflict.pl LOG: statement: DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM test_recovery_conflict_table1; 2023-08-08 22:20:24.726 UTC [778760] 031_recovery_conflict.pl LOG: statement: FETCH FORWARD FROM test_recovery_conflict_cursor; 2023-08-08 22:20:24.741 UTC [778760] 031_recovery_conflict.pl LOG: statement: ; 2023-08-08 22:23:50.442 UTC [778708] FATAL: could not receive data from WAL stream: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. 2023-08-08 22:23:50.442 UTC [778707] LOG: invalid record length at 0/347E6A8: expected at least 24, got 0 2023-08-08 22:23:50.456 UTC [796019] FATAL: could not connect to the primary server: connection to server on socket "/tmp/S8kif7vda0/.s.PGSQL.59145" failed: FATAL: the database system is shutting down 2023-08-08 22:23:50.490 UTC [778707] LOG: waiting for WAL to become available at 0/347E6C0 2023-08-08 22:23:50.957 UTC [796029] FATAL: could not connect to the primary server: connection to server on socket "/tmp/S8kif7vda0/.s.PGSQL.59145" failed: FATAL: the database system is shutting down 2023-08-08 22:23:50.957 UTC [778707] LOG: waiting for WAL to become available at 0/347E6C0 2023-08-08 22:23:51.095 UTC [778701] LOG: received immediate shutdown request 2023-08-08 22:23:51.098 UTC [778701] LOG: database system is shut down
# Checking port 59145 # Found port 59145 Name: primary Data directory: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata Backup directory: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/backup Archive directory: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/archives Connection string: port=59145 host=/tmp/S8kif7vda0 Log file: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_primary.log # Running: initdb -D /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata -A trust -N The files belonging to this database system will be owned by user "myon". This user must also own the server process. The database cluster will be initialized with this locale configuration: provider: libc LC_COLLATE: en_US.utf8 LC_CTYPE: en_US.utf8 LC_MESSAGES: C LC_MONETARY: en_US.utf8 LC_NUMERIC: en_US.utf8 LC_TIME: en_US.utf8 The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. creating directory /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... Etc/UTC creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok Sync to disk skipped. The data directory might become corrupt if the operating system crashes. Success. You can now start the database server using: pg_ctl -D /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata -l logfile start # Running: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/../../../src/test/regress/pg_regress --config-auth /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata ### Starting node "primary" # Running: pg_ctl -w -D /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata -l /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_primary.log -o --cluster-name=primary start waiting for server to start.... done server started # Postmaster PID for node "primary" is 778626 # Taking pg_basebackup my_backup from node "primary" # Running: pg_basebackup -D /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/backup/my_backup -h /tmp/S8kif7vda0 -p 59145 --checkpoint fast --no-sync # Backup finished # Checking port 59146 # Found port 59146 Name: standby Data directory: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata Backup directory: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/backup Archive directory: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/archives Connection string: port=59146 host=/tmp/S8kif7vda0 Log file: /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_standby.log # Initializing node "standby" from backup "my_backup" of node "primary" ### Enabling streaming replication for node "standby" ### Starting node "standby" # Running: pg_ctl -w -D /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata -l /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/log/031_recovery_conflict_standby.log -o --cluster-name=standby start waiting for server to start.... done server started # Postmaster PID for node "standby" is 778701 Waiting for replication conn standby's replay_lsn to pass 0/342B100 on primary done Waiting for replication conn standby's replay_lsn to pass 0/342B1C0 on primary done [22:20:24.714](3.145s) # issuing query via background psql: # BEGIN; # DECLARE test_recovery_conflict_cursor CURSOR FOR SELECT b FROM test_recovery_conflict_table1; # FETCH FORWARD FROM test_recovery_conflict_cursor; [22:20:24.745](0.031s) ok 1 - buffer pin conflict: cursor with conflicting pin established Waiting for replication conn standby's replay_lsn to pass 0/3430000 on primary done timed out waiting for match: (?^:User was holding shared buffer pin for too long) at t/031_recovery_conflict.pl line 318. # Postmaster PID for node "primary" is 778626 ### Stopping node "primary" using mode immediate # Running: pg_ctl -D /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_primary_data/pgdata -m immediate stop waiting for server to shut down.... done server stopped # No postmaster PID for node "primary" # Postmaster PID for node "standby" is 778701 ### Stopping node "standby" using mode immediate # Running: pg_ctl -D /home/myon/postgresql/pg/postgresql/build/src/test/recovery/tmp_check/t_031_recovery_conflict_standby_data/pgdata -m immediate stop waiting for server to shut down.... done server stopped # No postmaster PID for node "standby" [22:23:51.412](206.667s) # Tests were run but no plan was declared and done_testing() was not seen. [22:23:51.413](0.000s) # Looks like your test exited with 255 just after 1.