Hello hackers,
A recent buildfarm failure [1] shows that the test 031_recovery_conflict.pl
can fail yet another way:
23/296 postgresql:recovery / recovery/031_recovery_conflict ERROR
11.55s exit status 1
[07:58:53.979](0.255s) ok 11 - tablespace conflict: logfile contains terminated
connection due to recovery conflict
[07:58:54.058](0.080s) not ok 12 - tablespace conflict: stats show conflict on
standby
[07:58:54.059](0.000s) # Failed test 'tablespace conflict: stats show
conflict on standby'
# at
/home/bf/bf-build/rorqual/REL_17_STABLE/pgsql/src/test/recovery/t/031_recovery_conflict.pl
line 332.
[07:58:54.059](0.000s) # got: '0'
# expected: '1'
I managed to reproduce a similar failure by running multiple test instances
in parallel on a slow VM. With extra logging added, I see in a failed run
log:
10
10 # Failed test 'startup deadlock: stats show conflict on standby'
10 # at t/031_recovery_conflict.pl line 368.
10 # got: '0'
10 # expected: '1'
10 [19:48:19] t/031_recovery_conflict.pl ..
10 Dubious, test returned 1 (wstat 256, 0x100)
2024-07-23 19:48:13.966 UTC [1668402:12][client backend][1/2:0] LOG: !!!pgstat_report_recovery_conflict| reason: 13,
pgstat_track_counts: 1 at character 15
2024-07-23 19:48:13.966 UTC [1668402:13][client backend][1/2:0] STATEMENT:
SELECT * FROM test_recovery_conflict_table2;
2024-07-23 19:48:13.966 UTC [1668402:14][client backend][1/2:0] ERROR: canceling statement due to conflict with
recovery at character 15
2024-07-23 19:48:13.966 UTC [1668402:15][client backend][1/2:0] DETAIL: User transaction caused buffer deadlock with
recovery.
...
2024-07-23 19:48:14.129 UTC [1668805:8][client backend][5/2:0] LOG: statement: SELECT confl_deadlock FROM
pg_stat_database_conflicts WHERE datname='test_db';
...
2024-07-23 19:48:14.148 UTC [1668402:16][client backend][1/0:0] LOG:
!!!pgstat_database_flush_cb| nowait: 0
This failure can be reproduced easily with a sleep added as below:
@@ -514,6 +514,7 @@ pgstat_shutdown_hook(int code, Datum arg)
if (OidIsValid(MyDatabaseId))
pgstat_report_disconnect(MyDatabaseId);
+if (rand() % 5 == 0) pg_usleep(100000);
pgstat_report_stat(true);
By running the test in a loop, I get miscellaneous
"stats show conflict on standby" failures, including:
iteration 19
# +++ tap check in src/test/recovery +++
t/031_recovery_conflict.pl .. 1/?
# Failed test 'buffer pin conflict: stats show conflict on standby'
# at t/031_recovery_conflict.pl line 332.
# got: '0'
# expected: '1'
t/031_recovery_conflict.pl .. 17/? # Looks like you failed 1 test of 18.
[1]
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rorqual&dt=2024-07-23%2007%3A56%3A35
Best regards,
Alexander