> On Sep 24, 2021, at 10:21 PM, Noah Misch <n...@leadboat.com> wrote:
>
>> I would
>> expect tests which fail under legal alternate GUC settings to be hardened to
>> explicitly set the GUCs as they need, rather than implicitly relying on the
>> defaults.
>
> That is not the general practice in PostgreSQL tests today. The buildfarm
> exercises some settings, so we keep the tests clean for those. Coping with
> max_wal_size=2 that way sounds reasonable. I'm undecided about the value of
> hardening tests against all possible settings.

Leaving the tests brittle wastes developer time.

I ran into this problem when I changed the storage underlying bloom indexes and
ran the contrib/bloom/t/001_wal.pl test with wal_consistency_checking=all.
That caused the test to fail with errors about missing WAL files, and it took
time to backtrack and see that the test fails under this setting even before
applying my storage layer changes. Ordinarily, failures about missing WAL
files would have led me to suspect the TAP test sooner, but since I had mucked
around with storage and WAL it initially seemed plausible that my code changes
were the problem. The real problem is that a replication slot is not used in
the test.
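
To illustrate what using a slot would involve, here is a minimal sketch of the standby-side setting. It assumes a physical replication slot has already been created on the primary with pg_create_physical_replication_slot(); the slot name bloom_test_slot is hypothetical and does not come from the test.

```
# postgresql.conf on the standby (sketch only; the slot name is hypothetical)
# Assumes the slot was created on the primary beforehand with:
#   SELECT pg_create_physical_replication_slot('bloom_test_slot');
primary_slot_name = 'bloom_test_slot'
```

With a slot in place, the primary by default retains the WAL the standby still needs, rather than recycling it once the usual size limits are exceeded.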

The failure in src/test/recovery/t/015_promotion_pages.pl has the same cause:
the test should use a replication slot but does not.

The failure in src/bin/pg_basebackup/t/010_pg_basebackup.pl stems from not
heeding the documented requirement for pg_basebackup -X fetch that
wal_keep_size "be set high enough that the required log data is not removed
before the end of the backup". The test just assumes that it will be, because
that tends to be true under default GUC settings. I think this can be fixed by
setting wal_keep_size=<SOMETHING_BIG_ENOUGH>, but (a) you say this is not the
general practice in PostgreSQL tests today, and (b) there doesn't seem to be
any principled way to decide what value would be big enough. Sure, we can use
something that is big enough in practice, and we'll probably have to go with
that, but it feels like we're just papering over the problem.
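
For illustration, the stopgap would look something like the fragment below. The value shown is an arbitrary assumption, not a derived bound, which is exactly the concern in (b).

```
# postgresql.conf on the server being backed up (sketch only)
# 1GB is an arbitrary guess at "big enough", not a principled value.
wal_keep_size = '1GB'
```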

I'm inclined to guess that the problem in
src/bin/pg_basebackup/t/020_pg_receivewal.pl is similar.

--
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company