Re: Fixing WAL instability in various TAP tests

Mark Dilger Sat, 25 Sep 2021 08:20:26 -0700

> On Sep 25, 2021, at 7:17 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> 
>> Leaving the tests brittle wastes developer time.
> 
> Trying to make them proof against all possible settings would waste
> a lot more time, though.

You may be right, but the conversation about "all possible settings" was 
started by Noah.  I was really just talking about tests that depend on wal 
files not being removed, but taking no action to guarantee that, merely 
trusting that under default settings they won't be.  I can't square that design 
against other TAP tests that do take measures to prevent wal files being 
removed.  Why is the precaution taken in some tests but not others?  If this is 
intentional, shouldn't some comment in the tests without such precautions 
explain that choice?  Are they intentionally testing that the default GUC wal 
size settings and wal verbosity won't break the test?

This isn't a rhetorical question:

In src/test/recovery/t/015_promotion_pages.pl, the comments talk about how
checkpoints impact what happens on the standby.  The test issues an explicit 
checkpoint on the primary, and again later on the standby, so it is unclear if 
that's what the comments refer to, or if they also refer to implicit 
expectations about when/if other checkpoints will happen.  The test breaks when 
I change the GUC settings, but I can fix that breakage by adding a replication 
slot to the test.  Have I broken the purpose of the test by doing so, though?  
Does using a replication slot to force the wal to not be removed early break 
what the test is designed to check?

The other tests raise similar questions.  Is the brittleness intentional?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company