Re: Fixing WAL instability in various TAP tests

Mark Dilger Tue, 28 Sep 2021 11:43:48 -0700

> On Sep 28, 2021, at 11:07 AM, Mark Dilger <mark.dil...@enterprisedb.com> wrote:
> 
> Looking closer at the TAP test, it's not ORDERing the result set from the 
> SELECTs on either node, but it is comparing the sets for stringwise equality, 
> which is certainly order dependent.

Taking the output from the buildfarm page, parsing out the first test's results 
and comparing got vs. expected for this test:

is($primary_result, $standby_result, "$test_name: query result matches");

the primary result had every row that the standby result had, plus additional 
rows.  The two differ only in rows that are present on the primary but missing 
from the standby.  That seems consistent with the view that the query on the 
standby ran before all the data had replicated across.

However, the missing rows all have column i equal to either 0 or 3, though the 
test round-robins i=0..9 as it performs the inserts.  I would expect the WAL 
for the inserts not to cluster around any particular value of i.  The DELETE 
and VACUUM commands do operate on a single value of i, so the clustering would 
make sense if the data failed to be deleted on the standby after successfully 
being deleted on the primary, but then I'd expect the standby to have more 
rows, not fewer.
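
A minimal sketch of the got vs. expected comparison described above, treating 
each result as a multiset of rows and tallying the missing rows by one column. 
It assumes rows are newline-separated with '|' between columns (unaligned psql 
output) and that i is the first column.  The helper names and the sample data 
are hypothetical; this is not the buildfarm output.

```perl
use strict;
use warnings;

# Return the rows present in $first but absent from $second, treating each
# newline-separated result as a multiset so that duplicates are counted.
sub rows_only_in_first
{
	my ($first, $second) = @_;
	my %count;
	$count{$_}++ for split /\n/, $second;
	my @only;
	for my $row (split /\n/, $first)
	{
		if ($count{$row}) { $count{$row}--; }
		else              { push @only, $row; }
	}
	return @only;
}

# Tally rows by the value of one '|'-separated column (0-based index).
sub tally_by_column
{
	my ($col, @rows) = @_;
	my %tally;
	$tally{ (split /\|/, $_)[$col] }++ for @rows;
	return %tally;
}

# Made-up sample data; column 0 is assumed to be i.
my $primary_result = "0|a\n1|b\n3|c\n3|d";
my $standby_result = "1|b\n3|d";

my @missing = rows_only_in_first($primary_result, $standby_result);
my @extra   = rows_only_in_first($standby_result, $primary_result);
my %by_i    = tally_by_column(0, @missing);

print "missing from standby: ", scalar(@missing), "\n";
print "extra on standby: ", scalar(@extra), "\n";
print "i=$_: $by_i{$_}\n" for sort keys %by_i;
```

Because the comparison is order independent, it also sidesteps the missing 
ORDER BY: an empty @missing and an empty @extra mean the two nodes returned 
the same rows, whatever order they came back in.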

Perhaps a messed-up bloom index would explain that, though.  I think it should 
be easy enough to get the paths to the main fork of the heap table and the 
main fork of the bloom index on both the primary and the standby and do a 
filesystem comparison as part of the WAL test.  That would tell us whether 
they differ, and also whether the differences are limited to just one or the 
other.
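
A rough sketch of what such a filesystem comparison could look like, not the 
patch itself.  It compares two files block by block and reports which block 
numbers differ.  The paths would presumably come from each node's data 
directory plus the output of pg_relation_filepath() for the table and the 
index; here two temporary files stand in for them.  The function name 
differing_blocks is hypothetical, and 8192 assumes the default block size.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Compare two files in fixed-size blocks and return the 0-based numbers of the
# blocks that differ.  8192 is PostgreSQL's default block size (BLCKSZ).
sub differing_blocks
{
	my ($path_a, $path_b, $blcksz) = @_;
	$blcksz ||= 8192;
	open(my $fh_a, '<:raw', $path_a) or die "could not open $path_a: $!";
	open(my $fh_b, '<:raw', $path_b) or die "could not open $path_b: $!";
	my @diffs;
	for (my $blkno = 0;; $blkno++)
	{
		my $len_a = read($fh_a, my $buf_a, $blcksz);
		my $len_b = read($fh_b, my $buf_b, $blcksz);
		die "read failed: $!" unless defined $len_a && defined $len_b;
		last if $len_a == 0 && $len_b == 0;
		# A shorter file yields an empty buffer here, so the extra
		# blocks in the longer file are reported as differing.
		push @diffs, $blkno if $buf_a ne $buf_b;
	}
	close($fh_a);
	close($fh_b);
	return @diffs;
}

# Stand-in files for the demo; in the TAP test the paths would instead be
# built from each node's data directory and pg_relation_filepath().
my ($fh1, $file1) = tempfile(UNLINK => 1);
my ($fh2, $file2) = tempfile(UNLINK => 1);
binmode($_) for $fh1, $fh2;
print $fh1 "A" x 8192, "B" x 8192;
print $fh2 "A" x 8192, "C" x 8192;
close($_) for $fh1, $fh2;

my @diffs = differing_blocks($file1, $file2);
print "differing blocks: @diffs\n";
```

One caveat with a raw byte comparison: pages can differ between primary and 
standby for benign reasons, such as hint bits, and dirty buffers would have to 
be flushed to disk on both nodes first, so a reported difference is a starting 
point for inspection and not proof of a replay bug.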

I'll go write that up....


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company