Re: Stabilizing the test_decoding checks, take N

Dilip Kumar Mon, 18 Apr 2022 03:00:09 -0700

On Mon, Apr 18, 2022 at 11:19 AM Amit Kapila <amit.kapil...@gmail.com>
wrote:


> On Sat, Apr 16, 2022 at 10:42 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >
> > My pet dinosaur prairiedog just failed in the contrib/test_decoding
> > tests [1]:
> >
> > diff -U3
> /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/expected/stream.out
> /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/results/stream.out
> > ---
> /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/expected/stream.out
> 2022-04-15 07:59:17.000000000 -0400
> > +++
> /Users/buildfarm/bf-data/HEAD/pgsql.build/contrib/test_decoding/results/stream.out
> 2022-04-15 09:06:36.000000000 -0400
> > @@ -77,10 +77,12 @@
> >   streaming change for transaction
> >   streaming change for transaction
> >   streaming change for transaction
> > + closing a streamed block for transaction
> > + opening a streamed block for transaction
> >   streaming change for transaction
> >   closing a streamed block for transaction
> >   committing streamed transaction
> > -(13 rows)
> > +(15 rows)
> >
> > Looking at the postmaster log, it's obvious where this extra transaction
> > came from: auto-analyze ran on pg_type concurrently with the test step
> > just before this one.  That could only happen if the tests ran long
> enough
> > for autovacuum_naptime to elapse, but prairiedog is a pretty slow
> machine.
> > (And I hasten to point out that some other animals, such as those running
> > valgrind or CLOBBER_CACHE_ALWAYS, are even slower.)
> >
> > We've seen this sort of problem before [2], and attempted to fix it [3]
> > by making these tests ignore empty transactions.  But of course
> > auto-analyze's transaction wasn't empty, so that didn't help.
> >
>
> The possible reason here is that this extra (auto-analyze) transaction
> causes the logical decoding work mem to reach before the last change
> of the test's transaction. As can be seen from the logs, it just
> closed the stream before the last change and then opened a new stream
> for the last change. Now, it is true that the auto-analyze changes
> won't be decoded as they don't perform DML operation on any
> non-catalog table but it could generate some invalidation message
> which needs to be processed even though we won't send anything related
> to it to the downstream.
>

This analysis seems right to me.


> This needs to be verified once by doing some manual testing as it may
> not be easily reproducible every time. If this happens to be true then
> I think your suggestion related to increasing autovacuum_naptime would
> work.
>

I will try to reproduce this, maybe by reducing the autovacuum_naptime or
parallelly running some script that continuously performs DDL-only
transactions.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Stabilizing the test_decoding checks, take N

Reply via email to